I have seen a lot of posts on X and LinkedIn on the importance of Agentic AI memory. What exactly is it? Is it just another name for RAG? How is it different from any other application memory? In this blog, I try to answer these questions.
What do people mean when they say "AI Memory"?
There is no persistent memory inside LLMs (unless you fine-tune or train). Everything called “memory” today is external.
At their core, LLMs are stateless functions: you make a request with a prompt and some context data, and the model returns a response.
In real systems, AI memory usually means:
- Storing past interactions, user preferences, decisions, goals, or facts.
- Retrieving relevant parts later
- Feeding a compressed version back into the prompt
So yes — at its core:
Memory = save → retrieve → summarize → inject into context
Nothing magical.
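A minimal sketch of that loop in Python. Retrieval here is a naive keyword match and the "summarize" step is a plain join, so treat it as an illustration; llm_call is a hypothetical placeholder for whatever model client you use.

memory_store = []  # saved facts about the user

def save(fact: str) -> None:
    memory_store.append(fact)

def retrieve(query: str, limit: int = 3) -> list:
    # Naive keyword match; real systems use embeddings or a database
    words = query.lower().split()
    hits = [f for f in memory_store if any(w in f.lower() for w in words)]
    return hits[:limit]

def build_prompt(user_message: str) -> str:
    facts = retrieve(user_message)
    context = "\n".join(facts)  # the "summarize" step: here just a join, in practice often an LLM call
    return f"Known about this user:\n{context}\n\nUser: {user_message}"

save("User prefers concise python code")
print(build_prompt("Write python code to parse a CSV file"))
# response = llm_call(build_prompt(...))  # llm_call is hypothetical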
Is this just RAG (Retrieval Augmented Generation)?
They are related but not the same.
RAG (Retrieval Augmented Generation)
Purpose:
- Bring external knowledge into the LLM
- Docs, PDFs, financial data, code, policies
Typical traits:
- Stateless
- Large text chunks
- Query-driven retrieval
- “What additional data can we provide to the LLM to help answer this question?”
Agent / User Memory
Purpose:
- Maintain continuity
- Personalization
- Learning user intent and preferences over time
Typical traits:
- Long-lived
- Highly structured
- Small, distilled facts
- “What can I provide to the LLM so it remembers this user?”
Think of it this way:
They can often use the same retrieval tools, but they serve different roles.
Where is the memory?
Option 1: Agent process memory
Any suitable data structure like a HashMap.
Suitable for cases where the Agent loop is short and no persistence is needed.
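For example, a plain dictionary keyed by user or session id (a rough sketch, not tied to any framework):

from collections import defaultdict

# In-process memory: gone when the agent process exits
session_memory = defaultdict(list)

session_memory["123"].append("Prefers concise python code")
session_memory["123"].append("Asked about retirement planning")

print(session_memory["123"])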
Option 2: Redis / Cache
Suitable for session info, recent conversation history, tool results cache, temporary state.
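A small sketch using the redis-py client, assuming a Redis instance on localhost; the key name and TTL are illustrative:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Keep recent conversation turns per session, expiring with the session
key = "session:123:history"
r.rpush(key, json.dumps({"role": "user", "content": "What is my risk tolerance?"}))
r.expire(key, 3600)  # drop the session state after an hour

recent_turns = [json.loads(t) for t in r.lrange(key, -10, -1)]  # last 10 turns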
Option 3: PostgreSQL/RDBMS
Suitable when you need durability, auditability, and explainability.
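A sketch using psycopg2 (any Postgres driver works); the connection string and table layout are illustrative, loosely mirroring the memory record shown further down:

import psycopg2  # assumption: psycopg2 driver; connection string is illustrative

conn = psycopg2.connect("dbname=agent_memory user=agent")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS user_memory (
        id         SERIAL PRIMARY KEY,
        user_id    TEXT NOT NULL,
        fact       TEXT NOT NULL,
        source     TEXT,
        created_at TIMESTAMPTZ DEFAULT now()
    )
""")

cur.execute(
    "INSERT INTO user_memory (user_id, fact, source) VALUES (%s, %s, %s)",
    ("123", "User prefers concise python code", "conversation_turn_5"),
)
conn.commit()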
Option 4: Vector databases
Suitable for semantic search.
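A rough sketch of what a vector store does for memory: embed each fact and return the facts closest to the query. The embed function below is a dummy stand-in; a real system would call an embedding model and a vector database such as pgvector or Chroma.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Dummy embedding for illustration only; replace with a real embedding model
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

facts = ["User is vegetarian", "User's risk tolerance is low"]
vectors = np.stack([embed(f) for f in facts])

def most_relevant(query: str, k: int = 1) -> list:
    q = embed(query)
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [facts[i] for i in np.argsort(-scores)[:k]]

print(most_relevant("What should the user eat?"))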
Option 5: AI memory tools
Such as LangGraph memory, LlamaIndex memory, and MemGPT. These try to make it easier for agents to store and retrieve memories.
Here is an example of data that might be stored in memory:
{
"user_id": "123",
"fact": "User prefers concise python code",
"source": "conversation_turn_5",
"timestamp": "2026-02-09"
}
The mental model for AI memory
Short term memory
This is about recent interactions, relevant mainly to the current topic being discussed. For example, the user prefers conservative answers.
Long term memory
This is stored externally, often in persistent storage. It is retrieved and inserted into the context selectively. For example, the user is a vegetarian, or the user's risk tolerance is low.
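One way to sketch the split; in practice the long-term facts would live in one of the stores discussed above rather than in a Python list:

class AgentMemory:
    def __init__(self, window: int = 10):
        self.recent_turns = []   # short term: rolling window of the current conversation
        self.facts = []          # long term: distilled facts, persisted externally in practice
        self.window = window

    def add_turn(self, turn: str) -> None:
        self.recent_turns.append(turn)
        self.recent_turns = self.recent_turns[-self.window:]

    def remember(self, fact: str) -> None:
        self.facts.append(fact)  # e.g. "User is vegetarian"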
Memory and the LLM
The LLM takes only messages as input. The agent has to read the data from memory and insert it into the message text. This is what is referred to as context (see the sketch after the list below).
You do not want to add large amounts of arbitrary data as context because:
- Text is converted to tokens, and token costs spiral
- LLM attention degrades with noise
- Latency increases
- Reasoning quality declines
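A sketch of that injection step. The message shape follows the common chat-completions format; the helper and its field values are illustrative.

def build_messages(user_message: str, facts: list, recent_turns: list) -> list:
    # Memory only reaches the model as text inside the message list
    context = "Known about this user:\n" + "\n".join(f"- {f}" for f in facts)
    return (
        [{"role": "system", "content": context}]
        + recent_turns[-6:]                              # keep the window small
        + [{"role": "user", "content": user_message}]
    )

messages = build_messages(
    "Suggest a retirement plan",
    facts=["Risk tolerance is low", "Prefers concise answers"],
    recent_turns=[{"role": "user", "content": "I want to retire in 15 years"}],
)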
Considerations
Memory != Raw History
Bad use: Here are the last 47 conversations ......
Better use: We were talking about my retirement goals, with this income and this many years until retirement.
Summarize and abstract to extract intelligence, as opposed to dumping large quantities of data.
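A sketch of that distillation step; llm here is a placeholder for your model client.

def distill(history: list, llm) -> str:
    # Compress raw turns into a few durable facts before storing them as memory
    prompt = (
        "Summarize the durable facts, goals and preferences from this conversation "
        "in at most three bullet points:\n" + "\n".join(history)
    )
    return llm(prompt)

# memory_entry = distill(last_47_turns, llm)   # store this, not the 47 raw turns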
In conclusion
AI memory is structured state, sometimes summarized, that is retrieved when needed and included in the LLM input as "context".
Conceptually, it is similar to RAG but they apply to different use cases.
Better and smaller contexts beat large contexts and large memory.
Agentic AI memory adds value only when:
- The system changes behavior (for the better) because of it
- It produces better responses, explanations, and reasoning
- It saves time
