I have seen a lot of posts on X and LinkedIn on the importance of Agentic AI memory. What exactly is it? Is it just another name for RAG? How is it different from any other application memory? In this blog, I try to answer these questions.
What do people mean when they say "AI Memory"?
There is no persistent memory inside LLMs (unless you fine-tune or train). Everything called “memory” today is external.
At their core, LLMs are stateless functions: you make a request with a prompt and some context data, and the model returns a response.
In real systems, AI memory usually means:
- Storing past interactions, user preferences, decisions, goals, or facts.
- Retrieving relevant parts later
- Feeding a compressed version back into the prompt
So yes — at its core:
Memory = save → retrieve → summarize → inject into context
Nothing magical.
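A minimal sketch of that loop in Python. Retrieval here is a naive keyword match and the "summarize" step is a plain join, so treat it as an illustration; llm_call is a hypothetical placeholder for whatever model client you use.

memory_store = []  # saved facts about the user

def save(fact: str) -> None:
    memory_store.append(fact)

def retrieve(query: str, limit: int = 3) -> list:
    # Naive keyword match; real systems use embeddings or a database
    words = query.lower().split()
    hits = [f for f in memory_store if any(w in f.lower() for w in words)]
    return hits[:limit]

def build_prompt(user_message: str) -> str:
    facts = retrieve(user_message)
    context = "\n".join(facts)  # the "summarize" step: here just a join, in practice often an LLM call
    return f"Known about this user:\n{context}\n\nUser: {user_message}"

save("User prefers concise python code")
print(build_prompt("Write python code to parse a CSV file"))
# response = llm_call(build_prompt(...))  # llm_call is hypothetical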
Is this just RAG (Retrieval Augmented Generation)?
They are related but not the same.
RAG (Retrieval Augmented Generation)
Purpose:
- Bring external knowledge into the LLM
- Docs, PDFs, financial data, code, policies
Typical traits:
- Stateless
- Large text chunks
- Query-driven retrieval
- “What additional data can we provide to the LLM to help answer this question?”
Agent / User Memory
Purpose:
- Maintain continuity
- Personalization
- Learning user intent and preferences over time
Typical traits:
- Long-lived
- Highly structured
- Small, distilled facts
- “What can I provide to the LLM so it remembers this user?”
Think of it this way:
They can often use the same retrieval tools, but they serve different roles.
Where is the memory?
Option 1: Agent process memory
Any suitable data structure like a HashMap.
Suitable for cases where the Agent loop is short and no persistence is needed.
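For example, a plain dictionary keyed by user or session id (a rough sketch, not tied to any framework):

from collections import defaultdict

# In-process memory: gone when the agent process exits
session_memory = defaultdict(list)

session_memory["123"].append("Prefers concise python code")
session_memory["123"].append("Asked about retirement planning")

print(session_memory["123"])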
Option 2: Redis / Cache
Suitable for session info, recent conversation history, tool results cache, temporary state.
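A small sketch using the redis-py client, assuming a Redis instance on localhost; the key name and TTL are illustrative:

import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Keep recent conversation turns per session, expiring with the session
key = "session:123:history"
r.rpush(key, json.dumps({"role": "user", "content": "What is my risk tolerance?"}))
r.expire(key, 3600)  # drop the session state after an hour

recent_turns = [json.loads(t) for t in r.lrange(key, -10, -1)]  # last 10 turns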
Option 3: PostgreSQL/RDBMS
Suitable when you need durability, auditability, and explainability.
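A sketch using psycopg2 (any Postgres driver works); the connection string and table layout are illustrative, loosely mirroring the memory record shown further down:

import psycopg2  # assumption: psycopg2 driver; connection string is illustrative

conn = psycopg2.connect("dbname=agent_memory user=agent")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS user_memory (
        id         SERIAL PRIMARY KEY,
        user_id    TEXT NOT NULL,
        fact       TEXT NOT NULL,
        source     TEXT,
        created_at TIMESTAMPTZ DEFAULT now()
    )
""")

cur.execute(
    "INSERT INTO user_memory (user_id, fact, source) VALUES (%s, %s, %s)",
    ("123", "User prefers concise python code", "conversation_turn_5"),
)
conn.commit()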
Option 4: Vector databases
Suitable for semantic search.
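A rough sketch of what a vector store does for memory: embed each fact and return the facts closest to the query. The embed function below is a dummy stand-in; a real system would call an embedding model and a vector database such as pgvector or Chroma.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Dummy embedding for illustration only; replace with a real embedding model
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

facts = ["User is vegetarian", "User's risk tolerance is low"]
vectors = np.stack([embed(f) for f in facts])

def most_relevant(query: str, k: int = 1) -> list:
    q = embed(query)
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [facts[i] for i in np.argsort(-scores)[:k]]

print(most_relevant("What should the user eat?"))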
Option 5: AI memory tools
Such as LangGraph memory, LlamaIndex memory, and MemGPT. These try to make it easier for agents to store and retrieve memories.
Here is an example of data that might be stored in memory:
{
"user_id": "123",
"fact": "User prefers concise python code",
"source": "conversation_turn_5",
"timestamp": "2026-02-09"
}
The mental model for AI memory
Short term memory
This is about recent interactions, relevant mainly to the current topic being discussed. For example, the user prefers conservative answers.
Long term memory
This is stored externally, often in persistent storage. It is retrieved and inserted into the context selectively. For example, the user is a vegetarian, or the user's risk tolerance is low.
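One way to sketch the split; in practice the long-term facts would live in one of the stores discussed above rather than in a Python list:

class AgentMemory:
    def __init__(self, window: int = 10):
        self.recent_turns = []   # short term: rolling window of the current conversation
        self.facts = []          # long term: distilled facts, persisted externally in practice
        self.window = window

    def add_turn(self, turn: str) -> None:
        self.recent_turns.append(turn)
        self.recent_turns = self.recent_turns[-self.window:]

    def remember(self, fact: str) -> None:
        self.facts.append(fact)  # e.g. "User is vegetarian"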
Memory and the LLM
The LLM takes only messages as input. The agent has to read the data from memory and insert it into the message text. This is what is referred to as context (see the sketch after the list below).
You do not want to add large amounts of arbitrary data as context because:
- Text is converted to tokens, and token costs spiral
- LLM attention degrades with noise
- Latency increases
- Reasoning quality declines
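A sketch of that injection step. The message shape follows the common chat-completions format; the helper and its field values are illustrative.

def build_messages(user_message: str, facts: list, recent_turns: list) -> list:
    # Memory only reaches the model as text inside the message list
    context = "Known about this user:\n" + "\n".join(f"- {f}" for f in facts)
    return (
        [{"role": "system", "content": context}]
        + recent_turns[-6:]                              # keep the window small
        + [{"role": "user", "content": user_message}]
    )

messages = build_messages(
    "Suggest a retirement plan",
    facts=["Risk tolerance is low", "Prefers concise answers"],
    recent_turns=[{"role": "user", "content": "I want to retire in 15 years"}],
)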
Considerations
Memory != Raw History
Bad use: Here are the last 47 conversations ......
Better use: We were talking about my retirement goals, with this income and this many years until retirement.
Summarize and abstract to extract intelligence, as opposed to dumping large quantities of data.
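A sketch of that distillation step; llm here is a placeholder for your model client.

def distill(history: list, llm) -> str:
    # Compress raw turns into a few durable facts before storing them as memory
    prompt = (
        "Summarize the durable facts, goals and preferences from this conversation "
        "in at most three bullet points:\n" + "\n".join(history)
    )
    return llm(prompt)

# memory_entry = distill(last_47_turns, llm)   # store this, not the 47 raw turns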
In conclusion
AI memory is structured state, sometimes summarized, that is retrieved when needed and included in the LLM input as "context".
Conceptually, it is similar to RAG but they apply to different use cases.
Better and smaller contexts beat large contexts and large memory.
Agentic AI memory adds value only when:
- The system changes behavior (for the better) because of it
- It produces better responses, explanations, and reasoning
- It saves time
