What do people mean when they say "AI Memory"?
At their core, LLMs are stateless functions: you make a request with a prompt and some context data, and the model returns a response.
In real systems, AI memory usually means:
- Storing past interactions, user preferences, decisions, goals, or facts
- Retrieving the relevant parts later
- Feeding a compressed version back into the prompt
So yes — at its core:
Memory = save → retrieve → summarize → inject into context
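As a rough sketch of that loop, here it is in Python. The `llm()` call is a placeholder for whatever model you use, and the keyword-overlap "retrieval" is deliberately naive, standing in for a real vector search:

```python
# A minimal sketch of save -> retrieve -> summarize -> inject.
# `llm()` is a placeholder for your model call; retrieval here is
# naive keyword overlap, standing in for a real vector index.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

class Memory:
    def __init__(self):
        self.facts: list[str] = []

    def save(self, fact: str) -> None:
        self.facts.append(fact)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        q = set(query.lower().split())
        return sorted(self.facts, key=lambda f: -len(q & set(f.lower().split())))[:k]

def answer(memory: Memory, user_msg: str) -> str:
    # retrieve -> inject into context
    context = "\n".join(f"- {f}" for f in memory.retrieve(user_msg))
    reply = llm(f"Known facts:\n{context}\n\nUser: {user_msg}")
    # summarize -> save a distilled fact for future turns
    memory.save(llm(f"Distill one short fact worth remembering:\n{user_msg}"))
    return reply
```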
Nothing magical. But is that all? Doesn't it seem just like a regular cache? Read on.
Is this just RAG (Retrieval-Augmented Generation)?
RAG
Purpose:
- Bring external knowledge into the LLM
- Docs, PDFs, financial data, code, policies
Typical traits:
- Stateless
- Large text chunks
- Query-driven retrieval
- “What additional data can we provide to LLM to help answer this question?”
Agent / User Memory
Purpose:
- Maintain continuity
- Personalization
- Learning user intent and preferences over time
Typical traits:
- Long-lived
- Highly structured
- Small, distilled facts
- “What can I provide to LLM so it remembers this user?”
Think of it this way:
They can often use the same retrieval tools, but they serve different roles.
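One way to see the overlap: a single retrieval primitive can serve both roles, just over very different content. The `search()` function below is a hypothetical stand-in for any vector or keyword index:

```python
# Sketch: the same retrieval primitive serving both RAG and memory.
# `search()` is a stand-in for whatever vector/keyword index you use.

def search(index: list[str], query: str, k: int) -> list[str]:
    q = set(query.lower().split())
    return sorted(index, key=lambda t: -len(q & set(t.lower().split())))[:k]

# RAG: large external chunks, pulled in per question
doc_chunks = ["<chunk of the annual report>", "<chunk of the refund policy>"]
rag_context = search(doc_chunks, "What is the refund window?", k=5)

# Memory: small distilled facts about this user, pulled in per session
user_facts = ["prefers concise answers", "works in EU timezone", "is on the Pro plan"]
memory_context = search(user_facts, "draft a reply to this customer", k=3)
```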
Where is the memory?
Memory can live directly in the prompt or conversation context, which is suitable for cases where the agent loop is short and no persistence is needed. It can also live in dedicated memory layers such as LangGraph memory, LlamaIndex memory, or MemGPT, which try to make it easier for agents to store and retrieve it.
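A framework-agnostic sketch of those two options follows; the file-based store is an assumption made for illustration, not how any of those frameworks works internally:

```python
# Two places memory can live (framework-agnostic sketch).
import json
import pathlib

# 1) In the context window: fine for short agent loops, lost when the process ends.
history: list[dict] = []
history.append({"role": "user", "content": "I prefer metric units."})

# 2) In an external store: survives across sessions, retrieved on demand.
store = pathlib.Path("user_123_memory.json")
facts = json.loads(store.read_text()) if store.exists() else []
facts.append("prefers metric units")
store.write_text(json.dumps(facts))
```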
The mental model for AI memory
Memory and the LLM
You do not want to add large amounts of arbitrary data as context because:
- Text is converted to tokens, and token costs spiral
- LLM attention degrades with noise
- Latency increases
- Reasoning quality declines
Real Agentic Memory
To be useful in an agentic setting, what is stored in memory needs to evolve. Older or irrelevant data in memory needs to be "forgotten" or evicted based on intelligence, not standard algorithms like FIFO or LIFO. Updates and evictions need to happen based on recent interactions. If the historical information is too long but should not be evicted, it might need to be compressed.
Unlike regular SaaS, where memory can be considered "static", agentic memory has to be more dynamic. In the case of long-running agents, the quality of data in memory has to get better with interactions over time.
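Purely as a toy illustration of the idea, and not how any particular system does it, eviction and compression might look something like this (the scoring weights and the `llm()` call are placeholders):

```python
# Toy illustration of "intelligent" eviction: score each fact by how often
# it is used and how old it is, keep the strongest, and compress the rest
# into a summary fact instead of silently dropping them.
import time
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

@dataclass
class Fact:
    text: str
    created: float = field(default_factory=time.time)
    uses: int = 0

def maintain(facts: list[Fact], max_facts: int = 50) -> list[Fact]:
    if len(facts) <= max_facts:
        return facts
    now = time.time()

    def score(f: Fact) -> float:
        age_days = (now - f.created) / 86400
        return f.uses - 0.1 * age_days  # crude usefulness-vs-age tradeoff

    facts.sort(key=score, reverse=True)
    keep, stale = facts[:max_facts], facts[max_facts:]
    if stale:
        # compress what would otherwise be forgotten into one summary fact
        summary = llm("Summarize into one line:\n" + "\n".join(f.text for f in stale))
        keep.append(Fact(text=summary))
    return keep
```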
How exactly that can be implemented well is beyond the scope of this blog and could be a topic for a future one.
Considerations
Summarize and abstract to extract intelligence, as opposed to dumping large quantities of data.
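For example, a rough sketch of distilling an exchange into a few durable facts before storing anything, again with a placeholder `llm()` call:

```python
# Sketch of "summarize and abstract" before storing, rather than saving raw logs.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

raw_transcript = """
User: I'm moving the launch to March, and please stop suggesting Java examples.
Agent: Noted: March launch, and I'll default to Python going forward.
"""

# Store a handful of distilled facts, not the transcript itself.
distilled = llm(
    "Extract at most 3 durable facts or preferences from this exchange, "
    "one per line:\n" + raw_transcript
)
facts_to_store = [line.strip("- ").strip() for line in distilled.splitlines() if line.strip()]
```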
In conclusion
AI memory is structured state, sometimes summarized, that is retrieved when needed and included in the LLM input as "context".
Conceptually, it is similar to RAG, but the two apply to different use cases.
Better and smaller contexts beat large contexts and large memory.
Agentic AI memory adds value only when:
- The system changes behavior (for the better) because of it
- It produces better responses, explanations, and reasoning
- It saves time




