Monday, February 9, 2026

What is (Agentic) AI Memory?

I have seen a lot of posts on X and LinkedIn about the importance of agentic AI memory. What exactly is it? Is it just another name for RAG? How is it different from any other application memory? In this blog, I try to answer these questions.

What do people mean when they say "AI Memory"?


At their core, LLMs are stateless functions: you make a request with a prompt and some context data, and the model returns a response. There is no persistent memory inside an LLM (unless you fine-tune or retrain). Everything called "memory" today is external.

In real systems, AI memory usually means:

  • Storing past interactions, user preferences, decisions, goals, or facts
  • Retrieving relevant parts later
  • Feeding a compressed version back into the prompt

So yes — at its core:

Memory = save → retrieve → summarize → inject into context

Nothing magical.
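That pipeline can be sketched in a few lines. This is a toy illustration, not a real implementation: the store is a plain list and the retrieval is naive word overlap, standing in for a database and a proper retriever.

```python
# A minimal sketch of the save -> retrieve -> inject loop.
# The store and the matching logic are toy stand-ins.

memory_store: list[str] = []  # in a real system: a database or vector store

def save(fact: str) -> None:
    """Persist a distilled fact about the user or conversation."""
    memory_store.append(fact)

def retrieve(query: str) -> list[str]:
    """Toy retrieval: keep facts sharing at least one word with the query."""
    query_words = set(query.lower().split())
    return [f for f in memory_store if query_words & set(f.lower().split())]

def inject(query: str) -> str:
    """Build the prompt the LLM actually sees: retrieved context + question."""
    context = " ".join(retrieve(query))
    return f"Known about user: {context}\n\nQuestion: {query}"

save("User prefers concise python code")
prompt = inject("write some python for me")
```

The summarize step is omitted here; it would sit between retrieve and inject, compressing the retrieved facts before they enter the prompt.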

Is this just RAG (Retrieval Augmented Generation)?


They are related but not the same.
RAG (Retrieval Augmented Generation)

Purpose:

  • Bring external knowledge into the LLM
  • Docs, PDFs, financial data, code, policies

Typical traits:

  • Stateless
  • Large text chunks
  • Query-driven retrieval
  • “What additional data can we provide to the LLM to help answer this question?”

Agent / User Memory


Purpose:
  • Maintain continuity
  • Personalization
  • Learning user intent and preferences over time

Typical traits:

  • Long-lived
  • Highly structured
  • Small, distilled facts
  • “What can I provide to the LLM so it remembers this user?”

Think of it this way:

They can often use the same retrieval tools, but they serve different roles.

Where is the memory?


Option 1: Agent process memory

Any suitable in-process data structure, such as a HashMap or dict.
Suitable for cases where the agent loop is short and no persistence is needed.
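A sketch of this option: a plain dict keyed by user id. Everything here is illustrative and is lost when the process exits, which is exactly the trade-off this option makes.

```python
# Sketch: in-process agent memory as a plain dict, keyed by user id.
# Fine for a short-lived agent loop; nothing survives a restart.
from collections import defaultdict

process_memory: dict[str, list[str]] = defaultdict(list)

def remember(user_id: str, fact: str) -> None:
    process_memory[user_id].append(fact)

def recall(user_id: str) -> list[str]:
    return process_memory[user_id]

remember("123", "User prefers concise python code")
```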

Option 2: Redis / cache

Suitable for session info, recent conversation history, tool results cache, temporary state.
Option 3: PostgreSQL/RDBMS

Suitable when you need durability, auditability, and explainability.

Option 4: Vector databases

Suitable for semantic search.
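The core idea can be sketched without any vector database: embed each stored fact, embed the query, and return the nearest fact by cosine similarity. The `embed()` below is a toy bag-of-words stand-in for a real embedding model.

```python
# Sketch of semantic retrieval over memory. embed() is a toy
# bag-of-words substitute for a real embedding model + vector DB.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

facts = ["User prefers concise python code", "User risk tolerance is low"]
index = [(embed(f), f) for f in facts]  # precomputed embeddings

def search(query: str) -> str:
    """Return the stored fact most similar to the query."""
    q = embed(query)
    return max(index, key=lambda pair: cosine(pair[0], q))[1]
```

With a real vector database the mechanics differ (ANN indexes, persistence), but the retrieval contract is the same: nearest facts by embedding similarity.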

Option 5: AI memory tools

Such as LangGraph memory, LlamaIndex memory, and MemGPT. They try to make it easier for agents to store and retrieve memory.

Here is an example of data that might be stored in memory:

{
  "user_id": "123",
  "fact": "User prefers concise python code",
  "source": "conversation_turn_5",
  "timestamp": "2026-02-09"
}

The mental model for AI memory


Short term memory

This is about recent interactions, relevant mainly to the current topic being discussed. For example, the user prefers conservative answers.

Long term memory

This is stored externally, perhaps even to persistent storage. It is retrieved and inserted into context selectively. For example, the user is a vegetarian or the user's risk tolerance is low.
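The two tiers can be sketched side by side: short-term memory as a bounded window of recent turns that evicts itself, long-term memory as distilled facts that would live in persistent storage. Names and sizes here are illustrative.

```python
# Sketch of the two memory tiers. In practice long_term would be
# a database; the window size of 5 is arbitrary.
from collections import deque

short_term: deque[str] = deque(maxlen=5)  # recent turns, auto-evicted
long_term: list[str] = []                 # durable, distilled facts

def on_turn(message: str) -> None:
    short_term.append(message)

def promote(fact: str) -> None:
    """Distill something durable out of the conversation."""
    long_term.append(fact)

for i in range(7):
    on_turn(f"turn {i}")          # only the last 5 survive
promote("User is a vegetarian")   # survives across sessions
```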

Memory and the LLM


The LLM takes as input only messages. The agent has to read the data from memory and insert it into the message text. This is what people refer to as context.
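Concretely, "inserting memory into context" means rendering facts as text and placing them in the messages list, often as a system message. The message shape below follows the common chat-completion format; adapt it to your API.

```python
# Sketch: memory reaches the LLM only as text inside the messages list.

def build_messages(facts: list[str], user_message: str) -> list[dict]:
    context = "\n".join(f"- {f}" for f in facts)
    return [
        {"role": "system", "content": f"Known facts about this user:\n{context}"},
        {"role": "user", "content": user_message},
    ]

msgs = build_messages(["User prefers concise python code"], "Sort a list for me")
```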

You do not want to add large amounts of arbitrary data as context because:
  • Text is converted to tokens, and token costs spiral
  • LLM attention degrades with noise
  • Latency increases
  • Reasoning quality declines

Considerations


Memory != Raw History
Bad use: Here are the last 47 conversations ......
Better use: We were talking about my retirement goals, with this income and this many years to retirement.

Summarize and abstract to extract intelligence, as opposed to dumping a large quantity of data.

In conclusion


AI memory is structured state, sometimes summarized, that is retrieved when needed and included in the LLM input as "context".

Conceptually it is similar to RAG, but the two apply to different use cases.

Better and smaller contexts beat large contexts and large memory.

Agentic AI memory adds value only when:

  • The system changes behavior (for the better) because of it
  • It produces better responses, explanations, and reasoning
  • It saves time