Thursday, April 9, 2026
Why Deterministic AI Agents Are The Wrong Goal?
Monday, February 9, 2026
What is (Agentic) AI Memory?
What do people mean when they say "AI Memory"?
At their core, LLMs are stateless functions: you make a request with a prompt and some context data, and the model returns a response.
In real systems, AI memory usually means:
- Storing past interactions, user preferences, decisions, goals, or facts.
- Retrieving relevant parts later
- Feeding a compressed version back into the prompt
So yes — at its core:
Memory = save → retrieve → summarize → inject into context
Nothing magical. But is that all? It sounds just like a regular cache. Read on.
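The save → retrieve → summarize → inject loop can be sketched in a few lines of Python. Everything here is a toy stand-in: the list store, the tag-overlap scoring, and the join-based "summary" would be an embedding store, vector search, and an LLM call in a real system.

```python
memory_store = []  # list of {"fact": str, "tags": set} entries

def save(fact, tags):
    # Save: persist a small, distilled fact with some tags.
    memory_store.append({"fact": fact, "tags": set(tags)})

def retrieve(query_tags, top_k=3):
    # Retrieve: naive relevance = count of overlapping tags.
    # A real system would use vector similarity instead.
    scored = sorted(memory_store,
                    key=lambda m: len(m["tags"] & set(query_tags)),
                    reverse=True)
    return scored[:top_k]

def inject(prompt, query_tags):
    # Summarize + inject: compress relevant facts into the prompt.
    relevant = retrieve(query_tags)
    summary = "; ".join(m["fact"] for m in relevant)
    return f"Known about the user: {summary}\n\nUser: {prompt}"

save("prefers low-risk investments", ["finance", "risk"])
save("lives in Seattle", ["location"])
print(inject("Suggest a portfolio.", ["finance"]))
```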
Is this just RAG (Retrieval-Augmented Generation)?
RAG
Purpose:
- Bring external knowledge into the LLM
- Docs, PDFs, financial data, code, policies
Typical traits:
- Stateless retrieval per query
- Large text chunks
- Query-driven retrieval
- “What additional data can we provide to the LLM to help answer this question?”
Agent / User Memory
Purpose:
- Maintain continuity
- Personalization
- Learning user intent and preferences over time
Typical traits:
- Long-lived
- Highly structured
- Small, distilled facts
- “What can I provide to the LLM so it remembers this user?”
Think of it this way:
They can often use the same retrieval tools, but they serve different roles.
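To make the contrast concrete, here is a toy sketch of RAG and user memory sharing one retrieval function while storing very different payloads. The index names and the keyword-overlap scoring are hypothetical stand-ins for a real vector store.

```python
indexes = {
    # RAG: large, stateless document chunks, queried per question
    "docs": ["Refund policy: refunds are issued within 30 days of purchase",
             "Shipping policy: orders ship within 2 business days"],
    # Memory: small, distilled facts about one specific user
    "user_42_memory": ["prefers email over phone",
                       "asked about refunds twice this month"],
}

def search(index_name, query, top_k):
    # Score by shared lowercase words; a real system would use embeddings.
    q = set(query.lower().split())
    scored = sorted(indexes[index_name],
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return scored[:top_k]

# Same retrieval code, different role:
rag_hits = search("docs", "what is the refund policy", top_k=1)
memory_hits = search("user_42_memory", "refund history for this user", top_k=1)
```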
Where is the memory?
Keeping memory inside the agent process itself is suitable for cases where the agent loop is short and no persistence is needed. Frameworks such as LangGraph memory, LlamaIndex memory, and MemGPT try to make it easier for agents to store and retrieve memories.
The mental model for AI memory
Memory and the LLM
You do not want to add large amounts of arbitrary data as context because:
- Text is converted to tokens, and token costs spiral
- LLM attention degrades with noise
- Latency increases
- Reasoning quality declines
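One common mitigation is to cap the context at a token budget. A rough sketch follows, using the common 4-characters-per-token rule of thumb rather than a real tokenizer; the helper names are made up for illustration.

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

def fit_to_budget(snippets, budget_tokens):
    """Keep snippets (assumed sorted by relevance) until the budget is hit."""
    kept, used = [], 0
    for s in snippets:
        cost = estimate_tokens(s)
        if used + cost > budget_tokens:
            break  # stop injecting once the budget would be exceeded
        kept.append(s)
        used += cost
    return kept

snippets = ["a" * 400, "b" * 400, "c" * 400]  # ~100 tokens each
kept = fit_to_budget(snippets, budget_tokens=250)
```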
Real Agentic Memory
To be useful in an agentic way, what is stored in memory needs to evolve. Older or irrelevant data in memory needs to be "forgotten" or evicted intelligently, not by standard algorithms like FIFO, LIFO, etc. Updates and evictions need to happen based on recent interactions. If historical information is too long but should not be evicted, it may need to be compressed.
Agentic systems require more dynamic memory evolution than typical CRUD applications. For long-running agents, the quality of the data in memory has to improve with interactions over time.
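As a toy illustration only, eviction based on usefulness rather than arrival order might look like the following. The recency-plus-frequency score is a made-up heuristic; a production agent might instead ask an LLM to judge what is worth keeping.

```python
import time

class MemoryItem:
    def __init__(self, fact):
        self.fact = fact
        self.created = time.time()
        self.hits = 0  # incremented each time this memory is retrieved

def evict(items, keep):
    # Unlike FIFO, rank by a usefulness score: frequently retrieved,
    # recent facts win; stale, unused facts are forgotten first.
    def score(m):
        age_hours = (time.time() - m.created) / 3600.0
        return m.hits - age_hours
    return sorted(items, key=score, reverse=True)[:keep]

items = [MemoryItem("old, never used"), MemoryItem("frequently used")]
items[0].created -= 7200   # pretend this one is two hours old
items[1].hits = 5          # pretend this one was retrieved five times
kept = evict(items, keep=1)
```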
How exactly that can be implemented is beyond the scope of this blog and could be a topic for a future one.
Considerations
Summarize and abstract to extract intelligence, as opposed to dumping large quantities of data.
In conclusion
Conceptually, it is similar to RAG, but they apply to different use cases.
Better and smaller contexts beat large contexts and large memory.
Agentic AI Memory adds value only when:
- The system changes behavior (for the better) because of it
- It produces better responses, explanations, and reasoning
- It saves time
These ideas are not purely theoretical. While building Vestra — an AI agent focused on personal financial planning and modeling — I’ve had to think deeply about what should be remembered, what should be abstracted, and what should be discarded. In financial reasoning especially, raw history is far less useful than structured, evolving state.
But yes, Agentic memory will be different from what we know as memory in regular apps — in the ways it is updated, evicted, and retrieved.
Saturday, September 13, 2025
What Does Adding AI To Your Product Even Mean?
Introduction
I have been asked this question multiple times: "My management sent out a directive to all teams to add AI to the product, but I have no idea what that means."
In this blog I discuss what adding AI actually entails, moving beyond the hype to practical applications and some things you might try.
At its core, adding AI to a product means using an AI model, either the more popular large language model (LLM) or a traditional ML model, to either:
- Predict answers
- Generate new data: text, images, audio, etc.
The effect is that it enables the product to:
- Do a better job of responding to queries
- Automate repetitive tasks
- Personalize responses
- Extract insights
- Reduce manual labor
It's about making your product smarter, more efficient, and more valuable by giving it capabilities it didn't have before.
In any domain where there is a huge body of published knowledge (programming, healthcare) or vast quantities of data (e-commerce, financial services, health, manufacturing, etc.), too large for the human brain to comprehend, AI has a place and will outperform what we currently do.
So how do you go about adding AI ?
1. Requirements
2. Model
The recent explosion of interest in AI is largely due to Large Language Models (LLMs) like ChatGPT. At its core, the LLM is a text prediction engine: give it some text and it will give you the text that is likely to follow.
But beyond text generation, LLMs have been trained on a vast amount of published digital data, and they retain associations between pieces of text. On top of that, they are trained with real-world examples of questions and answers. For example, the reason they do such a good job at generating "programming code" is that they are trained on real source code from GitHub repositories.
What model to use?
The choices are:
- Commercial LLMs like ChatGPT, Claude, Gemini, etc.
- Open-source LLMs like Llama, Mistral, DeepSeek, etc.
- Traditional ML models
3. Agent
The agent is the component that:
- Accepts requests either from a UI or another service
- Makes requests to the model on behalf of your system
- Makes multiple API calls to systems to fetch data
- May search the internet
- May save state to a database at various times
- In the end, returns a response or starts some process to finish a task
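The steps above can be sketched as a minimal loop. Here `call_model`, `fetch_data`, and the action protocol are hypothetical stand-ins for a real LLM client and your internal APIs; the hard-coded responses just make the sketch self-contained.

```python
def call_model(prompt):
    # Stand-in for an LLM call: pretend the model asks for account
    # data once, then answers once it sees the balance in context.
    if "balance: " in prompt:
        return {"action": "respond", "text": "Your balance is $100."}
    return {"action": "fetch", "source": "accounts"}

def fetch_data(source):
    # Stand-in for an API call to an internal system.
    return "balance: $100"

def handle_request(user_request, max_steps=5):
    context = user_request
    for _ in range(max_steps):
        step = call_model(context)
        if step["action"] == "fetch":
            context += "\n" + fetch_data(step["source"])  # tool call
        else:
            return step["text"]  # final response to the UI or service
    return "Gave up after too many steps."

answer = handle_request("What is my balance?")
```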
4. Data pipeline
A generic AI model can only do so much. Even without additional training, just adding your data to the prompts can yield better results.
The data pipeline is what makes the data in your databases, logs, ticket systems, github, Jira etc available to the models and agents.
- get the data from source
- clean it
- format it
- transform it
- use it in either prompts or to further train the model
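The pipeline steps above, in miniature; the ticket records and the final prompt shape are made up for illustration.

```python
# get: raw records pulled from a hypothetical ticket system
raw_tickets = [
    {"id": 1, "body": "  App crashes on login!!  ", "status": "open"},
    {"id": 2, "body": None, "status": "closed"},  # dirty record
    {"id": 3, "body": "Slow search results", "status": "open"},
]

# clean: drop records with missing bodies
cleaned = [t for t in raw_tickets if t["body"]]

# format / transform: normalize whitespace, keep only open tickets
formatted = [t["body"].strip() for t in cleaned if t["status"] == "open"]

# use: inject into a prompt (instead of, or before, further training)
prompt = ("Summarize the top customer issues:\n"
          + "\n".join(f"- {body}" for body in formatted))
```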
5. Monitoring
Now let us see how these concepts translate into some very simple real-world applications across different industries.
Examples
1. Healthcare: Enhancing Diagnostics and Patient Experience
Adding AI can mean:
Personalized Treatment Pathways: An AI Agent can analyze vast amounts of research papers, clinical trial data, and individual patient responses to suggest the most effective treatment plan tailored to a specific patient's profile.
Example: For a person with high cholesterol, an AI agent can come up with a personalized diet and exercise plan.
2. Finance: Personalized Investing
Adding AI could mean:
Personalized Financial Advice: Here, an AI Agent can serve as an "advisor," offering highly tailored investment portfolios and financial planning advice.
Example: A banking app's AI agent uses an LLM to understand your financial goals and then uses its "tools" to connect to your accounts, pull real-time market data, and recommend trades on your behalf. It can then use its LLM to explain in simple terms why it made a specific trade or rebalanced your portfolio.
3. E-commerce: Customer Experience
Adding AI could mean:
Personalized shopping: AI models can find the right product at the right price with the right characteristics for the user's requirements.
Example: Instead of me shopping and comparing for hours, AI does it for me and makes a recommendation on the final product to purchase.
In Conclusion
Adding AI to your product to make it better means using the proven power of AI models:
- To better answer customer requests with insights
- To automate repetitive, time-consuming tasks
- To make predictions that were previously hard
- To gain insights into vast bodies of knowledge
Start small. Focus on one specific business problem you want to solve, and build from there.
Thursday, August 28, 2025
The Unsung Heroes Behind Your AI Coding Assistant
Meet the Code Generation Champions:
- StarCoder - Trained on 80+ programming languages from GitHub repos, this open-source model excels at code completion and generation
- CodeT5 - Google's encoder-decoder model that understands code structure and can translate between languages
- InCoder - Meta's bidirectional model that can fill in code gaps, not just complete from left to right
- CodeGen - Salesforce's autoregressive model trained on both natural language and code
- Codex (OpenAI) - The foundation behind GitHub Copilot, though now evolved into GPT-4 variants
What makes these different from general LLMs?
- Trained on massive code repositories (billions of lines)
- Understand syntax, semantics, and programming patterns
- Can maintain context across entire codebases
- Specialized in code-specific tasks like debugging, refactoring, and documentation
The real game-changer? Most of these models are open-source, democratizing access to powerful coding assistance beyond just the big tech companies.

