LangChain + HatiData: Persistent Memory in 5 Minutes
Give Your LangChain Agent a Brain
LangChain agents lose their memory when the process restarts. ConversationBufferMemory stores conversation history in RAM. ConversationSummaryMemory compresses it, but the summary still lives in RAM. When the agent stops — whether from a deployment, a crash, or a simple restart — everything it learned disappears.
For prototypes, this is fine. For production agents that interact with customers across days, weeks, or months, this is a serious limitation. A customer who explained their preferences yesterday should not have to repeat themselves today.
This tutorial shows how to connect LangChain to HatiData for persistent memory that survives restarts, supports semantic search across all past interactions, and scales to millions of stored memories. Total setup time: under 5 minutes.
Prerequisites
You need:
- Python 3.10+
- Docker (for running HatiData locally)
- An OpenAI API key (for the LangChain LLM)
Step 1: Install and Initialize HatiData
Install the HatiData CLI and start a local instance:
curl -fsSL https://hatidata.com/install.sh | sh
hati init

This starts the HatiData proxy and the MCP server. Verify it is running:
psql -h localhost -p 5439 -U admin -c "SELECT 1"

If you see a result of 1, HatiData is ready.
Step 2: Install Python Dependencies
pip install langchain langchain-openai hatidata-agent

The hatidata-agent package provides the Python client that connects to HatiData's Postgres-compatible interface.
Step 3: Store Memories
Every interaction your agent has can be persisted as a memory. Memories are stored with content (the text), a namespace (for organizing and isolating), and an automatically generated embedding for semantic search.
from hatidata_agent import HatiDataAgent
agent = HatiDataAgent(host="localhost", port=5439, user="admin")
# Store some memories
agent.execute("SELECT store_memory('User prefers short, direct answers', 'user-prefs')")
agent.execute("SELECT store_memory('User works in financial services', 'user-prefs')")
agent.execute("SELECT store_memory('User asked about SOC 2 compliance last week', 'user-history')")
agent.execute("SELECT store_memory('User is evaluating data platforms for Q3', 'user-context')")

Each store_memory() call persists the content to the query engine and dispatches an embedding job. The embedding is computed asynchronously and stored alongside the memory for semantic search.
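Because embeddings are computed asynchronously, a memory stored a moment ago may not yet be visible to semantic search. If a workflow needs read-after-write behavior, one option is to poll until the embeddings land. A minimal stdlib sketch — the wait_for_embeddings helper and its callback are ours, not part of hatidata-agent, and it assumes a populated embedding column signals readiness:

```python
import time
from typing import Callable

def wait_for_embeddings(count_ready: Callable[[], int], expected: int,
                        timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll count_ready() until it reports `expected` embedded memories,
    or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if count_ready() >= expected:
            return True
        time.sleep(interval)
    return False

# With a live agent, the callback could count rows whose embedding is set:
# ready = lambda: agent.query(
#     "SELECT count(*) AS n FROM _hatidata_memory.memories "
#     "WHERE namespace = 'user-prefs' AND embedding IS NOT NULL"
# )[0]['n']
# wait_for_embeddings(ready, expected=2)
```

Keeping the storage call itself synchronous while only the embedding is deferred means writes stay fast; the poll is only needed when you query immediately after writing.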
Step 4: Retrieve with Semantic Search
The power of persistent memory is not just storage — it is retrieval by meaning. Instead of exact keyword matching, semantic_match() finds memories that are conceptually related to your query:
memories = agent.query("""
SELECT content, namespace, created_at
FROM _hatidata_memory.memories
WHERE semantic_match(embedding, 'how does this user like responses formatted', 0.65)
ORDER BY semantic_rank(embedding, 'how does this user like responses formatted') DESC
LIMIT 3
""")
for row in memories:
    print(f"[{row['namespace']}] {row['content']}")

The query above finds memories semantically related to "how does this user like responses formatted" — even though none of the stored memories contain those exact words. The 0.65 threshold filters out low-relevance matches. The semantic_rank() function orders results by cosine similarity.
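When the query text comes from user input, interpolating it into SQL by hand is error-prone: a single quote in the text would break out of the string literal. A small builder that doubles single quotes keeps the call safe — a sketch assuming HatiData accepts standard SQL string literals (the helper name is ours):

```python
from typing import Optional

def semantic_search_sql(query_text: str, namespace: Optional[str] = None,
                        threshold: float = 0.65, limit: int = 3) -> str:
    """Build a semantic-search query against _hatidata_memory.memories,
    doubling single quotes so user text cannot escape the SQL literal."""
    q = query_text.replace("'", "''")
    where = f"semantic_match(embedding, '{q}', {threshold})"
    if namespace is not None:
        ns = namespace.replace("'", "''")
        where = f"namespace = '{ns}' AND {where}"
    return (
        "SELECT content, namespace, created_at "
        "FROM _hatidata_memory.memories "
        f"WHERE {where} "
        f"ORDER BY semantic_rank(embedding, '{q}') DESC "
        f"LIMIT {int(limit)}"
    )

# agent.query(semantic_search_sql("user's formatting preferences", "user-prefs"))
```

Coercing limit through int() also prevents anything other than a number from reaching the LIMIT clause.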
Step 5: Wire Into a LangChain Chain
Now, combine HatiData memory retrieval with a LangChain conversational chain:
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage
from hatidata_agent import HatiDataAgent
llm = ChatOpenAI(model="gpt-4o", temperature=0.3)
hati = HatiDataAgent(host="localhost", port=5439, user="admin")
def chat_with_memory(user_message: str, user_id: str) -> str:
    # Double single quotes so user text is safe inside SQL string literals
    safe_message = user_message.replace("'", "''")
    safe_id = user_id.replace("'", "''")

    # 1. Retrieve relevant memories
    memories = hati.query(f"""
        SELECT content FROM _hatidata_memory.memories
        WHERE namespace = '{safe_id}'
          AND semantic_match(embedding, '{safe_message}', 0.6)
        ORDER BY semantic_rank(embedding, '{safe_message}') DESC
        LIMIT 5
    """)
    memory_context = "\n".join(f"- {m['content']}" for m in memories)

    # 2. Build the prompt with memory context
    messages = [
        SystemMessage(content=f"""You are a helpful assistant.
Here is what you remember about this user:
{memory_context if memory_context else 'No previous memories found.'}
Use these memories to personalize your response."""),
        HumanMessage(content=user_message),
    ]

    # 3. Get the LLM response
    response = llm.invoke(messages)

    # 4. Store the interaction as a new memory
    summary = f"User asked: {user_message[:100]}. Assistant responded about: {response.content[:100]}"
    safe_summary = summary.replace("'", "''")
    hati.execute(f"SELECT store_memory('{safe_summary}', '{safe_id}')")
    return response.content

# Usage
response = chat_with_memory(
    "What data platform should I choose for our compliance requirements?",
    "user-42",
)
print(response)

Each conversation turn retrieves relevant past memories, includes them in the system prompt, generates a response, and stores a summary of the interaction as a new memory. Over time, the agent builds a rich understanding of each user.
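The one-line summary that gets interpolated into store_memory() deserves care on its own: it should be bounded in length and SQL-safe. A small stdlib helper, factored out for reuse (the name and truncation limit are ours):

```python
def make_summary(user_message: str, reply: str, max_len: int = 100) -> str:
    """Compose a bounded, SQL-safe one-line memory of a conversation turn."""
    summary = (f"User asked: {user_message[:max_len]}. "
               f"Assistant responded about: {reply[:max_len]}")
    # Double single quotes so the summary is safe inside a SQL string literal
    return summary.replace("'", "''")

# hati.execute(f"SELECT store_memory('{make_summary(msg, resp.content)}', '{user_id}')")
```

For higher-quality memories, the truncated slices could be replaced with an LLM-generated abstract of the turn, at the cost of an extra model call.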
What You Get
By connecting LangChain to HatiData, your agent gains:
- Persistent memory that survives process restarts, deployments, and crashes
- Semantic search across all past interactions — find relevant memories by meaning, not just keywords
- Namespace isolation for multi-tenant deployments — each user's memories are logically separated
- Full SQL queryability for analytics and debugging — run ad-hoc queries against the memory store to understand agent behavior
- Scalability to millions of memories with sub-second retrieval times
- Audit trails — every memory storage and retrieval is logged with agent identity and timestamp
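Since the memory store is a regular SQL table, ad-hoc analytics need nothing beyond psql. For example, to see how memories accumulate per namespace over time — using only the columns shown earlier (content, namespace, created_at, embedding); any other schema details are assumptions:

```sql
-- Memories written per namespace per day, newest first
SELECT namespace,
       date_trunc('day', created_at) AS day,
       count(*)                      AS memories,
       count(embedding)              AS embedded   -- rows whose embedding is already computed
FROM _hatidata_memory.memories
GROUP BY 1, 2
ORDER BY day DESC, namespace;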
Next Steps
This tutorial covered single-agent memory. For multi-agent systems where agents share knowledge, check out the CrewAI multi-agent shared memory cookbook in the HatiData documentation. For production deployments with access control and encryption, see the governance guide.