AI Engineering

LangChain + HatiData: Persistent Memory in 5 Minutes

HatiData Team · 5 min read

Give Your LangChain Agent a Brain

LangChain agents lose their memory when the process restarts. ConversationBufferMemory stores conversation history in RAM. ConversationSummaryMemory compresses it, but still lives in memory. When the agent stops — whether from a deployment, a crash, or a simple restart — everything it learned disappears.

For prototypes, this is fine. For production agents that interact with customers across days, weeks, or months, this is a serious limitation. A customer who explained their preferences yesterday should not have to repeat themselves today.

This tutorial shows how to connect LangChain to HatiData for persistent memory that survives restarts, supports semantic search across all past interactions, and scales to millions of stored memories. Total setup time: under 5 minutes.

Prerequisites

You need:

  • Python 3.10+
  • Docker (for running HatiData locally)
  • An OpenAI API key (for the LangChain LLM)

Step 1: Install and Initialize HatiData

Install the HatiData CLI and start a local instance:

bash
curl -fsSL https://hatidata.com/install.sh | sh
hati init

This starts the HatiData proxy and the MCP server. Verify it is running:

bash
psql -h localhost -p 5439 -U admin -c "SELECT 1"

If you see a result of 1, HatiData is ready.

Step 2: Install Python Dependencies

bash
pip install langchain langchain-openai hatidata-agent

The hatidata-agent package provides the Python client that connects to HatiData's Postgres-compatible interface.

Step 3: Store Memories

Every interaction your agent has can be persisted as a memory. Memories are stored with content (the text), a namespace (for organizing and isolating), and an automatically generated embedding for semantic search.

python
from hatidata_agent import HatiDataAgent

agent = HatiDataAgent(host="localhost", port=5439, user="admin")

# Store some memories
agent.execute("SELECT store_memory('User prefers short, direct answers', 'user-prefs')")
agent.execute("SELECT store_memory('User works in financial services', 'user-prefs')")
agent.execute("SELECT store_memory('User asked about SOC 2 compliance last week', 'user-history')")
agent.execute("SELECT store_memory('User is evaluating data platforms for Q3', 'user-context')")

Each store_memory() call persists the content to the query engine and dispatches an embedding job. The embedding is computed asynchronously and stored alongside the memory for semantic search.
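Memory content often contains apostrophes, and store_memory() is invoked through inline SQL, so it is worth escaping quotes before interpolating. A small local helper (this is a convenience sketch, not part of the hatidata-agent package) that doubles single quotes, which is standard SQL string escaping:

```python
def store_memory_sql(content: str, namespace: str) -> str:
    """Build a store_memory() call, doubling single quotes (standard SQL escaping)."""
    def esc(s: str) -> str:
        return s.replace("'", "''")
    return f"SELECT store_memory('{esc(content)}', '{esc(namespace)}')"

# Usage against a running HatiData instance:
# agent.execute(store_memory_sql("User said: it's urgent", "user-prefs"))
```

Without the escaping, a memory like `it's urgent` would terminate the SQL string literal early and the statement would fail.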

Step 4: Retrieve with Semantic Search

The power of persistent memory is not just storage — it is retrieval by meaning. Instead of exact keyword matching, semantic_match() finds memories that are conceptually related to your query:

python
memories = agent.query("""
    SELECT content, namespace, created_at
    FROM _hatidata_memory.memories
    WHERE semantic_match(embedding, 'how does this user like responses formatted', 0.65)
    ORDER BY semantic_rank(embedding, 'how does this user like responses formatted') DESC
    LIMIT 3
""")

for row in memories:
    print(f"[{row['namespace']}] {row['content']}")

The query above finds memories semantically related to "how does this user like responses formatted" — even though none of the stored memories contain those exact words. The 0.65 threshold filters out low-relevance matches. The semantic_rank() function orders results by cosine similarity.
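Because the query text appears twice (once in semantic_match() for filtering, once in semantic_rank() for ordering), a small builder keeps the two in sync. The function below is a local convenience sketch; the parameter names are our own, not part of HatiData:

```python
def memory_search_sql(query: str, threshold: float = 0.65, limit: int = 3) -> str:
    """Build the semantic-search SQL, keeping semantic_match and semantic_rank in sync."""
    q = query.replace("'", "''")  # standard SQL single-quote escaping
    return (
        "SELECT content, namespace, created_at "
        "FROM _hatidata_memory.memories "
        f"WHERE semantic_match(embedding, '{q}', {threshold}) "
        f"ORDER BY semantic_rank(embedding, '{q}') DESC "
        f"LIMIT {limit}"
    )

# Usage: rows = agent.query(memory_search_sql("how does this user like responses formatted"))
```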

Step 5: Wire Into a LangChain Chain

Now, combine HatiData memory retrieval with a LangChain conversational chain:

python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
from hatidata_agent import HatiDataAgent

llm = ChatOpenAI(model="gpt-4o", temperature=0.3)
hati = HatiDataAgent(host="localhost", port=5439, user="admin")

def _escape(text: str) -> str:
    # Double single quotes so user input cannot break (or inject into) the inline SQL
    return text.replace("'", "''")

def chat_with_memory(user_message: str, user_id: str) -> str:
    safe_message = _escape(user_message)
    safe_user_id = _escape(user_id)

    # 1. Retrieve relevant memories
    memories = hati.query(f"""
        SELECT content FROM _hatidata_memory.memories
        WHERE namespace = '{safe_user_id}'
          AND semantic_match(embedding, '{safe_message}', 0.6)
        ORDER BY semantic_rank(embedding, '{safe_message}') DESC
        LIMIT 5
    """)

    memory_context = "\n".join(f"- {m['content']}" for m in memories)

    # 2. Build prompt with memory context
    messages = [
        SystemMessage(content=f"""You are a helpful assistant.
Here is what you remember about this user:
{memory_context if memory_context else 'No previous memories found.'}

Use these memories to personalize your response."""),
        HumanMessage(content=user_message),
    ]

    # 3. Get LLM response
    response = llm.invoke(messages)

    # 4. Store the interaction as a new memory
    summary = f"User asked: {user_message[:100]}. Assistant responded about: {response.content[:100]}"
    hati.execute(f"SELECT store_memory('{_escape(summary)}', '{safe_user_id}')")

    return response.content

    return response.content

# Usage
response = chat_with_memory(
    "What data platform should I choose for our compliance requirements?",
    "user-42"
)
print(response)

Each conversation turn retrieves relevant past memories, includes them in the system prompt, generates a response, and stores a summary of the interaction as a new memory. Over time, the agent builds a rich understanding of each user. Note that this chain keys memories by a per-user namespace ('user-42'), while Step 3 used topical namespaces like 'user-prefs'; either scheme works, as long as storage and retrieval use the same namespace.
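The prompt-assembly step of chat_with_memory() can be pulled out as a pure function, which makes it easy to unit-test without a database or an LLM. A sketch, using the same row shape (dicts with a 'content' key) as the retrieval query:

```python
def build_system_prompt(memories: list) -> str:
    """Format retrieved memory rows into the system prompt used by the chain."""
    memory_context = "\n".join(f"- {m['content']}" for m in memories)
    return (
        "You are a helpful assistant.\n"
        "Here is what you remember about this user:\n"
        + (memory_context if memory_context else "No previous memories found.")
        + "\n\nUse these memories to personalize your response."
    )
```

Isolating this step also makes it straightforward to experiment with prompt wording or memory ordering without touching the retrieval or storage logic.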

What You Get

By connecting LangChain to HatiData, your agent gains:

  • Persistent memory that survives process restarts, deployments, and crashes
  • Semantic search across all past interactions — find relevant memories by meaning, not just keywords
  • Namespace isolation for multi-tenant deployments — each user's memories are logically separated
  • Full SQL queryability for analytics and debugging — run ad-hoc queries against the memory store to understand agent behavior
  • Scalability to millions of memories with sub-second retrieval times
  • Audit trails — every memory storage and retrieval is logged with agent identity and timestamp
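Because memories live in ordinary tables, the SQL-queryability point above can be exercised directly for debugging. A sketch of an ad-hoc analytics query (assumes only the `_hatidata_memory.memories` schema shown in the earlier steps):

```python
# Ad-hoc analytics over the memory store: how many memories per namespace,
# and when each namespace was last written to.
ANALYTICS_SQL = (
    "SELECT namespace, count(*) AS memories, max(created_at) AS last_written "
    "FROM _hatidata_memory.memories "
    "GROUP BY namespace "
    "ORDER BY memories DESC"
)

# Against a running instance:
# from hatidata_agent import HatiDataAgent
# agent = HatiDataAgent(host="localhost", port=5439, user="admin")
# for row in agent.query(ANALYTICS_SQL):
#     print(row["namespace"], row["memories"], row["last_written"])
```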

Next Steps

This tutorial covered single-agent memory. For multi-agent systems where agents share knowledge, check out the CrewAI multi-agent shared memory cookbook in the HatiData documentation. For production deployments with access control and encryption, see the governance guide.
