AI Engineering

LangChain + HatiData: Persistent Memory Deep Dive

HatiData Team · 8 min read

Why LangChain Needs Persistent Memory

LangChain's built-in memory classes — ConversationBufferMemory, ConversationSummaryMemory, ConversationEntityMemory — all share a fundamental limitation: they store state in Python objects. When the process restarts, the memory is gone. When you deploy a new version, the memory is gone. When the container scales down, the memory is gone.

For prototype agents, this is acceptable. For production agents that maintain long-running relationships with customers, accumulate domain knowledge, and build contextual understanding over weeks and months, ephemeral memory is a dealbreaker.

HatiData's LangChain integration replaces ephemeral memory with persistent, searchable, namespace-isolated memory that survives restarts, scales to millions of entries, and supports both semantic and SQL-based retrieval.

The Integration Components

The langchain-hatidata package provides three main components:

  1. HatiDataMemory — A LangChain BaseMemory implementation that stores and retrieves conversation memories from HatiData
  2. HatiDataVectorStore — A LangChain VectorStore implementation backed by HatiData's hybrid SQL + vector search
  3. HatiDataToolkit — A LangChain Toolkit with 4 tools for direct agent interaction with HatiData

Installation

bash
pip install langchain-hatidata

The package requires a running HatiData instance (local or cloud) and an API key.

HatiDataMemory: Drop-In Replacement

HatiDataMemory implements LangChain's BaseMemory interface, making it a drop-in replacement for ConversationBufferMemory and similar classes. The key difference is that every memory is persisted to HatiData and searchable across sessions.

python
from langchain_hatidata import HatiDataMemory
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain

# Create persistent memory
memory = HatiDataMemory(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_your_key",
    namespace="customer-support",
    memory_key="chat_history",
    return_messages=True,
    k=10,  # Number of relevant memories to retrieve
)

# Use in a conversation chain
llm = ChatOpenAI(model="gpt-4o")
chain = ConversationChain(llm=llm, memory=memory)

# First session
response = chain.predict(input="My order #1234 arrived damaged")
# Memory stored: "User reported damaged order #1234"

# ... process restarts, days pass ...

# New session — memory persists
response = chain.predict(input="Any update on my issue?")
# HatiData retrieves: "User reported damaged order #1234" from previous session
# Agent responds with context from the previous interaction

The k parameter controls how many relevant memories are retrieved for each conversation turn. Unlike ConversationBufferMemory, which includes the entire conversation history and grows without bound, HatiDataMemory retrieves only the most relevant memories using semantic search. This keeps the context window focused and efficient.

How It Works Under the Hood

On each conversation turn:

  1. Load — memory.load_memory_variables() embeds the current input, performs a hybrid search against HatiData, and returns the top-K relevant memories
  2. Save — memory.save_context() stores a summary of the interaction (input + output) as a new memory in HatiData under the configured namespace

The semantic search in step 1 means that the agent retrieves memories by relevance, not recency. A conversation about order #1234 from three weeks ago is retrieved when the user asks about "my damaged order," even if hundreds of other interactions have happened since then.
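This load/save cycle can be sketched with a toy in-process store. Here a bag-of-words cosine similarity stands in for real embeddings and a plain Python list stands in for HatiData's backend; the class and method names mirror the LangChain memory interface but everything else is illustrative, not part of the langchain-hatidata API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyPersistentMemory:
    """Sketch of the load/save cycle: save stores a summary of each
    turn; load returns the top-k memories by relevance, not recency."""

    def __init__(self, k: int = 3):
        self.k = k
        self.store: list[tuple[Counter, str]] = []  # (embedding, memory text)

    def save_context(self, user_input: str, output: str) -> None:
        summary = f"User: {user_input} | Agent: {output}"
        self.store.append((embed(summary), summary))

    def load_memory_variables(self, user_input: str) -> list[str]:
        q = embed(user_input)
        ranked = sorted(self.store, key=lambda m: cosine(q, m[0]), reverse=True)
        return [text for _, text in ranked[: self.k]]

mem = ToyPersistentMemory(k=1)
mem.save_context("My order #1234 arrived damaged", "Sorry, filing a claim")
mem.save_context("What's the weather like", "Sunny today")
# Retrieves the damaged-order memory despite the unrelated turn in between.
print(mem.load_memory_variables("any update on my damaged order"))
```

The real integration does the same two steps per turn, only with learned embeddings and a durable store, which is why an old memory resurfaces whenever the new input is semantically close to it.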

HatiDataVectorStore: Hybrid Retrieval

The HatiDataVectorStore class implements LangChain's VectorStore interface, providing a familiar API for similarity search while leveraging HatiData's hybrid SQL + vector architecture.

python
from langchain_hatidata import HatiDataVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
vectorstore = HatiDataVectorStore(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_your_key",
    namespace="knowledge-base",
    embedding=OpenAIEmbeddings(),  # Or any LangChain embedding model
)

# Add documents
vectorstore.add_texts(
    texts=["HatiData supports per-second billing with auto-suspend...",
           "Branch isolation uses schema-level isolation...",
           "The CoT ledger uses cryptographic hash chaining..."],
    metadatas=[
        {"source": "docs", "topic": "billing"},
        {"source": "docs", "topic": "branching"},
        {"source": "docs", "topic": "auditing"},
    ],
)

# Similarity search
results = vectorstore.similarity_search(
    "How does billing work?",
    k=3,
    filter={"source": "docs"},
)

The key advantage over a standalone vector database is that metadata filtering happens in the HatiData engine with full SQL capability. You can filter by date ranges, join with other tables, aggregate results, and combine structured filters with semantic ranking — all through the VectorStore interface.
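The filter-then-rank pattern can be sketched with stdlib sqlite3: structured predicates (source, date range) run as SQL, and semantic ranking is applied only to the surviving rows. The schema, the two-dimensional "embedding", and the helper names below are all illustrative, not HatiData's actual internals:

```python
import json
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE docs (
    text TEXT, source TEXT, created TEXT, embedding TEXT)""")

def toy_embed(text: str) -> list[float]:
    # Illustrative 2-d "embedding": text length and vowel ratio.
    vowels = sum(c in "aeiou" for c in text.lower())
    return [len(text) / 100.0, vowels / max(len(text), 1)]

rows = [
    ("Per-second billing with auto-suspend", "docs", "2025-01-10"),
    ("Branch isolation uses schema-level isolation", "docs", "2024-06-01"),
    ("Quarterly revenue report", "finance", "2025-01-12"),
]
for text, source, created in rows:
    conn.execute("INSERT INTO docs VALUES (?, ?, ?, ?)",
                 (text, source, created, json.dumps(toy_embed(text))))

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.hypot(*a), math.hypot(*b)
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query: str, k: int, source: str, after: str) -> list[str]:
    # Structured filters are pushed into SQL; only survivors are ranked.
    cur = conn.execute(
        "SELECT text, embedding FROM docs WHERE source = ? AND created >= ?",
        (source, after))
    q = toy_embed(query)
    scored = [(cosine(q, json.loads(emb)), text) for text, emb in cur]
    return [t for _, t in sorted(scored, reverse=True)[:k]]

# The finance row fails the source filter; the 2024 row fails the date filter.
print(hybrid_search("How does billing work?", k=2, source="docs", after="2025-01-01"))
```

Pushing the predicates into SQL before ranking is what keeps the semantic search cheap when the table is large: the engine never scores rows the filter has already excluded.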

Using with Retrieval Chains

python
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Create retrieval chain with HatiData as the knowledge base
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o"),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 5},
    ),
)

answer = qa_chain.run("What merge strategies are available for branches?")

HatiDataToolkit: Direct Agent Access

The HatiDataToolkit provides 4 LangChain tools that give agents direct access to HatiData operations. Unlike the Memory and VectorStore classes (which are used by chains), the Toolkit is used by agents for explicit, self-directed data operations.

Tool 1: query

Executes a SQL query against the agent's data warehouse and returns results as structured data.

python
from langchain_hatidata import HatiDataToolkit
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_openai import ChatOpenAI

toolkit = HatiDataToolkit(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_your_key",
)

tools = toolkit.get_tools()
# Returns: [query_tool, list_tables_tool, describe_tool, context_search_tool]

from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o")

# create_openai_tools_agent requires a prompt with an agent_scratchpad slot
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a data analyst with access to HatiData tools."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke({
    "input": "How many orders were placed last week by enterprise customers?"
})

Tool 2: list_tables

Returns all tables in the agent's accessible namespaces with row counts and column summaries. The agent uses this to discover what data is available before writing queries.

Tool 3: describe

Returns the full schema for a specific table — column names, types, and sample values. The agent uses this to understand table structure before composing queries.

Tool 4: context_search

Performs semantic search across table descriptions and column names. This is useful when the agent knows what kind of data it needs but does not know which table contains it.

python
# Agent can search for relevant tables by concept
result = executor.invoke({
    "input": "Find data related to customer churn signals"
})
# Agent uses context_search to find tables, then query to retrieve data
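A toy version of this discover-then-query flow can be written as keyword-overlap scoring over table descriptions. The catalog contents and the scoring function here are invented for illustration; HatiData's real context_search uses embeddings over table and column metadata:

```python
# Toy context_search: rank tables by keyword overlap with a concept query.
CATALOG = {
    "orders":        "customer orders with totals, status, and dates",
    "support_cases": "support tickets, response times, customer complaints",
    "logins":        "daily login events per customer account",
}

def context_search(query: str, k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = [(len(q & set(desc.lower().split())), name)
              for name, desc in CATALOG.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

# Churn signals live in support load and login activity, not order totals.
print(context_search("customer complaints login"))
```

Once the agent has candidate table names, it typically calls describe on each to learn the schema and then composes a query, so the three tools chain naturally: discover, inspect, retrieve.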

Replacing ConversationBufferMemory

The most common migration path is replacing ConversationBufferMemory with HatiDataMemory. Here is a before/after comparison:

Before (ephemeral):

python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(return_messages=True)
# Problem: Memory lost on restart, grows without bound, no search

After (persistent):

python
from langchain_hatidata import HatiDataMemory

memory = HatiDataMemory(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_your_key",
    namespace="customer-support",
    return_messages=True,
    k=10,
)
# Memory persists, scales to millions, semantic search, namespace isolated

HatiDataMemory is API-compatible with ConversationBufferMemory — any chain that accepts a BaseMemory instance works without modification.

Multi-Agent Memory Sharing

LangChain agents can share memory through HatiData by using the same namespace:

python
# Agent 1: Customer research
research_memory = HatiDataMemory(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_research_key",
    namespace="shared/customer-intelligence",
    k=5,
)

# Agent 2: Account management
account_memory = HatiDataMemory(
    hatidata_url="http://localhost:5439",
    api_key="hd_live_account_key",
    namespace="shared/customer-intelligence",
    k=5,
)

# Research agent stores insight
research_chain.predict(input="Analyze sentiment for Acme Corp interactions")
# Stores: "Acme Corp showing signs of dissatisfaction with response times"

# Account agent retrieves research agent's insights
account_chain.predict(input="Prepare for my meeting with Acme Corp")
# Retrieves: "Acme Corp showing signs of dissatisfaction with response times"

Both agents read from and write to the same namespace, creating a shared knowledge pool. Namespace-level access controls ensure that agents in other namespaces cannot see this shared data.
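The sharing-and-isolation behavior can be sketched as a namespace-keyed store: every agent holding the same namespace reads the same partition, while other namespaces see nothing. This is a toy model of the semantics, not HatiData's access-control implementation:

```python
from collections import defaultdict

# One shared backing store, partitioned by namespace.
STORE = defaultdict(list)

class ToyNamespacedMemory:
    def __init__(self, namespace: str):
        self.namespace = namespace

    def save(self, memory: str) -> None:
        STORE[self.namespace].append(memory)

    def load(self) -> list[str]:
        # An agent only ever sees its own namespace's partition.
        return list(STORE[self.namespace])

research = ToyNamespacedMemory("shared/customer-intelligence")
account = ToyNamespacedMemory("shared/customer-intelligence")
billing = ToyNamespacedMemory("internal/billing")

research.save("Acme Corp showing signs of dissatisfaction with response times")

print(account.load())  # same namespace: sees the research agent's insight
print(billing.load())  # different namespace: sees nothing
```

In the real system the partition key is enforced server-side per API key, so an agent cannot simply name another namespace to read it; the toy above only models the read/write semantics.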

Next Steps

The LangChain integration covers memory, vector search, and direct data access. For multi-agent coordination patterns, see the CrewAI shared memory cookbook. For chain-of-thought logging with LangChain callbacks, see the LangChain CoT replay cookbook. For production deployment with access controls and encryption, see the governance guide.
