LlamaIndex + HatiData: Structured RAG
Build a RAG pipeline with LlamaIndex that combines structured SQL queries with semantic vector search through HatiData.
What You'll Build
A LlamaIndex ReAct agent with FunctionTools that run HatiData SQL queries and rank results with semantic_rank() relevance scoring.
Prerequisites
$ pip install hatidata-agent llama-index
$ hati init
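Optionally, confirm the server is reachable before continuing. The snippet below is a minimal sketch that assumes a local HatiData instance on port 5439, matching the connection settings used throughout this guide.
from hatidata_agent import HatiDataAgent

# Assumes HatiData is running locally on port 5439, as configured in the steps below.
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
print(client.query("SELECT 1 AS ok"))  # simple round trip to confirm connectivity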
Architecture
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  LlamaIndex  │───▶│   HatiData   │───▶│  Structured  │
│ ReAct Agent  │    │  MCP Server  │    │ + Vector Data│
└──────────────┘    └──────────────┘    └──────────────┘
Key Concepts
- Structured RAG combines SQL pre-filtering with semantic vector search; unlike pure vector search, you get precision AND relevance in a single query
- JOIN_VECTOR(table, column, query, top_k) enables hybrid retrieval by joining structured tables with vector similarity results inline (see the example query after this list)
- semantic_rank(embedding, query) returns a 0-to-1 relevance score that can be used for ordering, thresholding, and explaining retrieval decisions to users
- SQL WHERE clauses act as a pre-filter before semantic ranking, dramatically reducing noise compared to vector-only retrieval
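To make these concepts concrete, the sketch below shows the shape of a single hybrid query, assuming the documents table and client created in the implementation steps that follow; the query text 'revenue growth' is just an example.
# One hybrid statement: SQL pre-filter + JOIN_VECTOR candidates + semantic_rank() scoring.
rows = client.query("""
    SELECT d.doc_id, d.title,
           semantic_rank(d.embedding, 'revenue growth') AS relevance
    FROM documents d
    JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
        ON d.doc_id = v.doc_id
    WHERE d.category = 'finance'   -- structured pre-filter
    ORDER BY relevance DESC        -- semantic ordering
""")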
Step-by-Step Implementation
Install Dependencies
Install the LlamaIndex framework and HatiData Python packages.
pip install llama-index llama-index-core llama-index-llms-openai hatidata-agent psycopg2-binary
Note: LlamaIndex requires Python 3.9+. The hatidata-agent package includes the MCP client and SQL driver.
Configure HatiData with Document Schema
Create a structured table for documents with embeddings, then load sample data for the RAG pipeline.
from hatidata_agent import HatiDataAgent
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
# Create a documents table with structured metadata + embeddings
client.execute("""
CREATE TABLE IF NOT EXISTS documents (
doc_id VARCHAR PRIMARY KEY,
title VARCHAR NOT NULL,
content TEXT NOT NULL,
category VARCHAR,
author VARCHAR,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
embedding FLOAT[384]
)
""")
# Insert sample documents
client.execute("""
INSERT INTO documents (doc_id, title, content, category, author) VALUES
('doc_001', 'Q3 Revenue Report',
'Total revenue reached $4.2M in Q3, driven by enterprise contracts. APAC grew 34% QoQ.',
'finance', 'CFO'),
('doc_002', 'Security Incident Postmortem',
'On Sept 15, a misconfigured S3 bucket exposed staging logs for 2 hours. No customer data affected.',
'security', 'CISO'),
('doc_003', 'Product Roadmap 2025',
'Key initiatives: MCP-native integrations, branch isolation GA, and Arrow query endpoints.',
'product', 'VP Product'),
('doc_004', 'Customer Churn Analysis',
'Churn rate dropped to 3.1% after adding persistent memory features. Top retention driver: semantic search.',
'analytics', 'Head of CS'),
('doc_005', 'Compliance Audit Findings',
'SOC 2 Type II audit completed. Zero critical findings. Two minor observations on key rotation frequency.',
'compliance', 'GRC Lead')
""")
print("Schema created and sample data loaded.")Schema created and sample data loaded.Create HatiData Query Tools for LlamaIndex
Create HatiData Query Tools for LlamaIndex
Wrap HatiData SQL queries and semantic_rank() as LlamaIndex FunctionTools that the ReAct agent can invoke.
from llama_index.core.tools import FunctionTool
from hatidata_agent import HatiDataAgent
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
def search_documents(query: str, category: str = None, top_k: int = 5) -> str:
    """Search documents using hybrid SQL + vector retrieval.
    Combines structured SQL filters with semantic ranking."""
    # Escape single quotes so the inline SQL literal stays valid;
    # prefer parameterized queries for untrusted input in production.
    safe_query = query.replace("'", "''")
    where_clause = f"WHERE d.category = '{category}'" if category else ""
    results = client.query(f"""
        SELECT d.doc_id, d.title, d.content, d.category,
               semantic_rank(d.embedding, '{safe_query}') AS relevance
        FROM documents d
        JOIN_VECTOR(documents, embedding, '{safe_query}', {top_k}) v
            ON d.doc_id = v.doc_id
        {where_clause}
        ORDER BY relevance DESC
        LIMIT {top_k}
    """)
    return "\n".join([
        f"[{r['relevance']:.3f}] {r['title']}: {r['content']}"
        for r in results
    ])

def sql_analytics(sql_query: str) -> str:
    """Run a structured SQL query against HatiData for precise analytics.
    Use for aggregations, counts, and filtered lookups."""
    results = client.query(sql_query)
    return str(results)
# Wrap as LlamaIndex FunctionTools
search_tool = FunctionTool.from_defaults(
fn=search_documents,
name="search_documents",
description=(
"Search documents using hybrid SQL + semantic vector search. "
"Supports optional category filtering and relevance ranking."
),
)
analytics_tool = FunctionTool.from_defaults(
fn=sql_analytics,
name="sql_analytics",
description=(
"Run structured SQL queries for precise analytics: counts, "
"aggregations, JOINs, and filtered lookups on HatiData tables."
),
)
print("Tools created: search_documents, sql_analytics")Tools created: search_documents, sql_analyticsBuild the ReAct Agent
Build the ReAct Agent
Create a LlamaIndex ReActAgent with HatiData tools and a system prompt optimized for structured RAG workflows.
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o", temperature=0)
agent = ReActAgent.from_tools(
tools=[search_tool, analytics_tool],
llm=llm,
verbose=True,
system_prompt="""You are a knowledge assistant with access to a HatiData document store.
You have two tools:
1. search_documents: semantic search with relevance ranking. Use when the user
asks vague or conceptual questions.
2. sql_analytics: precise SQL queries. Use when the user asks for specific
numbers, counts, or aggregations.
Strategy:
- For factual lookups, prefer sql_analytics with exact SQL.
- For conceptual questions, prefer search_documents with semantic matching.
- For complex questions, combine both: filter with SQL, rank with semantics.
- Always cite the document title and relevance score in your answers.
""",
)
print("ReAct agent ready with HatiData tools.")ReAct agent ready with HatiData tools.Run Structured Queries with Semantic Ranking
Run Structured Queries with Semantic Ranking
Ask the agent questions that demonstrate hybrid retrieval: SQL filters combined with JOIN_VECTOR semantic relevance.
# Hybrid query: SQL filter + semantic ranking
response = agent.chat(
"What security incidents have we had, and what was the impact?"
)
print(response)
# Structured query: precise SQL analytics
response = agent.chat(
"How many documents do we have in each category?"
)
print(response)
# Complex hybrid: filter by category, then rank semantically
response = agent.chat(
"Find finance-related documents about revenue growth in APAC."
)
print(response)
Thought: The user is asking about security incidents. This is a conceptual
question, so I will use search_documents with a relevant query.
Action: search_documents
Action Input: {"query": "security incident impact", "category": "security"}
Observation: [0.891] Security Incident Postmortem: On Sept 15, a misconfigured
S3 bucket exposed staging logs for 2 hours. No customer data affected.
Answer: We had one security incident: a misconfigured S3 bucket on September
15th that exposed staging logs for 2 hours. The impact was limited and no
customer data was affected. (Relevance: 0.891)
Note: The ReAct agent automatically chooses between semantic search and SQL analytics based on the question type. Hybrid queries combine SQL WHERE clauses with semantic_rank() scoring.
Compare with Pure Vector Search
Demonstrate the advantage of structured RAG (SQL filters + semantic relevance) over pure vector search alone.
# Pure vector search: returns results from ALL categories
pure_vector = client.query("""
SELECT d.title, d.category,
semantic_rank(d.embedding, 'revenue growth') AS relevance
FROM documents d
JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
ON d.doc_id = v.doc_id
ORDER BY relevance DESC
""")
print("=== Pure Vector Search: 'revenue growth' ===")
for r in pure_vector:
    print(f" [{r['relevance']:.3f}] [{r['category']}] {r['title']}")
# Structured RAG: SQL pre-filter THEN semantic rank
structured_rag = client.query("""
SELECT d.title, d.category,
semantic_rank(d.embedding, 'revenue growth') AS relevance
FROM documents d
JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
ON d.doc_id = v.doc_id
WHERE d.category = 'finance'
ORDER BY relevance DESC
""")
print("\n=== Structured RAG: 'revenue growth' + category=finance ===")
for r in structured_rag:
    print(f" [{r['relevance']:.3f}] [{r['category']}] {r['title']}")
print("\nStructured RAG eliminates irrelevant results by combining")
print("SQL filters with semantic ranking: fewer results, higher precision.")=== Pure Vector Search: 'revenue growth' ===
[0.847] [finance] Q3 Revenue Report
[0.612] [analytics] Customer Churn Analysis
[0.534] [product] Product Roadmap 2025
[0.401] [compliance] Compliance Audit Findings
[0.312] [security] Security Incident Postmortem
=== Structured RAG: 'revenue growth' + category=finance ===
[0.847] [finance] Q3 Revenue Report
Structured RAG eliminates irrelevant results by combining
SQL filters with semantic ranking: fewer results, higher precision.
Note: Pure vector search returns results from every category, including irrelevant ones. Structured RAG uses SQL WHERE clauses to pre-filter, then semantic_rank() to score. You get precision AND relevance in one query.
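The relevance score can also serve as a cutoff, as noted under Key Concepts. The sketch below repeats the semantic_rank() expression in the WHERE clause rather than referencing the column alias, and the 0.5 threshold is purely illustrative.
# Optional: drop weak matches entirely by thresholding on the relevance score.
thresholded = client.query("""
    SELECT d.title, d.category,
           semantic_rank(d.embedding, 'revenue growth') AS relevance
    FROM documents d
    JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
        ON d.doc_id = v.doc_id
    WHERE semantic_rank(d.embedding, 'revenue growth') > 0.5  -- illustrative cutoff
    ORDER BY relevance DESC
""")
for r in thresholded:
    print(f"[{r['relevance']:.3f}] [{r['category']}] {r['title']}")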
Related Use Case
Agents That Remember Every Customer (Operations, Customer Support)