LlamaIndex + HatiData: Structured RAG
Build a RAG pipeline with LlamaIndex that combines structured SQL queries with semantic vector search through HatiData.
What You'll Build
A LlamaIndex ReAct agent with FunctionTools that run HatiData SQL queries and rank results with semantic_rank() relevance scoring.
Prerequisites
$ pip install hatidata-agent llama-index
$ hati init
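Optionally, confirm the server is reachable before continuing. The snippet below is a minimal sketch that assumes a local HatiData instance on port 5439, matching the connection settings used throughout this guide.
from hatidata_agent import HatiDataAgent

# Assumes HatiData is running locally on port 5439, as configured in the steps below.
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
print(client.query("SELECT 1 AS ok"))  # simple round trip to confirm connectivity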
Architecture
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  LlamaIndex  │───▶│   HatiData   │───▶│  Structured  │
│ ReAct Agent  │    │  MCP Server  │    │ + Vector Data│
└──────────────┘    └──────────────┘    └──────────────┘
Key Concepts
- Structured RAG combines SQL pre-filtering with semantic vector search; unlike pure vector search, you get precision AND relevance in a single query
- JOIN_VECTOR(table, column, query, top_k) enables hybrid retrieval by joining structured tables with vector similarity results inline (see the example query after this list)
- semantic_rank(embedding, query) returns a 0-to-1 relevance score that can be used for ordering, thresholding, and explaining retrieval decisions to users
- SQL WHERE clauses act as a pre-filter before semantic ranking, dramatically reducing noise compared to vector-only retrieval
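To make these concepts concrete, the sketch below shows the shape of a single hybrid query, assuming the documents table and client created in the implementation steps that follow; the query text 'revenue growth' is just an example.
# One hybrid statement: SQL pre-filter + JOIN_VECTOR candidates + semantic_rank() scoring.
rows = client.query("""
    SELECT d.doc_id, d.title,
           semantic_rank(d.embedding, 'revenue growth') AS relevance
    FROM documents d
    JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
        ON d.doc_id = v.doc_id
    WHERE d.category = 'finance'   -- structured pre-filter
    ORDER BY relevance DESC        -- semantic ordering
""")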
Step-by-Step Implementation
Install Dependencies
Install the LlamaIndex framework and HatiData Python packages.
pip install llama-index llama-index-core llama-index-llms-openai hatidata-agent psycopg2-binary
Note: LlamaIndex requires Python 3.9+. The hatidata-agent package includes the MCP client and SQL driver.
Configure HatiData with Document Schema
Create a structured table for documents with embeddings, then load sample data for the RAG pipeline.
from hatidata_agent import HatiDataAgent
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
# Create a documents table with structured metadata + embeddings
client.execute("""
CREATE TABLE IF NOT EXISTS documents (
doc_id VARCHAR PRIMARY KEY,
title VARCHAR NOT NULL,
content TEXT NOT NULL,
category VARCHAR,
author VARCHAR,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
embedding FLOAT[384]
)
""")
# Insert sample documents
client.execute("""
INSERT INTO documents (doc_id, title, content, category, author) VALUES
('doc_001', 'Q3 Revenue Report',
'Total revenue reached $4.2M in Q3, driven by enterprise contracts. APAC grew 34% QoQ.',
'finance', 'CFO'),
('doc_002', 'Security Incident Postmortem',
'On Sept 15, a misconfigured S3 bucket exposed staging logs for 2 hours. No customer data affected.',
'security', 'CISO'),
('doc_003', 'Product Roadmap 2025',
'Key initiatives: MCP-native integrations, branch isolation GA, and Arrow query endpoints.',
'product', 'VP Product'),
('doc_004', 'Customer Churn Analysis',
'Churn rate dropped to 3.1% after adding persistent memory features. Top retention driver: semantic search.',
'analytics', 'Head of CS'),
('doc_005', 'Compliance Audit Findings',
'SOC 2 Type II audit completed. Zero critical findings. Two minor observations on key rotation frequency.',
'compliance', 'GRC Lead')
""")
print("Schema created and sample data loaded.")Schema created and sample data loaded.Create HatiData Query Tools for LlamaIndex
Create HatiData Query Tools for LlamaIndex
Wrap HatiData SQL queries and semantic_rank() as LlamaIndex FunctionTools that the ReAct agent can invoke.
from llama_index.core.tools import FunctionTool
from hatidata_agent import HatiDataAgent
client = HatiDataAgent(host="localhost", port=5439, agent_id="llamaindex-rag", framework="llamaindex")
def search_documents(query: str, category: str = None, top_k: int = 5) -> str:
    """Search documents using hybrid SQL + vector retrieval.
    Combines structured SQL filters with semantic ranking."""
    # Escape single quotes so the inline SQL literal stays valid;
    # prefer parameterized queries for untrusted input in production.
    safe_query = query.replace("'", "''")
    where_clause = f"WHERE d.category = '{category}'" if category else ""
    results = client.query(f"""
        SELECT d.doc_id, d.title, d.content, d.category,
               semantic_rank(d.embedding, '{safe_query}') AS relevance
        FROM documents d
        JOIN_VECTOR(documents, embedding, '{safe_query}', {top_k}) v
            ON d.doc_id = v.doc_id
        {where_clause}
        ORDER BY relevance DESC
        LIMIT {top_k}
    """)
    return "\n".join([
        f"[{r['relevance']:.3f}] {r['title']}: {r['content']}"
        for r in results
    ])

def sql_analytics(sql_query: str) -> str:
    """Run a structured SQL query against HatiData for precise analytics.
    Use for aggregations, counts, and filtered lookups."""
    results = client.query(sql_query)
    return str(results)
# Wrap as LlamaIndex FunctionTools
search_tool = FunctionTool.from_defaults(
fn=search_documents,
name="search_documents",
description=(
"Search documents using hybrid SQL + semantic vector search. "
"Supports optional category filtering and relevance ranking."
),
)
analytics_tool = FunctionTool.from_defaults(
fn=sql_analytics,
name="sql_analytics",
description=(
"Run structured SQL queries for precise analytics: counts, "
"aggregations, JOINs, and filtered lookups on HatiData tables."
),
)
print("Tools created: search_documents, sql_analytics")Tools created: search_documents, sql_analyticsBuild the ReAct Agent
Build the ReAct Agent
Create a LlamaIndex ReActAgent with HatiData tools and a system prompt optimized for structured RAG workflows.
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o", temperature=0)
agent = ReActAgent.from_tools(
tools=[search_tool, analytics_tool],
llm=llm,
verbose=True,
system_prompt="""You are a knowledge assistant with access to a HatiData document store.
You have two tools:
1. search_documents: semantic search with relevance ranking. Use when the user
asks vague or conceptual questions.
2. sql_analytics: precise SQL queries. Use when the user asks for specific
numbers, counts, or aggregations.
Strategy:
- For factual lookups, prefer sql_analytics with exact SQL.
- For conceptual questions, prefer search_documents with semantic matching.
- For complex questions, combine both: filter with SQL, rank with semantics.
- Always cite the document title and relevance score in your answers.
""",
)
print("ReAct agent ready with HatiData tools.")ReAct agent ready with HatiData tools.Run Structured Queries with Semantic Ranking
Run Structured Queries with Semantic Ranking
Ask the agent questions that demonstrate hybrid retrieval: SQL filters combined with JOIN_VECTOR semantic relevance.
# Hybrid query: SQL filter + semantic ranking
response = agent.chat(
"What security incidents have we had, and what was the impact?"
)
print(response)
# Structured query: precise SQL analytics
response = agent.chat(
"How many documents do we have in each category?"
)
print(response)
# Complex hybrid: filter by category, then rank semantically
response = agent.chat(
"Find finance-related documents about revenue growth in APAC."
)
print(response)
Thought: The user is asking about security incidents. This is a conceptual
question, so I will use search_documents with a relevant query.
Action: search_documents
Action Input: {"query": "security incident impact", "category": "security"}
Observation: [0.891] Security Incident Postmortem: On Sept 15, a misconfigured
S3 bucket exposed staging logs for 2 hours. No customer data affected.
Answer: We had one security incident: a misconfigured S3 bucket on September
15th that exposed staging logs for 2 hours. The impact was limited and no
customer data was affected. (Relevance: 0.891)
Note: The ReAct agent automatically chooses between semantic search and SQL analytics based on the question type. Hybrid queries combine SQL WHERE clauses with semantic_rank() scoring.
Compare with Pure Vector Search
Demonstrate the advantage of structured RAG (SQL filters + semantic relevance) over pure vector search alone.
# Pure vector search: returns results from ALL categories
pure_vector = client.query("""
SELECT d.title, d.category,
semantic_rank(d.embedding, 'revenue growth') AS relevance
FROM documents d
JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
ON d.doc_id = v.doc_id
ORDER BY relevance DESC
""")
print("=== Pure Vector Search: 'revenue growth' ===")
for r in pure_vector:
    print(f" [{r['relevance']:.3f}] [{r['category']}] {r['title']}")
# Structured RAG: SQL pre-filter THEN semantic rank
structured_rag = client.query("""
SELECT d.title, d.category,
semantic_rank(d.embedding, 'revenue growth') AS relevance
FROM documents d
JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
ON d.doc_id = v.doc_id
WHERE d.category = 'finance'
ORDER BY relevance DESC
""")
print("\n=== Structured RAG: 'revenue growth' + category=finance ===")
for r in structured_rag:
    print(f" [{r['relevance']:.3f}] [{r['category']}] {r['title']}")
print("\nStructured RAG eliminates irrelevant results by combining")
print("SQL filters with semantic ranking: fewer results, higher precision.")=== Pure Vector Search: 'revenue growth' ===
[0.847] [finance] Q3 Revenue Report
[0.612] [analytics] Customer Churn Analysis
[0.534] [product] Product Roadmap 2025
[0.401] [compliance] Compliance Audit Findings
[0.312] [security] Security Incident Postmortem
=== Structured RAG: 'revenue growth' + category=finance ===
[0.847] [finance] Q3 Revenue Report
Structured RAG eliminates irrelevant results by combining
SQL filters with semantic ranking: fewer results, higher precision.
Note: Pure vector search returns results from every category, including irrelevant ones. Structured RAG uses SQL WHERE clauses to pre-filter, then semantic_rank() to score. You get precision AND relevance in one query.
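The relevance score can also serve as a cutoff, as noted under Key Concepts. The sketch below repeats the semantic_rank() expression in the WHERE clause rather than referencing the column alias, and the 0.5 threshold is purely illustrative.
# Optional: drop weak matches entirely by thresholding on the relevance score.
thresholded = client.query("""
    SELECT d.title, d.category,
           semantic_rank(d.embedding, 'revenue growth') AS relevance
    FROM documents d
    JOIN_VECTOR(documents, embedding, 'revenue growth', 5) v
        ON d.doc_id = v.doc_id
    WHERE semantic_rank(d.embedding, 'revenue growth') > 0.5  -- illustrative cutoff
    ORDER BY relevance DESC
""")
for r in thresholded:
    print(f"[{r['relevance']:.3f}] [{r['category']}] {r['title']}")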
Related Use Case
Agents That Remember Every Customer (Operations, Customer Support)