Ollama + HatiData: Air-Gapped Local AI
Run AI agents completely locally with Ollama and HatiData. Zero cloud dependencies.
What You'll Build
A fully local AI agent with Ollama for inference and HatiData for persistent memory and SQL queries.
Prerequisites
$ pip install hatidata-agent ollama
$ hati init
$ ollama pull llama3
Architecture
┌──────────────┐      ┌──────────────┐
│    Ollama    │─────▶│   HatiData   │
│ (Local LLM)  │      │  (Local DB)  │
└──────────────┘      └──────────────┘
100% Local — No Cloud Required
Key Concepts
- Zero cloud dependencies: Ollama runs LLM inference locally, HatiData stores data in a local DuckDB file, and the built-in ONNX model handles embeddings — nothing touches the network
- Air-gapped operation: the entire stack works without internet access, making it suitable for classified environments, edge deployments, and data-sovereign workloads
- Local-first data sovereignty: all agent memories, embeddings, and query results stay on your machine in a single DuckDB file that you fully control (the sketch after this list shows how to inspect it directly)
- Ollama + DuckDB performance: local LLM inference avoids API latency while DuckDB provides sub-millisecond SQL queries — often faster than cloud round-trips
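To see for yourself that everything lives in one local file, you can open the database directly with the duckdb Python package. This is a minimal sketch, not part of the HatiData API; it assumes the default ./hatidata/data.duckdb path reported by hati init, and the table layout inside the file may vary between versions.
import os
import duckdb

# Path reported by `hati init`; adjust if you configured a different storage location
DB_PATH = "./hatidata/data.duckdb"
print("Database size on disk:", os.path.getsize(DB_PATH), "bytes")

# Open read-only so we don't interfere with a running HatiData process
con = duckdb.connect(DB_PATH, read_only=True)

# Everything the agent knows is inside this one file; list its tables
tables = con.execute(
    "SELECT table_schema, table_name FROM information_schema.tables"
).fetchall()
for schema, name in tables:
    print(f"  {schema}.{name}")
con.close()
If the listing includes the _hatidata_memory schema used in the queries later in this guide, you are looking at the same data the agent reads and writes.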
Step-by-Step Implementation
Install Ollama and HatiData
Install both Ollama for local LLM inference and HatiData for persistent agent memory. Pull a model and initialize the database.
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh
# Pull a local model
ollama pull llama3
# Install HatiData agent library
pip install hatidata-agent ollama
# Initialize the local HatiData database
hati init
pulling llama3... done
HatiData initialized at ./hatidata/
Storage: ./hatidata/data.duckdb
Config: ./.hati/config.toml
MCP: ready on port 8741
Note: Ollama runs entirely on your machine with no API keys. HatiData stores everything in a local DuckDB file. Zero cloud dependencies from the start.
Configure for Fully Local Operation
Create a HatiData configuration file that explicitly disables all cloud features for air-gapped environments.
# .hati/config.toml — fully local configuration
# No cloud endpoints, no telemetry, no external calls
[storage]
path = "./agent_data"
[memory]
default_namespace = "local_agent"
embedding_dimensions = 384
embedding_provider = "local" # uses built-in ONNX model
[proxy]
port = 5439
host = "127.0.0.1"
[cloud]
enabled = false
telemetry = false
Note: Setting embedding_provider to 'local' uses the built-in ONNX embedding model. No OpenAI or Cohere API calls. The entire stack runs offline.
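Before shipping this configuration to an air-gapped host, it is worth asserting programmatically that nothing cloud-facing is still enabled. A minimal sketch using Python's standard tomllib (3.11+; use the third-party tomli package on older interpreters); it only checks the keys shown in the file above.
import tomllib  # standard library in Python 3.11+

with open(".hati/config.toml", "rb") as f:
    config = tomllib.load(f)

# Fail fast if any cloud-facing option is still switched on
assert config["cloud"]["enabled"] is False, "cloud access must be disabled"
assert config["cloud"]["telemetry"] is False, "telemetry must be disabled"
assert config["memory"]["embedding_provider"] == "local", "embeddings must stay local"

print("Config verified: fully local operation")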
Build the Local Agent
Create an agent that uses Ollama for LLM inference and HatiData MCP tools for persistent memory, running entirely on your machine.
import ollama
from hatidata_agent import HatiDataAgent

# Connect to local HatiData
hati = HatiDataAgent(host="localhost", port=5439, agent_id="local-agent", framework="ollama")

# Agent system prompt
SYSTEM_PROMPT = """You are a helpful local AI assistant with persistent memory.
You can remember information across conversations.
When the user shares important facts, store them in memory.
When answering questions, check your memory first for relevant context.
You run 100% locally — no data ever leaves this machine."""

def store_memory(content: str, tags: list[str]) -> str:
    """Store a memory in HatiData."""
    hati.execute(
        "SELECT store_memory(?, ?, 'local_agent')",
        [content, ",".join(tags)]
    )
    return f"Stored: {content}"

def search_memory(query: str, top_k: int = 5) -> list[dict]:
    """Search memories using semantic similarity."""
    results = hati.query(f"""
        SELECT content, tags,
               semantic_rank(embedding, '{query}') AS relevance
        FROM _hatidata_memory.memories
        WHERE namespace = 'local_agent'
        ORDER BY relevance DESC
        LIMIT {top_k}
    """)
    return results

def chat(user_message: str) -> str:
    """Run one turn of conversation with memory."""
    # Check memory for relevant context
    memories = search_memory(user_message, top_k=3)
    context = ""
    if memories and memories[0]["relevance"] > 0.5:
        context = "\nRelevant memories:\n" + "\n".join(
            f"- {m['content']} (relevance: {m['relevance']:.2f})"
            for m in memories
        )

    # Call Ollama locally
    response = ollama.chat(
        model="llama3",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT + context},
            {"role": "user", "content": user_message},
        ],
    )
    reply = response["message"]["content"]

    # Store important facts from the conversation
    if any(kw in user_message.lower() for kw in ["my name", "i prefer", "remember"]):
        store_memory(user_message, ["user_fact", "conversation"])

    return reply

# Test the local agent
print(chat("My name is Marcus and I work on embedded systems."))
print(chat("What do you know about me?"))
I'll remember that, Marcus! I've noted that you work on embedded systems.
Since we're running entirely locally, your data stays on this machine.
Based on my memory, your name is Marcus and you work on embedded systems.
This was stored from our earlier conversation. (relevance: 0.94)
Note: The ollama.chat() call runs locally on your GPU or CPU. The HatiDataAgent connects to the local DuckDB instance. Nothing touches the network.
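Local models can take several seconds to produce a full reply, so streaming tokens as they are generated often improves the experience. The sketch below is a variation of the chat() function above, reusing its SYSTEM_PROMPT; it assumes the stream=True option of the ollama Python client and leaves the memory handling unchanged.
import ollama

def chat_streaming(user_message: str) -> str:
    """Same turn as chat(), but prints tokens as Ollama generates them."""
    reply_parts = []
    stream = ollama.chat(
        model="llama3",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        stream=True,  # yield partial responses instead of one final message
    )
    for chunk in stream:
        token = chunk["message"]["content"]
        print(token, end="", flush=True)
        reply_parts.append(token)
    print()
    return "".join(reply_parts)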
Test Offline Memory Persistence
Verify that memories persist across agent restarts without any internet connection. Simulate an air-gapped environment.
# Simulate restart: create a brand new client connection
from hatidata_agent import HatiDataAgent

# New connection — simulates agent restart
client = HatiDataAgent(host="localhost", port=5439, agent_id="local-agent", framework="ollama")

# Verify memories survived the restart
memories = client.query("""
    SELECT memory_id, content, tags, created_at
    FROM _hatidata_memory.memories
    WHERE namespace = 'local_agent'
    ORDER BY created_at DESC
    LIMIT 10
""")

print(f"Found {len(memories)} persisted memories:\n")
for m in memories:
    print(f"  [{m['created_at']}] {m['content']}")
    print(f"    Tags: {m['tags']}")

# Semantic search still works without internet
relevant = client.query("""
    SELECT content,
           semantic_rank(embedding, 'what does the user work on') AS score
    FROM _hatidata_memory.memories
    WHERE namespace = 'local_agent'
    ORDER BY score DESC
    LIMIT 3
""")

print("\nSemantic search (offline):")
for r in relevant:
    print(f"  [{r['score']:.3f}] {r['content']}")
Found 2 persisted memories:
[2025-01-15 14:22:01] My name is Marcus and I work on embedded systems.
Tags: ['user_fact', 'conversation']
[2025-01-15 14:21:58] What do you know about me?
Tags: ['user_fact', 'conversation']
Semantic search (offline):
[0.891] My name is Marcus and I work on embedded systems.
[0.234] What do you know about me?
Note: Memories persist in the local DuckDB file. The built-in ONNX embedding model runs semantic search without any network calls. Disconnect your WiFi to verify.
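If physically disconnecting the machine is not practical, you can approximate the same check in-process by refusing any socket connection that leaves the loopback interface. The guard below is a hypothetical test helper, not part of HatiData or Ollama; it assumes both services listen on localhost, which is the configuration used throughout this guide.
import socket

_real_connect = socket.socket.connect

def _loopback_only(self, address):
    """Reject any connection that is not to the local machine."""
    # Unix domain sockets pass a plain path and are local by definition
    if isinstance(address, tuple):
        host = address[0]
        if host not in ("localhost", "127.0.0.1", "::1"):
            raise ConnectionError(f"Blocked outbound connection to {host}")
    return _real_connect(self, address)

socket.socket.connect = _loopback_only

# With the guard in place, the agent should behave exactly as before:
# HatiData (port 5439) and Ollama (default port 11434) are both on loopback.
from hatidata_agent import HatiDataAgent

client = HatiDataAgent(host="localhost", port=5439, agent_id="local-agent", framework="ollama")
rows = client.query("SELECT COUNT(*) AS n FROM _hatidata_memory.memories")
print("Memories reachable with outbound traffic blocked:", rows[0]["n"])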
Run SQL Queries on Local Data
Connect directly to HatiData on port 5439 and run SQL analytics on all stored agent data.
from hatidata_agent import HatiDataAgent

client = HatiDataAgent(host="localhost", port=5439, agent_id="local-agent", framework="ollama")

# Count memories by namespace
stats = client.query("""
    SELECT namespace, COUNT(*) AS memory_count,
           MIN(created_at) AS first_memory,
           MAX(created_at) AS last_memory
    FROM _hatidata_memory.memories
    GROUP BY namespace
    ORDER BY memory_count DESC
""")

print("=== Memory Statistics ===")
for s in stats:
    print(f"  {s['namespace']}: {s['memory_count']} memories")
    print(f"    First: {s['first_memory']}")
    print(f"    Last: {s['last_memory']}")

# Run analytics on stored data
tag_analysis = client.query("""
    SELECT UNNEST(string_split(tags, ',')) AS tag,
           COUNT(*) AS count
    FROM _hatidata_memory.memories
    GROUP BY tag
    ORDER BY count DESC
    LIMIT 10
""")

print("\n=== Tag Distribution ===")
for t in tag_analysis:
    print(f"  {t['tag']}: {t['count']} occurrences")

# Export all memories as JSON for backup
client.execute("""
    COPY (
        SELECT memory_id, namespace, content, tags, created_at
        FROM _hatidata_memory.memories
    ) TO './backup_memories.json' (FORMAT JSON)
""")
print("\nBackup exported to ./backup_memories.json")
=== Memory Statistics ===
local_agent: 2 memories
First: 2025-01-15 14:21:58
Last: 2025-01-15 14:22:01
=== Tag Distribution ===
user_fact: 2 occurrences
conversation: 2 occurrences
Backup exported to ./backup_memories.json
Note: HatiData exposes all data as standard SQL tables on port 5439. You can connect with psql, DBeaver, or any Postgres-compatible tool for ad-hoc queries and exports.
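Because the proxy speaks the same wire protocol those tools use, any Postgres client library can run the queries above as well. A minimal sketch with psycopg2; the dbname and user values are placeholders (this guide does not cover HatiData's connection credentials), so substitute whatever your deployment expects.
# pip install psycopg2-binary
import psycopg2

# host and port match the [proxy] section of .hati/config.toml;
# dbname and user are placeholder values, adjust them for your setup
conn = psycopg2.connect(host="127.0.0.1", port=5439, dbname="hatidata", user="hati")
cur = conn.cursor()
cur.execute("""
    SELECT namespace, COUNT(*) AS memory_count
    FROM _hatidata_memory.memories
    GROUP BY namespace
""")
for namespace, memory_count in cur.fetchall():
    print(f"{namespace}: {memory_count} memories")
cur.close()
conn.close()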