Introducing HatiData: Agent-Native Data Infrastructure
Why We Built HatiData
Every team building production AI agents assembles the same fragile stack. A vector database for memory. A separate SQL database for structured data. Redis for session state. A tracing tool for debugging. A cloud data platform for analytics. Five vendors, five bills, five failure modes, five sets of credentials to manage, five services to monitor, and five potential points of data leakage.
We watched this pattern repeat across dozens of teams — from startups building their first agent to enterprises deploying multi-agent systems at scale. The integration code alone consumed weeks of engineering time. The operational overhead consumed more. And the resulting architecture was brittle: when one component went down, the entire agent pipeline stalled.
We built HatiData to collapse this stack into one platform. Not by building yet another wrapper, but by designing a data platform from the ground up around the requirements of autonomous AI agents.
What HatiData Is
HatiData is an agent-native data platform that combines SQL, vector search, persistent memory, chain-of-thought auditing, semantic triggers, and branch isolation — in a single deployment that runs in your VPC.
At its core, HatiData is a query engine. It speaks standard Postgres wire protocol. Any tool that connects to Postgres — psql, DBeaver, SQLAlchemy, Prisma, dbt — connects to HatiData without modification. Under the hood, queries are executed by an embedded columnar engine with data stored in an open table format. This means columnar performance on analytical workloads, open storage formats with no vendor lock-in, and sub-second query latency for the kind of small, frequent queries that agents run.
But HatiData is more than a query engine. It is a platform that understands agents as first-class entities. Agents have identities, permissions, memory, reasoning trails, and compute quotas — all managed through SQL and a control plane API.
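To make that concrete, here is a sketch of what managing an agent through SQL could look like. The statements below are purely illustrative: the syntax (a CREATE AGENT command, granting to an agent identity) is an assumption for the sake of the example, not HatiData's documented API.

```sql
-- Hypothetical sketch only: not HatiData's documented syntax.
CREATE AGENT support_agent;                      -- identity
GRANT SELECT ON tickets TO AGENT support_agent;  -- permissions
```

The point is the shape of the interface: agent identity and access control live in the same SQL surface as the data itself, rather than in a separate admin console.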
What Makes HatiData Different
Standard SQL Interface
HatiData speaks Postgres wire protocol. You connect with `psql -h localhost -p 5439`. Your existing SQL tools, ORMs, and libraries work without changes. There is no proprietary query language to learn, no SDK lock-in, no vendor-specific API to integrate.
For agent-native features, HatiData extends SQL with functions like store_memory(), semantic_match(), and semantic_rank(). These are standard SQL function calls — they work in SELECT statements, WHERE clauses, and JOIN conditions.
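For instance, assuming signatures along these lines (the argument lists here are illustrative, not the documented ones), the extension functions compose with ordinary SQL:

```sql
-- Store a fact in agent memory (argument list assumed for illustration)
SELECT store_memory('support-agent-1', 'customer prefers email follow-ups');

-- Semantic functions inside a normal query: filter in WHERE, order by rank
SELECT id, subject
FROM tickets
WHERE status = 'open'
  AND semantic_match(body, 'billing dispute')
ORDER BY semantic_rank(body, 'billing dispute') DESC
LIMIT 10;
```

Because these are plain function calls, the planner can combine semantic predicates with ordinary filters and joins in a single query.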
In-VPC Deployment
The HatiData data plane runs in your network. Your data never traverses the public internet, never lands on shared infrastructure, never leaves your VPC. For regulated industries — financial services, healthcare, government — this is not a nice-to-have. It is a requirement.
The control plane (auth, billing, governance) runs on HatiData's infrastructure and communicates with the data plane over encrypted channels. But query execution, data storage, and agent memory all stay in your environment.
24 MCP Tools
HatiData ships with 24 Model Context Protocol (MCP) tools that work with Claude, Cursor, and any MCP-compatible client. These tools cover the full lifecycle of agent data operations:
- Memory tools: store_memory, search_memory, get_agent_state, set_agent_state, delete_memory
- Reasoning tools: log_reasoning_step, replay_decision, get_session_history
- Branch tools: branch_create, branch_query, branch_merge, branch_discard, branch_list
- Trigger tools: register_trigger, list_triggers, delete_trigger, test_trigger
- Query tools: query, list_tables, describe_table, and more
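Because these are standard MCP tools, any MCP-compatible client invokes them with an ordinary JSON-RPC `tools/call` request. The request shape below follows the MCP specification; the argument names are illustrative, not HatiData's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_memory",
    "arguments": {
      "agent_id": "support-agent-1",
      "query": "customer contact preferences"
    }
  }
}
```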
Per-Second Billing with Auto-Suspend
Traditional cloud data platforms bill in 60-second minimums: an agent that runs a 200-millisecond query pays for a full minute. HatiData bills per second and auto-suspends after 5 seconds of inactivity. When your agent is not querying, you pay nothing. When it resumes, cold start is under 500 milliseconds.
For a typical agent workload — hundreds of short queries per day with long idle periods — this translates to a 10-30x cost reduction compared to legacy platforms with minimum billing increments. As a rough illustration: 500 sub-second queries, each followed by the 5-second suspend window, bill about 2,500 seconds of compute; the same 500 queries at a 60-second minimum bill 30,000 seconds, a 12x difference.
Open Storage, No Lock-In
Data is stored in an open table format on your own object storage (S3, GCS, or Azure Blob Storage). If you ever want to stop using HatiData, your data is right where you left it — readable by Spark, Trino, Flink, or any compatible engine. There is no proprietary format, no export process, no migration headache.
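As an illustration of what "no lock-in" means in practice, an external engine configured against the same object storage and catalog can query the same files directly. The catalog and table names below are placeholders, and the engine-side configuration is assumed:

```sql
-- Reading the same table files from another engine (e.g. Trino),
-- assuming it is pointed at the same object storage and catalog.
SELECT count(*) FROM my_catalog.agent_data.memories;
```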
Architecture
HatiData has three components:
Data Plane — Proxy: Speaks Postgres wire protocol. Receives SQL queries, runs them through a multi-stage query pipeline (policy check, cost estimate, quota check, row filter, AI transpile, execute, column mask, meter, audit), and returns results. Runs in your VPC.
Control Plane: An API server with 100+ endpoints handling authentication (JWT, API key, federated), authorization (ABAC), billing, organization management, and governance. Runs on HatiData infrastructure.
MCP Server: Exposes 24 agent-native tools via the Model Context Protocol. Connects Claude, Cursor, and other MCP-compatible clients to HatiData's full feature set.
What's Next
We are currently onboarding the Founding 20 — our first design partners who will shape the product's roadmap. If you are building production AI agents and want infrastructure designed for agents rather than repurposed from analytics, we want to hear from you.
Get started locally:
curl -fsSL https://hatidata.com/install.sh | sh
hati init

Or explore the framework integrations for LangChain, CrewAI, and AutoGen in our documentation.