Chain-of-Thought Ledger: Tamper-Proof Agent Reasoning

HatiData Team · 8 min read

The Auditability Gap in Agent Systems

When an AI agent makes a consequential decision — approving a loan, escalating a support ticket, modifying production data — someone will eventually ask: why did the agent do that? The answer is usually unsatisfying. Most agent frameworks provide logs of tool calls and API responses, but not the reasoning that connected them. You can see that the agent called a database query, received results, and then took an action, but you cannot see the intermediate thinking that led from data to decision.

This gap matters for compliance, debugging, and trust. Regulated industries need audit trails that show not just what happened, but why. Engineers debugging unexpected behavior need to trace the agent's reasoning chain to find where it went wrong. And organizations building trust in autonomous agents need evidence that the agent's decision-making process was sound.

HatiData's chain-of-thought (CoT) ledger fills this gap with an append-only, hash-chained record of every reasoning step an agent logs. Each step stores a cryptographic hash that covers the previous step's hash, so any alteration made after the fact breaks the chain and can be detected. Once a step is recorded, it cannot be modified, deleted, or reordered.

How Hash Chaining Works

The CoT ledger uses the same hash-linking principle as a blockchain, applied at the session level. Each reasoning step includes the hash of the previous step, which links every step in a session back to the first.

When an agent logs a new reasoning step, HatiData computes a cryptographic hash of the step's content concatenated with the previous step's hash:

step_hash = hash(previous_hash + step_type + content + timestamp)

The first step in a session has no previous hash, so it uses a well-known genesis value. Each subsequent step's hash depends on every step that came before it: changing any single step changes its hash and invalidates every later hash, so tampering is detected as soon as the chain is verified.

Session Start
    |
Step 1: [observation] hash_1 = hash(genesis + "observation" + content_1 + ts_1)
    |
Step 2: [hypothesis] hash_2 = hash(hash_1 + "hypothesis" + content_2 + ts_2)
    |
Step 3: [tool_call]  hash_3 = hash(hash_2 + "tool_call" + content_3 + ts_3)
    |
Step 4: [decision]   hash_4 = hash(hash_3 + "decision" + content_4 + ts_4)
    |
...continues...
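The chaining rule can be sketched in a few lines of Python. This is an illustration, not HatiData's implementation: the post does not name the hash function, the genesis value, or the field encoding, so SHA-256, an all-zero genesis hash, and JSON-array encoding of the four fields are assumptions made here.

```python
import hashlib
import json

# Assumed genesis value; the post only says it is "well-known".
GENESIS_HASH = "0" * 64


def step_hash(previous_hash: str, step_type: str, content: str, timestamp: str) -> str:
    """Hash one step together with the hash of the step before it."""
    # Encoding the fields as a JSON array (an assumption) keeps field
    # boundaries unambiguous, which plain string concatenation does not.
    payload = json.dumps([previous_hash, step_type, content, timestamp])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(steps):
    """Return the list of hashes for (step_type, content, timestamp) tuples."""
    hashes = []
    previous = GENESIS_HASH
    for step_type, content, timestamp in steps:
        previous = step_hash(previous, step_type, content, timestamp)
        hashes.append(previous)
    return hashes


steps = [
    ("observation", "Query volume rose 40% in two weeks.", "2026-03-01T10:00:00Z"),
    ("hypothesis", "The customer is expanding analytics usage.", "2026-03-01T10:00:05Z"),
    ("decision", "Recommend a capacity review.", "2026-03-01T10:00:09Z"),
]
original = build_chain(steps)

# Rewrite step 2 after the fact and rebuild the chain.
edited = [steps[0], ("hypothesis", "Edited later.", steps[1][2]), steps[2]]
tampered = build_chain(edited)

print(original[0] == tampered[0])  # step 1 is unchanged
print(original[1] == tampered[1])  # step 2 now hashes differently
print(original[2] == tampered[2])  # so does step 3, although its content is the same
```

The last comparison is the point of the design: step 3 was never edited, yet its hash no longer matches, because it was computed over the hash of step 2.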

HatiData maintains a per-session hash chain tracker that maps session IDs to the most recent hash. This allows multiple agents to log reasoning steps to different sessions concurrently without blocking each other.
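One way such a tracker could work is a map from session ID to latest hash, guarded by one lock per session so that writers to different sessions never wait on each other. The class below is a hypothetical sketch: `SessionChainTracker`, its methods, the SHA-256 hash, and the all-zero genesis value are assumptions, not HatiData's actual internals.

```python
import hashlib
import json
import threading

GENESIS_HASH = "0" * 64  # assumed genesis value


class SessionChainTracker:
    """Maps each session ID to the most recent hash in that session's chain."""

    def __init__(self):
        self._heads = {}
        self._locks = {}
        self._registry_lock = threading.Lock()

    def _lock_for(self, session_id):
        # The registry lock is held only long enough to look up or create the
        # per-session lock, so sessions do not wait on each other's appends.
        with self._registry_lock:
            return self._locks.setdefault(session_id, threading.Lock())

    def append(self, session_id, step_type, content, timestamp):
        """Extend one session's chain and return (previous_hash, new_hash)."""
        with self._lock_for(session_id):
            previous = self._heads.get(session_id, GENESIS_HASH)
            payload = json.dumps([previous, step_type, content, timestamp])
            new_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()
            self._heads[session_id] = new_hash
            return previous, new_hash

    def head(self, session_id):
        return self._heads.get(session_id, GENESIS_HASH)


tracker = SessionChainTracker()


def worker(session_id):
    for i in range(100):
        tracker.append(session_id, "observation", f"{session_id} step {i}", "2026-03-01T10:00:00Z")


# Two agents write to two different sessions at the same time.
threads = [threading.Thread(target=worker, args=(sid,)) for sid in ("sess_a", "sess_b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Reading the previous hash and writing the new one happen under the same per-session lock, which is what keeps two concurrent appends to one session from both chaining onto the same parent.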

Step Types

Each reasoning step has a type that categorizes it within the agent's thinking process. HatiData defines 12 step types that cover the full spectrum of agent reasoning:

  • observation — The agent notices something in its input or environment
  • hypothesis — The agent forms a tentative explanation or prediction
  • decision — The agent commits to a course of action
  • action — The agent executes an action (calling a tool, writing data, etc.)
  • reflection — The agent evaluates the outcome of a previous action
  • error — The agent encounters an error and records it
  • planning — The agent creates or updates a plan for achieving a goal
  • evaluation — The agent assesses options against criteria
  • tool_call — The agent invokes an external tool (with parameters recorded)
  • tool_result — The result returned from a tool invocation
  • memory_recall — The agent retrieves information from its long-term memory
  • context_switch — The agent shifts focus to a different subtask or topic

Each step is stored as a trace record with fields including the session ID, step number, step type, content, confidence score (optional), branch ID (if applicable), parent hash, current hash, and timestamps.
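As a rough picture of that record, here is a Python dataclass carrying the fields listed above. The attribute names, the single `created_at` timestamp, and the 0 to 1 confidence range are assumptions for illustration; only the twelve step type names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

STEP_TYPES = frozenset({
    "observation", "hypothesis", "decision", "action", "reflection", "error",
    "planning", "evaluation", "tool_call", "tool_result", "memory_recall",
    "context_switch",
})


@dataclass(frozen=True)  # frozen mirrors the ledger's write-once semantics
class TraceRecord:
    session_id: str
    step_number: int
    step_type: str
    content: str
    parent_hash: str
    current_hash: str
    created_at: str                      # assumed name for the step timestamp
    confidence: Optional[float] = None   # optional, per the field list
    branch_id: Optional[str] = None      # set only for branched reasoning

    def __post_init__(self):
        if self.step_type not in STEP_TYPES:
            raise ValueError(f"unknown step type: {self.step_type!r}")
        # Assumed range; the post shows 0.85 but does not state bounds.
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


record = TraceRecord(
    session_id="sess_research_2026-03-01",
    step_number=1,
    step_type="observation",
    content="Query volume rose 40% over the past 2 weeks.",
    parent_hash="0" * 64,
    current_hash="placeholder-hash",     # would be computed by the ledger
    created_at="2026-03-01T10:00:00Z",
    confidence=0.85,
)
```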

Logging Reasoning Steps

Agents log reasoning steps through the log_reasoning_step MCP tool:

json
{
  "tool": "log_reasoning_step",
  "arguments": {
    "session_id": "sess_research_2026-03-01",
    "step_type": "observation",
    "content": "The customer's usage patterns show a 40% increase in query volume over the past 2 weeks, primarily in the analytics namespace. This suggests growing reliance on data-driven decision making.",
    "confidence": 0.85,
    "metadata": {
      "data_source": "usage_metrics",
      "time_range": "14d"
    }
  }
}

The response includes the computed hash for this step, which the agent can cite in later steps to state explicitly which earlier step it is building on.

Multiple agents can log to the same session if they are collaborating on a task, and each agent's contributions are identified by its agent ID. Because the chain is per-session, not per-agent, the interleaved steps from all agents form a single sequence whose integrity can be verified end to end.
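On the agent side, assembling the tool call is mostly a matter of building the JSON shown earlier. The helper below is a hypothetical convenience function, not part of any HatiData SDK; it assumes `confidence` and `metadata` may be left out when there is nothing to report.

```python
import json


def build_log_step_call(session_id, step_type, content, confidence=None, metadata=None):
    """Build the payload for a log_reasoning_step tool call."""
    arguments = {
        "session_id": session_id,
        "step_type": step_type,
        "content": content,
    }
    # Optional fields are omitted rather than sent as null (an assumption).
    if confidence is not None:
        arguments["confidence"] = confidence
    if metadata:
        arguments["metadata"] = metadata
    return {"tool": "log_reasoning_step", "arguments": arguments}


call = build_log_step_call(
    session_id="sess_research_2026-03-01",
    step_type="hypothesis",
    content="The usage increase reflects a new analytics team onboarding.",
    confidence=0.6,
    metadata={"data_source": "usage_metrics"},
)
print(json.dumps(call, indent=2))
```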

Append-Only Enforcement

The CoT ledger enforces immutability at the database level through an append-only enforcer. This component intercepts any SQL statement that targets CoT tables and blocks operations that would modify existing data:

  • UPDATE on CoT tables — blocked
  • DELETE from CoT tables — blocked
  • TRUNCATE of CoT tables — blocked
  • DROP of CoT tables — blocked

Only INSERT (append) and SELECT (read) operations are allowed. This enforcement happens in HatiData's query pipeline, before the statement reaches the query engine, so it cannot be bypassed by creative SQL.

The enforcer pattern ensures that even if an agent has admin-level database access, it cannot modify its own reasoning history. This is a critical property for compliance — auditors need confidence that the audit trail reflects what actually happened, not what someone wished had happened afterward.
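The idea can be illustrated with a small allowlist check that runs before a statement is executed. This is a deliberately simple sketch: the table name `cot_traces` is hypothetical, and a production enforcer would inspect a parsed statement rather than scan words, so details such as semicolons inside string literals are not handled here.

```python
import re

COT_TABLES = frozenset({"cot_traces"})       # hypothetical table name
ALLOWED_VERBS = frozenset({"INSERT", "SELECT"})

_COMMENT = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)
_WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class AppendOnlyViolation(Exception):
    """Raised when a statement would modify existing CoT data."""


def enforce_append_only(sql: str) -> None:
    """Reject any statement on a CoT table that is not INSERT or SELECT."""
    # Strip comments first so a leading comment cannot hide the real verb.
    for statement in _COMMENT.sub(" ", sql).split(";"):
        words = _WORD.findall(statement)
        if not words:
            continue
        touches_cot = any(word.lower() in COT_TABLES for word in words)
        verb = words[0].upper()
        # Allowlisting verbs is stricter than listing the four blocked ones:
        # anything unrecognized is rejected by default.
        if touches_cot and verb not in ALLOWED_VERBS:
            raise AppendOnlyViolation(f"{verb} is not allowed on CoT tables")


def is_allowed(sql: str) -> bool:
    try:
        enforce_append_only(sql)
        return True
    except AppendOnlyViolation:
        return False


for sql in (
    "SELECT * FROM cot_traces WHERE session_id = 's1'",
    "UPDATE cot_traces SET content = 'revised'",
    "/* cleanup */ DROP TABLE cot_traces",
    "SELECT 1; TRUNCATE cot_traces",
):
    print(is_allowed(sql), sql)
```

The check errs on the side of rejection: a statement that merely mentions a CoT table name is treated as targeting it.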

Session Replay

The replay_decision MCP tool retrieves the complete reasoning chain for a given session, presented in chronological order with hash verification:

json
{
  "tool": "replay_decision",
  "arguments": {
    "session_id": "sess_research_2026-03-01"
  }
}

The response includes every step in the session, with the full hash chain. The client can verify chain integrity by recomputing each hash from the previous hash and the step's type, content, and timestamp. If every computed hash matches the stored hash, the chain is intact.
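A client-side check along these lines might look like the sketch below. The field names (`step_type`, `content`, `timestamp`, `hash`), the SHA-256 hash, the JSON field encoding, and the all-zero genesis value are assumptions; a real client would have to match whatever the server actually uses.

```python
import copy
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed genesis value


def step_hash(previous_hash, step_type, content, timestamp):
    payload = json.dumps([previous_hash, step_type, content, timestamp])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def first_broken_step(steps):
    """Return the zero-based index of the first step whose stored hash does
    not match the recomputed one, or None if the whole chain is intact."""
    previous = GENESIS_HASH
    for index, step in enumerate(steps):
        expected = step_hash(previous, step["step_type"], step["content"], step["timestamp"])
        if step["hash"] != expected:
            return index
        previous = expected
    return None


# Build a small, valid session to verify.
session = []
previous = GENESIS_HASH
for step_type, content, ts in [
    ("observation", "Query volume rose 40%.", "2026-03-01T10:00:00Z"),
    ("hypothesis", "Analytics usage is expanding.", "2026-03-01T10:00:05Z"),
    ("decision", "Recommend a capacity review.", "2026-03-01T10:00:09Z"),
]:
    digest = step_hash(previous, step_type, content, ts)
    session.append({"step_type": step_type, "content": content, "timestamp": ts, "hash": digest})
    previous = digest

intact = first_broken_step(session)            # None: nothing was changed

edited = copy.deepcopy(session)
edited[1]["content"] = "Edited later."
content_only = first_broken_step(edited)       # 1: the edited step fails

# Recomputing the edited step's own hash only moves the failure downstream.
edited[1]["hash"] = step_hash(
    session[0]["hash"], "hypothesis", "Edited later.", edited[1]["timestamp"]
)
hash_patched = first_broken_step(edited)       # 2: the next step no longer links
```

To hide an edit, every later hash in the session would have to be rewritten as well, which is the operation the append-only enforcement is there to prevent.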

HatiData also provides a verify_chain function that performs this verification server-side and returns a boolean result along with the position of any integrity violation. This is useful for automated compliance checks that run periodically.

json
{
  "session_id": "sess_research_2026-03-01",
  "chain_valid": true,
  "total_steps": 24,
  "step_types": {
    "observation": 6,
    "hypothesis": 3,
    "tool_call": 5,
    "tool_result": 5,
    "decision": 2,
    "reflection": 2,
    "memory_recall": 1
  },
  "duration_seconds": 847,
  "first_step_at": "2026-03-01T10:00:00Z",
  "last_step_at": "2026-03-01T10:14:07Z"
}
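The summary fields in this response are all derivable from the step list itself. As a sketch of how they relate, the function below rebuilds them from a list of steps; the `step_type` and `timestamp` field names are assumptions, and the function expects at least one step.

```python
from collections import Counter
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def summarize_session(session_id, steps, chain_valid):
    """Build a summary shaped like the response above from a non-empty,
    chronologically ordered list of steps."""
    counts = Counter(step["step_type"] for step in steps)
    first_at = steps[0]["timestamp"]
    last_at = steps[-1]["timestamp"]
    elapsed = datetime.strptime(last_at, TS_FORMAT) - datetime.strptime(first_at, TS_FORMAT)
    return {
        "session_id": session_id,
        "chain_valid": chain_valid,
        "total_steps": len(steps),
        "step_types": dict(counts),
        "duration_seconds": int(elapsed.total_seconds()),
        "first_step_at": first_at,
        "last_step_at": last_at,
    }


summary = summarize_session(
    "sess_research_2026-03-01",
    [
        {"step_type": "observation", "timestamp": "2026-03-01T10:00:00Z"},
        {"step_type": "tool_call", "timestamp": "2026-03-01T10:05:30Z"},
        {"step_type": "decision", "timestamp": "2026-03-01T10:14:07Z"},
    ],
    chain_valid=True,
)
print(summary["duration_seconds"])  # 14 minutes 7 seconds, i.e. 847
```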

Embedding Sampling

Not every reasoning step needs a vector embedding — computing embeddings for thousands of steps per session would be wasteful. HatiData uses a sampling strategy that embeds a configurable percentage of steps (default 10%) while always embedding steps with critical step types.

Critical step types that are always embedded regardless of the sampling rate:

  • decision — Every decision the agent makes gets an embedding for semantic search
  • error — Error steps are always searchable so you can find similar errors across sessions
  • reflection — Reflections often contain the most insightful content about agent behavior

For non-critical step types (observation, tool_call, tool_result, etc.), the sampling rate determines what percentage get embedded. The sampling is deterministic based on the step hash, so the same step always gets the same sampling decision. This means that replaying a session always includes the same embedded steps.
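Hash-based sampling of this kind is commonly done by mapping part of the hash to a number between 0 and 1 and comparing it with the rate. The sketch below assumes a hex-encoded step hash and uses its first 32 bits; the function name and the exact mapping are illustrative, not HatiData's.

```python
import hashlib

CRITICAL_STEP_TYPES = frozenset({"decision", "error", "reflection"})


def should_embed(step_type: str, step_hash_hex: str, sample_rate: float = 0.10) -> bool:
    """Decide whether a step gets an embedding."""
    if step_type in CRITICAL_STEP_TYPES:
        return True
    # Map the first 32 bits of the hash to a value in [0, 1). The same hash
    # always lands in the same spot, so the decision never changes on replay.
    bucket = int(step_hash_hex[:8], 16) / 0x1_0000_0000
    return bucket < sample_rate


# Stand-in hashes for 10,000 non-critical steps.
hashes = [hashlib.sha256(f"step-{i}".encode("utf-8")).hexdigest() for i in range(10_000)]
embedded = sum(should_embed("observation", h) for h in hashes)
print(f"{embedded / len(hashes):.1%} of observation steps embedded")
```

Because the decision depends only on the step type, the hash, and the rate, no per-step sampling state has to be stored. Changing the configured rate changes which steps fall under the threshold, so the guarantee of identical decisions holds for a fixed rate.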

Embedded reasoning steps are searchable through HatiData's hybrid search, allowing queries like "find all sessions where the agent decided to escalate based on customer sentiment" or "show me error patterns across the past week."

Integration with hatiOS

The CoT ledger integrates with hatiOS (HatiData's companion product for AI agent governance) through three dedicated API endpoints:

  • POST /v1/cot/ingest — Accepts reasoning traces from hatiOS-managed agents and stores them in the HatiData CoT ledger with hash chain verification
  • GET /v1/cot/sessions/{id}/replay — Returns the full session replay for display in the hatiOS dashboard
  • POST /v1/cot/approvals — Records human approval or rejection decisions linked to specific reasoning sessions

This integration means that agents governed by hatiOS's policy engine have their reasoning automatically recorded in HatiData's tamper-proof ledger, creating a complete picture of both what the agent did and why it did it.
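For orientation, the three endpoints could be addressed from Python as shown below. Only the methods and paths come from the list above. The host name, the request body fields (`steps`, `decision`, `reviewer`), and the absence of authentication headers are placeholders, so treat this as a shape sketch rather than a working client.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://hatidata.example"  # placeholder host, not a real endpoint


def _json_request(method, path, body=None):
    """Build (but do not send) a request with an optional JSON body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def ingest_request(session_id, steps):
    # Body fields are assumptions; the post does not document the schema.
    return _json_request("POST", "/v1/cot/ingest", {"session_id": session_id, "steps": steps})


def replay_request(session_id):
    return _json_request("GET", f"/v1/cot/sessions/{quote(session_id, safe='')}/replay")


def approval_request(session_id, approved, reviewer):
    # Body fields are assumptions; the post does not document the schema.
    body = {
        "session_id": session_id,
        "decision": "approved" if approved else "rejected",
        "reviewer": reviewer,
    }
    return _json_request("POST", "/v1/cot/approvals", body)


request = replay_request("sess_research_2026-03-01")
print(request.get_method(), request.full_url)
```

Each helper returns a `urllib.request.Request` that could be passed to `urllib.request.urlopen` once a real base URL and credentials are filled in.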

Compliance Applications

The CoT ledger supports several regulatory and audit requirements:

  • SOC 2 Type II — Immutable audit trails with cryptographic integrity verification
  • GDPR Article 22 — Documentation of automated decision-making logic, supporting the right to explanation
  • Financial regulations (MiFID II, SEC) — Record-keeping requirements for automated trading and advisory decisions
  • Healthcare (HIPAA) — Audit trails for AI-assisted clinical decisions

For each of these, the key property is that the reasoning record is tamper-proof and complete. The hash chain provides cryptographic evidence that the record has not been modified, and the append-only enforcement prevents deletion of inconvenient entries.

Next Steps

The CoT ledger is most valuable when combined with persistent memory (so the agent can reference past reasoning) and branch isolation (so exploratory reasoning is tagged with the branch context). See the LangChain CoT replay cookbook for a complete integration example, and the compliance documentation for detailed regulatory mapping.
