Branch Isolation: Safe Exploration for AI Agents

HatiData Team · March 1, 2026 · 8 min read

The Exploration Problem

AI agents need to explore. A financial analysis agent might want to test what happens to a portfolio under different market scenarios. A data engineering agent might want to try different transformation strategies before committing to one. A research agent might want to pursue multiple hypotheses simultaneously.

But exploration is dangerous when it happens on production data. If the agent writes incorrect results, corrupts a table, or inserts millions of junk rows, the damage is immediate and potentially irreversible. Traditional solutions — copying entire databases, using transactions with savepoints, or running separate database instances — are either too slow, too expensive, or too complex for agent-scale workloads.

HatiData solves this with branch isolation: a lightweight mechanism for creating isolated copies of an agent's data environment that are cheap to create, fast to query, and safe to discard. Branches use the engine's native schema system, which means they have zero overhead for read-only exploration and minimal overhead for write operations.

How Branch Isolation Works

Schema-Based Isolation

Each branch in HatiData is an isolated schema with a UUID-based name: branch_{uuid}. When an agent creates a branch, HatiData creates a new schema and populates it with views that point to the tables in the main schema. No data is copied — the views are zero-cost references to the existing data.

sql

-- What HatiData does internally when creating a branch
CREATE SCHEMA branch_a1b2c3d4;

-- For each table in the main schema, create a view
CREATE VIEW branch_a1b2c3d4.customers AS
    SELECT * FROM main.customers;
CREATE VIEW branch_a1b2c3d4.orders AS
    SELECT * FROM main.orders;
CREATE VIEW branch_a1b2c3d4.products AS
    SELECT * FROM main.products;

Queries within the branch see exactly the same data as the main schema, but in a completely isolated namespace. The agent can reference customers within its branch context without any ambiguity — HatiData sets the search_path to the branch schema before executing each query.
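To make the setup step concrete, here is a minimal Python sketch that generates the same statements for an arbitrary table list. The function name `branch_creation_sql` and its parameters are hypothetical and not part of HatiData's API; the sketch only mirrors the SQL shown above, and it validates identifiers because they are interpolated into the statement text rather than bound as parameters.

```python
import re

# Conservative identifier check: names are interpolated into SQL text,
# so anything outside [A-Za-z0-9_] is rejected.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def branch_creation_sql(branch_suffix: str, tables: list[str],
                        main_schema: str = "main") -> list[str]:
    """Return the statements that set up a view-only branch schema.

    Hypothetical helper for illustration; HatiData's internals may differ.
    """
    schema = f"branch_{branch_suffix}"
    for name in [schema, main_schema, *tables]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    statements = [f"CREATE SCHEMA {schema};"]
    for table in tables:
        # One view per main table; no rows are copied.
        statements.append(
            f"CREATE VIEW {schema}.{table} AS SELECT * FROM {main_schema}.{table};"
        )
    return statements


if __name__ == "__main__":
    for stmt in branch_creation_sql("a1b2c3d4", ["customers", "orders", "products"]):
        print(stmt)
```

The number of statements grows with the number of tables, not with the amount of data, which is consistent with branch creation being cheap.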

Copy-on-Write Materialization

The zero-copy views work perfectly for read-only exploration. But when an agent writes to a table in a branch — inserting rows, updating values, or deleting records — HatiData needs to materialize that table. This is the copy-on-write step: the first write to any table triggers a full copy from the main schema into the branch schema, replacing the view with a real table.

sql

-- First write to customers in the branch triggers materialization
DROP VIEW branch_a1b2c3d4.customers;
CREATE TABLE branch_a1b2c3d4.customers AS
    SELECT * FROM main.customers;

-- Now the write can proceed on the materialized copy
INSERT INTO branch_a1b2c3d4.customers (id, name, segment)
    VALUES ('cust_new', 'Acme Corp', 'enterprise');

After materialization, the branch has its own independent copy of that table. Changes to the branch table do not affect the main schema, and later changes to that table in the main schema do not appear in the branch. Tables that the agent never writes to remain zero-cost views, so they continue to reflect the live main data.

This copy-on-write approach means that a branch exploring 1 table out of 50 only copies that 1 table. The other 49 tables consume zero additional storage.
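The copy-on-write behavior can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not HatiData's implementation: tables are plain Python lists of row dictionaries, the class name `Branch` and its methods are hypothetical, and only inserts are modeled.

```python
import copy


class Branch:
    """Toy model of a branch: unmodified tables are read through to main,
    modified tables are served from a private copy."""

    def __init__(self, main: dict[str, list[dict]]):
        self._main = main        # shared reference, plays the role of the views
        self._materialized = {}  # table name -> private copy, made on first write

    def read(self, table: str) -> list[dict]:
        # Materialized tables come from the branch, everything else from main.
        return self._materialized.get(table, self._main[table])

    def insert(self, table: str, row: dict) -> None:
        if table not in self._materialized:
            # First write to this table: copy it in full (the copy-on-write step).
            self._materialized[table] = copy.deepcopy(self._main[table])
        self._materialized[table].append(row)

    def materialized_tables(self) -> set[str]:
        return set(self._materialized)


main = {
    "customers": [{"id": "cust_1", "name": "Initech"}],
    "orders": [{"id": "ord_1"}],
}
branch = Branch(main)
branch.insert("customers", {"id": "cust_new", "name": "Acme Corp"})
main["orders"].append({"id": "ord_2"})  # a concurrent change in main
```

In this model the branch sees two customers while main still has one, and the branch sees both orders because `orders` was never written to and is still read through to main. Only `customers` has been copied.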

The Branch Lifecycle

A complete branch lifecycle follows this pattern:

1. Create: Agent calls branch_create with an optional description. HatiData creates the schema and populates it with views. Returns a branch ID.

2. Query: Agent calls branch_query to read data within the branch. Reads go through the views for unmodified tables, through materialized copies for modified tables.

3. Write: Agent writes to tables within the branch. First write to each table triggers copy-on-write materialization. Subsequent writes to the same table operate on the materialized copy.

4. Merge or Discard: Agent either merges changes back to main with branch_merge, or discards the branch with branch_discard.

json

{
  "tool": "branch_create",
  "arguments": {
    "description": "Testing pricing model changes for enterprise tier"
  }
}

Response:

json

{
  "branch_id": "br_a1b2c3d4",
  "schema": "branch_a1b2c3d4",
  "tables": 12,
  "created_at": "2026-03-01T10:00:00Z"
}

Merge Strategies

When an agent is satisfied with the changes in a branch, it merges them back into the main schema. But merging is not always straightforward — the main schema may have changed since the branch was created, creating conflicts. HatiData supports four merge strategies to handle different conflict scenarios.

BranchWins

The simplest strategy: if a conflict exists, the branch version takes precedence. Any rows in the main schema that conflict with branch rows are overwritten. This is appropriate when the agent's branch work is authoritative and should supersede any concurrent changes.

MainWins

The opposite: main schema data takes precedence. Only branch changes that do not conflict with main are applied. This is useful for tentative exploration where the agent wants to keep non-conflicting additions without overwriting concurrent updates.

Manual

HatiData detects conflicts and reports them without merging. The response includes a list of conflicting rows with both the branch and main versions. A human operator (or a different agent) can then resolve each conflict individually.

Abort

If any conflicts are detected, the entire merge is aborted. No changes are applied to the main schema. This is the safest option for workflows where conflicts indicate a problem that needs investigation.

json

{
  "tool": "branch_merge",
  "arguments": {
    "branch_id": "br_a1b2c3d4",
    "strategy": "branch_wins"
  }
}

Conflict detection works at the row level using primary keys. If a row exists in both the main schema and the branch with the same primary key but different values, that is a conflict. New rows in the branch (no matching primary key in main) are always added without conflict.
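The following Python sketch shows one way the four strategies and the primary-key conflict rule could fit together for a single table. It is an illustration, not HatiData's merge code: the names `Strategy`, `find_conflicts`, and `merge` are hypothetical, each table is modeled as a dictionary from primary key to row, row deletions are not modeled, and it assumes that Manual and Abort leave main untouched whenever at least one conflict exists.

```python
from enum import Enum


class Strategy(Enum):
    BRANCH_WINS = "branch_wins"
    MAIN_WINS = "main_wins"
    MANUAL = "manual"
    ABORT = "abort"


def find_conflicts(main: dict, branch: dict) -> list[tuple]:
    """Rows present on both sides with the same primary key but different values.

    Returns (primary_key, main_row, branch_row) tuples, ordered by key.
    """
    keys = sorted(pk for pk in branch if pk in main and main[pk] != branch[pk])
    return [(pk, main[pk], branch[pk]) for pk in keys]


def merge(main: dict, branch: dict, strategy: Strategy) -> tuple[dict, list[tuple]]:
    """Return (new_main, conflicts) without mutating either input."""
    conflicts = find_conflicts(main, branch)
    if conflicts and strategy in (Strategy.MANUAL, Strategy.ABORT):
        # Report conflicts and apply nothing.
        return dict(main), conflicts

    conflict_keys = {pk for pk, _, _ in conflicts}
    merged = dict(main)
    for pk, row in branch.items():
        if pk not in main:
            merged[pk] = row  # new rows never conflict
        elif pk in conflict_keys and strategy is Strategy.BRANCH_WINS:
            merged[pk] = row  # branch version overwrites main
        # MAIN_WINS: conflicting rows keep the main version
    return merged, conflicts


main_rows = {"c1": {"segment": "smb"}, "c2": {"segment": "mid"}}
branch_rows = {
    "c1": {"segment": "enterprise"},  # same key, different value: conflict
    "c2": {"segment": "mid"},         # unchanged
    "c3": {"segment": "smb"},         # new row
}
```

With this data, `merge(main_rows, branch_rows, Strategy.BRANCH_WINS)` overwrites `c1` and adds `c3`, `Strategy.MAIN_WINS` keeps the main version of `c1` and still adds `c3`, and both `Strategy.MANUAL` and `Strategy.ABORT` return main unchanged along with the single reported conflict.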

Garbage Collection

Branches consume resources — schemas, materialized tables, and metadata entries. Left unchecked, abandoned branches could accumulate and waste storage. HatiData includes automatic garbage collection that cleans up unused branches.

The garbage collector tracks two metrics per branch:

Reference count — An atomic counter incremented when a query targets the branch and decremented when the query completes. A branch with zero references has no active queries.
Last accessed timestamp — Updated on every query or write to the branch.

Branches are eligible for garbage collection when they have zero active references and have not been accessed for longer than the configured TTL (default 24 hours). The collector runs periodically (default every hour) and drops eligible branch schemas with all their contents.

Garbage Collector
    |
    +── Check reference counts (per branch)
    |
    +── Check last_accessed timestamps
    |
    +── For eligible branches:
    |       DROP SCHEMA branch_{uuid} CASCADE;
    |       Remove metadata entries
    |
    +── Log cleanup summary

Agents can also explicitly discard branches with branch_discard, which bypasses the TTL and immediately cleans up the branch resources. This is the recommended approach for well-behaved agents that know when their exploration is complete.
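The eligibility rule described above is simple enough to express in a few lines. The Python sketch below assumes a hypothetical `BranchStats` record and helper names; only the rule itself (zero active references and idle for longer than the TTL, 24 hours by default) comes from the description in this section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=24)


@dataclass
class BranchStats:
    schema: str              # e.g. "branch_a1b2c3d4"
    ref_count: int           # queries currently running against the branch
    last_accessed: datetime  # updated on every query or write


def eligible_for_gc(stats: BranchStats, now: datetime,
                    ttl: timedelta = DEFAULT_TTL) -> bool:
    """A branch is collectable when nothing is using it and it has been idle past the TTL."""
    return stats.ref_count == 0 and (now - stats.last_accessed) > ttl


def cleanup_statements(branches: list[BranchStats], now: datetime,
                       ttl: timedelta = DEFAULT_TTL) -> list[str]:
    """Return the DROP statements one collector pass would issue."""
    return [
        f"DROP SCHEMA {b.schema} CASCADE;"
        for b in branches
        if eligible_for_gc(b, now, ttl)
    ]


now = datetime(2026, 3, 2, 12, 0, tzinfo=timezone.utc)
branches = [
    BranchStats("branch_idle", 0, datetime(2026, 3, 1, 10, 0, tzinfo=timezone.utc)),
    BranchStats("branch_busy", 1, datetime(2026, 3, 1, 10, 0, tzinfo=timezone.utc)),
    BranchStats("branch_recent", 0, datetime(2026, 3, 2, 11, 0, tzinfo=timezone.utc)),
]
```

Here only `branch_idle` qualifies: `branch_busy` still has an active reference, and `branch_recent` was accessed an hour ago. Checking the reference count first is what keeps a long-running query from having its branch dropped underneath it.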

Practical Patterns

What-If Analysis

A financial agent creates a branch, modifies pricing assumptions in a configuration table, runs revenue projections, and compares results against the main schema. If the new pricing looks better, it merges the configuration change. If not, it discards the branch and tries different assumptions.

Safe Data Transformation

A data engineering agent creates a branch, applies a series of transformations to a staging table, validates the output against quality rules, and only merges if all validations pass. The main data is never at risk during the transformation development process.

Parallel Hypothesis Testing

A research agent creates multiple branches simultaneously, each pursuing a different hypothesis. Each branch modifies data independently — adding derived columns, filtering datasets, computing aggregations. The agent queries all branches, compares results, and merges the most promising one while discarding the rest.

A/B Testing Agent Strategies

Two agent instances each get their own branch with the same starting data. They apply different strategies, and the results are compared. The winning strategy's branch is merged while the other is discarded. This is useful for evaluating agent behavior without the agents interfering with each other.

Performance Characteristics

Branch operations are designed to be fast enough for interactive agent workflows:

Branch creation: Under 10ms for schemas with up to 100 tables, since no data is copied
Read queries: Identical performance to main schema queries, since views are transparent to the query optimizer
First write (materialization): Proportional to table size, as the full table must be copied. For a 1M row table with 10 columns, expect 200-500ms.
Subsequent writes: Identical performance to main schema writes, since the table is already materialized
Merge: Proportional to the number of changed rows, not total table size. Conflict detection uses primary key indexes.
Discard: Under 5ms, as DROP SCHEMA CASCADE is very fast

Next Steps

Branch isolation works particularly well with chain-of-thought logging — every reasoning step within a branch is tagged with the branch ID, so you can replay the exact thinking that led to each branch's results. See the branch isolation documentation for configuration options, and the OpenAI branch isolation cookbook for a complete agent implementation.
