
Shadow Mode: Zero-Risk Data Warehouse Migration

HatiData Team · March 4, 2026 · 6 min read

The Migration Fear

Data warehouse migrations fail for predictable reasons. The new system handles 95% of queries perfectly, but the other 5% produce subtly different results: rounding differences in floating point arithmetic, different null handling semantics, timezone conversion discrepancies, or edge cases in SQL dialect translation. These differences are hard to discover through pre-migration testing alone, because a test suite rarely covers every query shape and data edge case that production sees. They tend to surface in production, weeks or months after the migration, when someone notices that a report no longer matches expectations.

This fear keeps organizations on legacy infrastructure long after they have decided to move. The technical evaluation is complete, the cost analysis is favorable, but the migration itself is too risky. What if something breaks? What if the numbers do not match? What if we have to roll back after agents are already running on the new system?

Shadow Mode eliminates this fear by removing the need for a hard cutover. Instead of migrating all at once, you run both systems simultaneously and let the data prove that the new system is ready.

How Shadow Mode Works

Shadow Mode operates as a transparent proxy between your agents and your data infrastructure. Every query your agents send is executed twice — once against your existing warehouse and once against HatiData — and the results are automatically compared.

Agent sends query
    |
    v
Shadow Mode Proxy
    |
    +──> Existing Warehouse ──> Results A
    |
    +──> HatiData ──> Results B
    |
    v
Comparison Engine
    |
    +── Match: Results identical (or within tolerance)
    +── Mismatch: Differences logged with full context
    |
    v
Return Results A to Agent (existing warehouse is still primary)

The critical property: agents always receive results from the existing warehouse. HatiData's results are used purely for comparison. This means Shadow Mode never changes what your production workloads receive. If HatiData produces different results, the agent never sees them. The differences are logged for analysis, but the agent continues operating normally on the existing system.
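The flow above can be sketched in a few lines of Python. This is a minimal illustration, not HatiData's implementation: `ShadowProxy`, `execute`, and `drain` are hypothetical names, the two backends are plain callables standing in for real database connections, and exact equality stands in for the tolerance-aware comparison described later.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Rows = List[tuple]


class ShadowProxy:
    """Hypothetical proxy: serves the primary's rows and compares the shadow's off the request path."""

    def __init__(self, primary: Callable[[str], Rows], shadow: Callable[[str], Rows]) -> None:
        self.primary = primary
        self.shadow = shadow
        self.mismatches: List[dict] = []
        self._pool = ThreadPoolExecutor(max_workers=4)

    def execute(self, sql: str) -> Rows:
        rows_a = self.primary(sql)  # the agent only ever sees these rows
        self._pool.submit(self._compare, sql, rows_a)  # shadow work does not block the response
        return rows_a

    def _compare(self, sql: str, rows_a: Rows) -> None:
        try:
            rows_b = self.shadow(sql)
        except Exception as exc:  # a shadow failure is logged, never raised to the agent
            self.mismatches.append({"sql": sql, "category": "error", "detail": repr(exc)})
            return
        if rows_a != rows_b:  # placeholder for a tolerance-aware comparison
            self.mismatches.append(
                {"sql": sql, "category": "value", "primary": rows_a, "shadow": rows_b}
            )

    def drain(self) -> None:
        """Wait for pending comparisons to finish (the proxy accepts no queries afterwards)."""
        self._pool.shutdown(wait=True)
```

Because the shadow call runs on a worker thread and its exceptions are caught, a slow or failing shadow system cannot change or delay what the agent receives in this sketch.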

Setting Up Shadow Mode

Shadow Mode requires three things:

1. Your existing warehouse connection: The connection string for your current database, with read-only credentials
2. A HatiData instance: Either a local installation or a cloud deployment
3. Data sync: Your data replicated to HatiData (via the CLI sync tools or direct import)

The proxy is configured with a simple YAML file:

yaml

shadow_mode:
  enabled: true
  primary: existing_warehouse
  shadow: hatidata
  comparison:
    float_tolerance: 0.0001
    null_equality: true
    order_sensitive: false
    max_rows_to_compare: 10000

existing_warehouse:
  type: postgres
  host: your-warehouse.example.com
  port: 5432
  database: production
  user: readonly_user

hatidata:
  host: localhost
  port: 5439
  user: admin

The Comparison Engine

Not all result differences are meaningful. Shadow Mode's comparison engine understands the common categories of harmless differences and filters them from the mismatch reports.

Floating Point Tolerance

Different databases compute floating point arithmetic with slightly different precision. A SUM operation might return 1234567.890001 from one system and 1234567.890002 from another. Shadow Mode applies a configurable tolerance (default 0.0001) to floating point comparisons, marking results as matching if they are within tolerance.
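A tolerance check of this kind might look like the sketch below. It assumes the tolerance is absolute rather than relative, which the configuration does not specify, and `values_match` is a hypothetical helper name.

```python
import math
from numbers import Real


def values_match(a, b, float_tolerance=0.0001):
    """Compare two scalars, allowing an absolute tolerance when a float is involved."""
    both_numeric = isinstance(a, Real) and isinstance(b, Real)
    if both_numeric and (isinstance(a, float) or isinstance(b, float)):
        # Assumption: the tolerance is absolute, so rel_tol is disabled.
        return math.isclose(a, b, rel_tol=0.0, abs_tol=float_tolerance)
    return a == b  # non-float values must match exactly
```

With the default tolerance, the two SUM results above (which differ by about 0.000001) count as a match, while 1.0 and 1.001 do not.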

Row Ordering

When a query does not include an ORDER BY clause, the row order is undefined. Different databases return rows in different orders based on internal storage layout and query execution strategy. Shadow Mode compares result sets without regard to order by default, only checking order when the query explicitly includes ORDER BY.
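One way to compare without regard to order is to treat each result set as a multiset of rows, so duplicates still have to appear the same number of times. The sketch below is illustrative: `rows_match` and `has_order_by` are hypothetical names, and the ORDER BY detection is a deliberately naive text search that would be fooled by subqueries or string literals.

```python
import re
from collections import Counter


def rows_match(rows_a, rows_b, order_sensitive):
    """Compare two result sets, optionally ignoring row order."""
    a = [tuple(r) for r in rows_a]
    b = [tuple(r) for r in rows_b]
    if order_sensitive:
        return a == b
    # Multiset comparison: same rows with the same multiplicities, in any order.
    return Counter(a) == Counter(b)


def has_order_by(sql):
    """Naive heuristic (assumption): does the query text contain ORDER BY anywhere?"""
    return re.search(r"\border\s+by\b", sql, re.IGNORECASE) is not None
```

A caller could then pass `order_sensitive=has_order_by(sql)` to check ordering only for queries that request it.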

Null Handling

Different SQL engines handle NULL comparisons differently in some edge cases. Shadow Mode provides a null_equality flag that controls whether two NULL values are considered equal in comparisons.
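The effect of the flag can be shown with a small cell-level comparison. This is a sketch under the assumption that NULL arrives in Python as `None`; `cells_equal` is a hypothetical name.

```python
def cells_equal(a, b, null_equality=True):
    """Compare two cells; NULL (None) equals NULL only when null_equality is set."""
    if a is None or b is None:
        return null_equality and a is None and b is None
    return a == b
```

With `null_equality=False`, two NULLs in the same position are reported as a difference, mirroring SQL's rule that `NULL = NULL` is not true.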

Type Coercion

A column might be returned as INTEGER from one system and BIGINT from another, or as FLOAT32 vs FLOAT64. Shadow Mode compares values after normalizing to common types, so type differences that do not affect value accuracy are not flagged as mismatches.
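One possible way to normalize is to map each reported column type to a broader family before comparing schemas. The mapping below is an illustrative assumption, not HatiData's actual type table, and the function names are hypothetical.

```python
# Hypothetical mapping from concrete column types to comparison families.
TYPE_FAMILIES = {
    "SMALLINT": "integer",
    "INTEGER": "integer",
    "BIGINT": "integer",
    "FLOAT32": "float",
    "FLOAT64": "float",
}


def type_family(type_name):
    """Return the family for a type name; unknown types only match themselves."""
    upper = type_name.upper()
    return TYPE_FAMILIES.get(upper, upper)


def column_types_compatible(types_a, types_b):
    """True when both results have the same number of columns and matching families."""
    return len(types_a) == len(types_b) and all(
        type_family(a) == type_family(b) for a, b in zip(types_a, types_b)
    )
```

Under this mapping, an INTEGER column on one side and a BIGINT column on the other are treated as compatible, while INTEGER against VARCHAR is still flagged.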

The Migration Dashboard

Shadow Mode includes a dedicated dashboard view that tracks comparison results over time:

Match rate: Percentage of queries whose results are identical or within the configured tolerance (the target is 100%)
Mismatch categories — Breakdown of differences by type (value, type, row count, column count, error)
Latency comparison — Side-by-side latency for each query on both systems
Trending — Match rate over time, showing whether the rate is improving as dialect issues are resolved
Mismatch drill-down — For each mismatched query, the exact SQL, both result sets, and highlighted differences

The dashboard makes it easy to identify which queries need attention. Typically, the first week of Shadow Mode reveals a small number of SQL dialect issues that need transpiler fixes. Once those are resolved, the match rate approaches 100% and stays there.
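The two headline numbers, match rate and mismatch breakdown, are simple aggregates over the comparison log. The sketch below assumes a hypothetical record shape in which each comparison is a dict whose `category` is `None` for a match or one of the mismatch types listed above.

```python
from collections import Counter


def summarize(records):
    """Return (match rate in percent, mismatch counts by category) for comparison records."""
    total = len(records)
    if total == 0:
        return None, Counter()  # no comparisons yet, so the rate is undefined
    matches = sum(1 for r in records if r["category"] is None)
    breakdown = Counter(r["category"] for r in records if r["category"] is not None)
    return 100.0 * matches / total, breakdown
```

Grouping the same records by day instead of over the whole log would give the trend line.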

Migration Phases

Phase 1: Shadow (Week 1-2)

Enable Shadow Mode with your existing warehouse as primary. All agents continue using the existing warehouse. HatiData runs every query in parallel and results are compared. Fix any dialect or transpilation issues that cause mismatches.

Phase 2: Validation (Week 2-3)

Once the match rate has held at 100% for at least 7 consecutive days, you have strong evidence that HatiData reproduces your existing warehouse's results for the queries your workload actually runs. Review the latency comparison to ensure HatiData meets your performance requirements.
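The 7-day criterion can be expressed as a small gate over daily comparison counts. This is a sketch: `ready_to_flip` is a hypothetical helper, and the input shape (one `(matched, total)` pair per day, oldest first) is an assumption.

```python
def ready_to_flip(daily_counts, required_days=7):
    """True when each of the last `required_days` days had comparisons and all of them matched."""
    recent = daily_counts[-required_days:]
    if len(recent) < required_days:
        return False  # not enough history yet
    # Using counts instead of percentages avoids rounding a 99.99% day up to 100%.
    return all(total > 0 and matched == total for matched, total in recent)
```

A day with no comparisons counts against readiness here, since an idle day proves nothing about result equivalence.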

Phase 3: Flip (Day 1 of migration)

Switch the primary to HatiData. Agents now receive results from HatiData, and the existing warehouse becomes the shadow. This is a single configuration change — no agent code modifications needed.

Phase 4: Verify (Week 1 post-flip)

Run with HatiData as primary and your existing warehouse as shadow for at least one week. Confirm that all agents continue operating correctly. The shadow comparison provides an automatic safety net — if any query produces different results, you have immediate visibility.

Phase 5: Decommission

Once you are confident in the migration, disable Shadow Mode and decommission the existing warehouse connection. The migration is complete.

Zero Downtime Guarantees

At no point during the Shadow Mode migration does any agent experience downtime. The proxy handles the primary/shadow flip transparently:

No connection string changes for agents
No query syntax changes
No schema modifications
No application code changes
No deployment coordination

The agents are unaware that a migration is happening. They send the same queries to the same endpoint and receive the same results. The only change is which system is providing those results.

Performance Impact

Shadow Mode does add overhead: every query runs twice, and the comparison engine processes both result sets. However, this overhead falls on network and compute, not on the responses agents receive:

Latency — Agents experience the latency of the primary system only. Shadow execution is asynchronous and does not block the response.
Network — Additional bandwidth for the shadow queries. For most workloads, this is negligible.
Compute — The shadow system (HatiData) consumes compute resources for its parallel execution. This is the cost of validation.

For cost-sensitive environments, Shadow Mode supports sampling: only a configurable percentage of queries (e.g., 10%) is executed on both systems. This reduces the compute cost while still providing statistical confidence in result equivalence, although rarely run queries take longer to be covered.
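A sampling decision can be as simple as one random draw per query. The sketch below is an assumption about how such a gate could work, not HatiData's implementation; `should_shadow` and `sample_rate` are hypothetical names.

```python
import random


def should_shadow(sample_rate, rng=random):
    """Decide whether to run this query on the shadow system as well.

    sample_rate is a fraction between 0.0 (never) and 1.0 (always).
    """
    return rng.random() < sample_rate
```

Passing a seeded `random.Random` instance as `rng` makes the decision sequence reproducible in tests. Random sampling treats all queries alike, so a query that runs once a month may go unsampled for a long time.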

Next Steps

Shadow Mode is available on all HatiData tiers, including the free local installation. To start a Shadow Mode evaluation, install HatiData locally, sync your data with the CLI, and configure the shadow proxy. The entire setup takes under an hour for most workloads.