dbt + HatiData: dbt Adapter Transforms
Use the HatiData dbt adapter to run SQL-based data transformations in the agent data warehouse. Create models, run tests, and build data pipelines.
What You'll Build
A dbt project using the HatiData adapter for SQL-based data transformations in the agent data warehouse.
Prerequisites
$ pip install dbt-core dbt-hatidata
$ hati init
Architecture
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  dbt Models  │───▶│   HatiData   │───▶│    Engine    │
│    (SQL)     │     │   Adapter    │     │  (execute)   │
└──────────────┘     └──────────────┘     └──────────────┘

models/ → compiled SQL → proxy (:5439) → materialized tables
Key Concepts
- Postgres wire protocol: HatiData's proxy speaks Postgres, so any Postgres-compatible dbt adapter works out of the box
- Model layering: staging models clean raw data, marts models aggregate for analytics (standard dbt best practice)
- Fast execution: SQL transformations run on HatiData's query engine for fast analytical processing
- Data testing: dbt tests validate data quality after every transformation run
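The staging-to-marts layering is what dbt's dependency graph enforces at run time: a model can only build after everything it references. A minimal sketch of that ordering logic, using Python's stdlib graphlib and the two model names from this guide (the dependency map is illustrative, not read from a real dbt project):

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models it ref()'s.
# Mirrors how dbt orders runs: staging before marts.
deps = {
    "stg_agent_events": set(),                   # reads raw.agent_events directly
    "agent_daily_summary": {"stg_agent_events"}, # {{ ref('stg_agent_events') }}
}

run_order = list(TopologicalSorter(deps).static_order())
print(run_order)  # ['stg_agent_events', 'agent_daily_summary']
```

dbt computes the same ordering from the `{{ ref() }}` calls in your SQL, so you never declare it by hand.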
Step-by-Step Implementation
Install the dbt Adapter
Install dbt-core and the HatiData adapter.
pip install dbt-core dbt-hatidata
hati init
dbt --version

Core:
  - installed: 1.7.0
Plugins:
  - hatidata: 1.0.0

Note: The dbt-hatidata adapter connects to HatiData's Postgres-compatible proxy on port 5439.
Configure profiles.yml
Set up the dbt connection profile for HatiData.
# ~/.dbt/profiles.yml
hatidata_project:
  target: dev
  outputs:
    dev:
      type: hatidata
      host: localhost
      port: 5439
      user: admin
      password: ""
      dbname: main
      schema: analytics
      threads: 4

# Save to ~/.dbt/profiles.yml

Note: HatiData's proxy speaks the Postgres wire protocol, so the dbt adapter uses standard Postgres connection settings.
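Because the proxy speaks plain Postgres, the same profile values can be assembled into a standard libpq-style connection URL for use with any Postgres client. A sketch under that assumption (`build_dsn` is a hypothetical helper, not part of the adapter):

```python
def build_dsn(profile: dict) -> str:
    """Assemble a Postgres connection URL from a dbt profile target.

    Any Postgres client (psql, psycopg2, etc.) could use this URL to
    reach the HatiData proxy directly.
    """
    auth = profile["user"]
    if profile.get("password"):          # omit the colon for empty passwords
        auth += f":{profile['password']}"
    return f"postgresql://{auth}@{profile['host']}:{profile['port']}/{profile['dbname']}"

dev_target = {
    "host": "localhost", "port": 5439, "user": "admin",
    "password": "", "dbname": "main",
}
print(build_dsn(dev_target))  # postgresql://admin@localhost:5439/main
```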
Create dbt Models
Write SQL models that transform raw agent data into analytics-ready tables.
-- models/staging/stg_agent_events.sql
WITH raw_events AS (
    SELECT
        agent_id,
        event_type,
        payload,
        created_at
    FROM raw.agent_events
    WHERE created_at >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT
    agent_id,
    event_type,
    json_extract_string(payload, '$.duration_ms')::DOUBLE AS duration_ms,
    json_extract_string(payload, '$.token_count')::INTEGER AS token_count,
    json_extract_string(payload, '$.model')::VARCHAR AS model_name,
    created_at
FROM raw_events
-- models/marts/agent_daily_summary.sql
SELECT
    agent_id,
    DATE_TRUNC('day', created_at) AS date,
    COUNT(*) AS total_events,
    AVG(duration_ms) AS avg_duration_ms,
    SUM(token_count) AS total_tokens,
    COUNT(CASE WHEN event_type = 'error' THEN 1 END) AS error_count,
    1.0 - (COUNT(CASE WHEN event_type = 'error' THEN 1 END)::DOUBLE / COUNT(*)) AS success_rate
FROM {{ ref('stg_agent_events') }}
GROUP BY agent_id, DATE_TRUNC('day', created_at)

-- Models created in models/staging/ and models/marts/

Note: dbt models are pure SQL with Jinja templating. The {{ ref() }} function handles dependency ordering automatically.
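To make the success_rate arithmetic concrete, here is the daily-summary aggregation replayed over a handful of in-memory events in Python. The sample rows are invented for illustration; the math mirrors the SQL model:

```python
# Sample events for one agent on one day (invented data).
events = [
    {"event_type": "completion", "duration_ms": 420.0, "token_count": 512},
    {"event_type": "completion", "duration_ms": 380.0, "token_count": 256},
    {"event_type": "error",      "duration_ms": 55.0,  "token_count": 0},
    {"event_type": "completion", "duration_ms": 610.0, "token_count": 1024},
]

total_events = len(events)
error_count = sum(1 for e in events if e["event_type"] == "error")
summary = {
    "total_events": total_events,
    "avg_duration_ms": sum(e["duration_ms"] for e in events) / total_events,
    "total_tokens": sum(e["token_count"] for e in events),
    "error_count": error_count,
    # Same formula as the SQL: 1.0 - error_count / total_events
    "success_rate": 1.0 - error_count / total_events,
}
print(summary["success_rate"])  # 0.75 (1 error out of 4 events)
```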
Run Transformations
Execute dbt run to build all models in the correct dependency order.
dbt run --profiles-dir ~/.dbt --project-dir .

# Rebuild a single model
dbt run --select agent_daily_summary

Running with dbt=1.7.0
Found 2 models, 0 tests, 0 snapshots, 0 seeds
Concurrency: 4 threads (target='dev')
1 of 2 START sql table model analytics.stg_agent_events ................ [RUN]
1 of 2 OK created sql table model analytics.stg_agent_events .......... [OK in 0.82s]
2 of 2 START sql table model analytics.agent_daily_summary ............ [RUN]
2 of 2 OK created sql table model analytics.agent_daily_summary ....... [OK in 0.45s]
Completed successfully
Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2

Add dbt Tests
Write data quality tests to validate the transformed data.
# models/marts/schema.yml
version: 2
models:
  - name: agent_daily_summary
    description: "Daily aggregated agent performance metrics"
    columns:
      - name: agent_id
        tests:
          - not_null
      - name: date
        tests:
          - not_null
      - name: success_rate
        tests:
          - not_null
          - accepted_values:
              values: [0.0, 1.0]
              quote: false
              config:
                where: "success_rate < 0 OR success_rate > 1"
                severity: error

# Save to models/marts/schema.yml

Note: dbt tests compile to SQL queries that assert conditions; failed tests produce warnings or errors based on their severity. The where config above filters to rows outside the 0 to 1 range first, so any surviving row fails accepted_values. This expresses a range check with core dbt alone, without extra packages.
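A sketch of what that filtered test evaluates, emulated in Python over invented rows (dbt generates equivalent SQL against the materialized table; the row data here is purely illustrative):

```python
def failing_rows(rows, accepted, where):
    """Emulate dbt's accepted_values test with a where filter:
    keep only rows matching the where predicate, then flag any whose
    value is not in the accepted list. Zero failures means the test passes."""
    candidates = [r for r in rows if where(r)]
    return [r for r in candidates if r["success_rate"] not in accepted]

rows = [
    {"agent_id": "a1", "success_rate": 0.97},
    {"agent_id": "a2", "success_rate": 1.0},
    {"agent_id": "a3", "success_rate": 0.0},
]
out_of_range = lambda r: r["success_rate"] < 0 or r["success_rate"] > 1

# No rows fall outside [0, 1], so nothing survives the filter: test passes.
print(len(failing_rows(rows, [0.0, 1.0], out_of_range)))  # 0
```

A row with success_rate of, say, 1.5 would survive the filter, fail the membership check, and fail the test at error severity.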
Run Tests and Generate Docs
Execute dbt tests and generate documentation.
# Run all tests
dbt test --profiles-dir ~/.dbt --project-dir .
# Generate and serve docs
dbt docs generate
dbt docs serve --port 8081

Running with dbt=1.7.0
Found 2 models, 3 tests, 0 snapshots, 0 seeds
1 of 3 START test not_null_agent_daily_summary_agent_id ............... [PASS in 0.12s]
2 of 3 START test not_null_agent_daily_summary_date ................... [PASS in 0.11s]
3 of 3 START test accepted_values_agent_daily_summary_success_rate .... [PASS in 0.15s]
Completed successfully
Done. PASS=3 WARN=0 ERROR=0 SKIP=0 TOTAL=3
Catalog written to target/catalog.json
Serving docs at http://localhost:8081

Note: dbt docs generate creates a browsable data catalog. It's great for onboarding new team members to the data model.
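The generated catalog lives in target/catalog.json and can be read programmatically. A minimal sketch, assuming the usual shape of dbt's catalog artifact (a "nodes" map keyed by unique id, each with "metadata" and "columns"); the embedded JSON below is a hand-written stand-in, not real output:

```python
import json

# Stand-in for dbt's target/catalog.json, trimmed to the fields used here.
catalog = json.loads("""
{
  "nodes": {
    "model.hatidata_project.agent_daily_summary": {
      "metadata": {"schema": "analytics", "name": "agent_daily_summary"},
      "columns": {
        "agent_id":     {"type": "VARCHAR"},
        "success_rate": {"type": "DOUBLE"}
      }
    }
  }
}
""")

# List every cataloged table and its columns.
for node in catalog["nodes"].values():
    meta = node["metadata"]
    cols = ", ".join(node["columns"])
    print(f"{meta['schema']}.{meta['name']}: {cols}")
# analytics.agent_daily_summary: agent_id, success_rate
```

This is handy for wiring the catalog into other tooling, such as a data-discovery page or a CI check that no documented column has disappeared.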