Vertex AI + HatiData: Enterprise Agent Deploy
Deploy an enterprise-grade Vertex AI agent with HatiData persistent memory on Google Cloud Run. Production-ready with IAM, VPC, and monitoring.
What You'll Build
An enterprise-grade Vertex AI agent with HatiData persistent memory deployed on Google Cloud Run.
Prerequisites
- GCP account with billing enabled
- gcloud CLI installed
- hati CLI installed
- Python 3.10+
Architecture
┌──────────────┐     ┌──────────────┐     ┌───────────────┐
│  Vertex AI   │────▶│  Cloud Run   │────▶│   Cloud SQL   │
│    Agent     │     │  (HatiData)  │     │  (Postgres)   │
└──────────────┘     └──────┬───────┘     └───────────────┘
                            │
                     ┌──────▼─────┐
                     │  Vectors   │
                     │   (GKE)    │
                     └────────────┘

Key Concepts
- Cloud Run deployment: HatiData runs as a serverless container that auto-scales from 1 to N instances based on traffic
- Vertex AI integration: Gemini models access HatiData memory for context-aware enterprise responses
- CoT audit trail: every agent decision is logged to the tamper-proof ledger for compliance and debugging
- GCP-native security: IAM, VPC, Secret Manager, and Cloud Armor integrate with HatiData's own RBAC
- Production-ready: SSL, auto-scaling, monitoring, and audit logging are all enabled by default
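The "tamper-proof ledger" mentioned above is a hash chain: each record commits to the hash of the record before it, so editing any past entry invalidates everything after it. The following is an illustrative sketch of that idea only, not HatiData's actual ledger format:

```python
import hashlib
import json

def append_step(ledger: list, entry: str) -> None:
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    payload = json.dumps({"entry": entry, "prev": prev_hash}, sort_keys=True)
    ledger.append({
        "entry": entry,
        "prev": prev_hash,
        "hash": hashlib.sha256(payload.encode()).hexdigest(),
    })

def verify(ledger: list) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = "0" * 64
    for rec in ledger:
        payload = json.dumps({"entry": rec["entry"], "prev": prev_hash},
                             sort_keys=True)
        recomputed = hashlib.sha256(payload.encode()).hexdigest()
        if rec["prev"] != prev_hash or rec["hash"] != recomputed:
            return False
        prev_hash = rec["hash"]
    return True

ledger = []
append_step(ledger, "retrieved 2 memories")
append_step(ledger, "generated answer")
print(verify(ledger))          # True
ledger[0]["entry"] = "edited"  # tamper with history
print(verify(ledger))          # False
```

This is why the audit trail queried later in this guide can serve compliance purposes: an auditor can re-verify the chain rather than trust the log contents.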
Step-by-Step Implementation
Set Up GCP Project
Configure your GCP project and enable required APIs.
# Set project and region
export PROJECT_ID="your-project-id"
export REGION="us-central1"
gcloud config set project $PROJECT_ID
gcloud config set run/region $REGION
# Enable required APIs
gcloud services enable \
run.googleapis.com \
sqladmin.googleapis.com \
aiplatform.googleapis.com \
secretmanager.googleapis.com \
artifactregistry.googleapis.com
echo "APIs enabled for project: $PROJECT_ID"

Updated property [core/project].
Updated property [run/region].
Operation "operations/..." finished successfully.
APIs enabled for project: your-project-id

Note: Cloud Run, Cloud SQL, Vertex AI, Secret Manager, and Artifact Registry are all needed for the full stack.
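Before moving on, you can confirm each service actually shows up as enabled. A quick check, assuming the project from the previous step is still active:

```shell
# Print each required API that is enabled on the current project;
# a missing line means the corresponding enable call did not take effect.
for api in run.googleapis.com sqladmin.googleapis.com aiplatform.googleapis.com \
           secretmanager.googleapis.com artifactregistry.googleapis.com; do
  gcloud services list --enabled \
    --filter="config.name=${api}" \
    --format="value(config.name)"
done
```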
Deploy HatiData on Cloud Run
Deploy the HatiData proxy and control plane to Cloud Run.
# Clone and build the HatiData container
git clone https://github.com/HatiOS-AI/HatiData.git
cd HatiData
# Build and push to Artifact Registry
gcloud artifacts repositories create hatidata \
--repository-format=docker \
--location=$REGION
# gcloud builds submit has no -f flag; copy the proxy Dockerfile to the
# build root (or use a cloudbuild.yaml config) before submitting
cp deploy/Dockerfile.proxy Dockerfile
gcloud builds submit \
--tag $REGION-docker.pkg.dev/$PROJECT_ID/hatidata/proxy:latest .
# Deploy to Cloud Run
gcloud run deploy hatidata \
--image $REGION-docker.pkg.dev/$PROJECT_ID/hatidata/proxy:latest \
--port 5439 \
--memory 2Gi \
--cpu 2 \
--min-instances 1 \
--max-instances 10 \
--set-env-vars "HATIDATA_DEV_MODE=false,CLOUD_PROVIDER=gcp" \
--no-allow-unauthenticated
SERVICE_URL=$(gcloud run services describe hatidata --format='value(status.url)')
echo "HatiData deployed at: $SERVICE_URL"

Creating repository hatidata...
Building image...
Deploying container to Cloud Run...
Service [hatidata] revision [hatidata-00001] has been deployed.
HatiData deployed at: https://hatidata-xxxxx-uc.a.run.app

Note: Cloud Run can scale to zero when idle; setting --min-instances 1 keeps the proxy warm for low-latency responses.
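Because unauthenticated access is disabled, a plain curl against the service URL will get a 403. A smoke test needs an identity token for the calling account. The /health path below is an assumption; substitute whatever endpoint your proxy build actually exposes:

```shell
# Authenticated smoke test against the new Cloud Run service.
# NOTE: /health is a placeholder path, not a documented HatiData endpoint.
TOKEN=$(gcloud auth print-identity-token)
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer ${TOKEN}" \
  "${SERVICE_URL}/health"
```

A 200 (or anything other than 403) confirms IAM is wired up and the container is serving.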
Configure Vertex AI Agent
Create a Vertex AI agent that connects to HatiData for persistent memory.
from google.cloud import aiplatform
from hatidata_agent import HatiDataAgent
import os
# Initialize Vertex AI
aiplatform.init(
    project=os.environ["PROJECT_ID"],
    location=os.environ["REGION"],
)

# Connect to HatiData on Cloud Run
hati = HatiDataAgent(
    host=os.environ["HATIDATA_URL"],
    port=443,
    agent_id="vertex-enterprise-agent",
    use_ssl=True,
)
# Store enterprise context
hati.execute("""
SELECT store_memory(
'Enterprise SLA: 99.9% uptime, 15-min response for P1 incidents, 4-hour resolution target',
'enterprise-config'
)
""")
hati.execute("""
SELECT store_memory(
'Compliance requirements: SOC 2 Type II, GDPR, HIPAA BAA required for healthcare customers',
'enterprise-config'
)
""")
print("Vertex AI agent configured with HatiData memory")
print(f" Project: {os.environ['PROJECT_ID']}")
print(f" HatiData: {os.environ['HATIDATA_URL']}")

Vertex AI agent configured with HatiData memory
 Project: your-project-id
 HatiData: https://hatidata-xxxxx-uc.a.run.app

Connect Memory to Agent Conversations
Build a Vertex AI agent that uses HatiData memory for context-aware responses.
from vertexai.generative_models import GenerativeModel
model = GenerativeModel("gemini-1.5-pro")
def _q(s: str) -> str:
    # Escape single quotes so user text can't break out of a SQL string literal
    return s.replace("'", "''")

def enterprise_agent(query: str) -> str:
    # Retrieve relevant context from HatiData
    memories = hati.query(f"""
        SELECT content
        FROM _hatidata_memory.memories
        WHERE namespace = 'enterprise-config'
          AND semantic_match(embedding, '{_q(query)}', 0.6)
        ORDER BY semantic_rank(embedding, '{_q(query)}') DESC
        LIMIT 5
    """)
    context = "\n".join(m["content"] for m in memories)
    response = model.generate_content(
        f"Enterprise context:\n{context}\n\nQuestion: {query}\nProvide a detailed enterprise-grade answer."
    )
    # Log reasoning to the CoT ledger
    hati.execute(f"""
        SELECT log_reasoning_step(
            'enterprise-session',
            'conclusion',
            'Query: {_q(query[:100])} | Memories used: {len(memories)}'
        )
    """)
    return response.text
# Test the agent
answer = enterprise_agent("What are our SLA requirements?")
print(answer)

Our enterprise SLA commitments include:
1. 99.9% uptime guarantee across all services
2. 15-minute initial response time for P1 (critical) incidents
3. 4-hour resolution target for P1 incidents
For healthcare customers specifically, we also maintain HIPAA BAA compliance alongside SOC 2 Type II and GDPR certifications.

Test Enterprise Features
Verify monitoring, audit logs, and production readiness.
# Check CoT audit trail
trace = hati.query("""
SELECT session_id, COUNT(*) AS steps,
MIN(created_at) AS started,
MAX(created_at) AS latest
FROM _hatidata_cot.traces
GROUP BY session_id
ORDER BY latest DESC
LIMIT 5
""")
print("=== Enterprise Audit Trail ===")
for t in trace:
    print(f" Session: {t['session_id']}")
    print(f" Steps: {t['steps']}, Started: {t['started']}")
# Check memory usage
stats = hati.query("""
SELECT namespace, COUNT(*) AS memories,
SUM(LENGTH(content)) AS total_bytes
FROM _hatidata_memory.memories
GROUP BY namespace
ORDER BY memories DESC
""")
print("\n=== Memory Usage ===")
for s in stats:
    print(f" {s['namespace']}: {s['memories']} memories ({s['total_bytes']} bytes)")
print("\n=== Production Checklist ===")
print(" Cloud Run: auto-scaling 1-10 instances")
print(" SSL/TLS: enabled (Cloud Run default)")
print(" IAM: service account with least privilege")
print(" Audit: CoT ledger with cryptographic hash chain")
print(" Monitoring: Cloud Logging + Cloud Trace enabled")

=== Enterprise Audit Trail ===
 Session: enterprise-session
 Steps: 1, Started: 2026-02-28 16:00:00

=== Memory Usage ===
 enterprise-config: 2 memories (178 bytes)

=== Production Checklist ===
 Cloud Run: auto-scaling 1-10 instances
 SSL/TLS: enabled (Cloud Run default)
 IAM: service account with least privilege
 Audit: CoT ledger with cryptographic hash chain
 Monitoring: Cloud Logging + Cloud Trace enabled

Note: In production, add Cloud Armor WAF, a VPC connector for private networking, and Cloud Monitoring alerts.