Privacy & data architecture

Driftbase is designed so that no raw user content or PII is stored. This doc explains what is captured, what is sent (if anything), and how the pipeline works.

What we do not store

Raw user inputs or prompts
Model outputs or responses
Conversation text or logs

Because only one-way hashes and structural signals are retained, the system cannot reconstruct what users said or what the agent replied.

What we capture (and how)

At capture time, inputs and outputs are reduced to one-way hashes (e.g. SHA-256), and the raw text is not kept. We also extract:

Tool names and call counts
Latency and timing
Outcome flags (e.g. escalation, error)

Only these structural and behavioral signals are used to build fingerprints. In self-hosted mode, they never leave your VPC; in cloud mode, only these non-content signals are sent to our backend.
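As an illustration of the capture step described above, here is a minimal sketch in Python. It is not Driftbase's actual code: the names RunSignals, sha256_hex and capture_run, and the field layout, are assumptions made for this example. It only shows the idea that raw text is hashed and then dropped, while tool names, call counts, latency and outcome flags are kept.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSignals:
    """Hypothetical record of what survives capture: hashes and structural signals."""
    input_hash: str
    output_hash: str
    tool_counts: dict
    latency_ms: float
    flags: tuple


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def capture_run(user_input, model_output, tool_calls, latency_ms, flags=()):
    """Hash the content and keep only non-content signals (assumed schema).

    The raw user_input and model_output strings are not stored on the result.
    """
    return RunSignals(
        input_hash=sha256_hex(user_input),
        output_hash=sha256_hex(model_output),
        tool_counts=dict(Counter(tool_calls)),  # tool name -> number of calls
        latency_ms=float(latency_ms),
        flags=tuple(sorted(flags)),  # e.g. ("error", "escalation")
    )
```

Calling capture_run("hello", "world", ["search", "search", "lookup"], 120, ["escalation"]) would yield a record whose tool_counts is {"search": 2, "lookup": 1} and whose hashes are 64-character hex digests; neither input string appears anywhere in the record.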

Pipeline (high level)

Capture (hash content, extract signals) → Fingerprint (aggregate over N runs) → Diff (compare two fingerprints) → Alert / audit. No step stores or transmits raw content. For a visual diagram, see the PDF one-pager linked from the GDPR overview.
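The Fingerprint and Diff stages can be sketched roughly as follows. This is a hedged illustration only: the functions fingerprint and diff, the per-run dictionary keys (tool_counts, latency_ms, flags), and the choice of simple per-run averages are all assumptions for this example, not the aggregation Driftbase actually uses.

```python
from collections import Counter
from statistics import mean


def fingerprint(runs):
    """Aggregate per-run signal dicts into one summary (hypothetical schema)."""
    n = len(runs)
    if n == 0:
        raise ValueError("need at least one run")
    tool_totals = Counter()
    flag_totals = Counter()
    for run in runs:
        tool_totals.update(run["tool_counts"])  # adds per-tool call counts
        flag_totals.update(run["flags"])        # counts each flag once per run
    return {
        "runs": n,
        "mean_latency_ms": mean(run["latency_ms"] for run in runs),
        "tool_calls_per_run": {t: c / n for t, c in tool_totals.items()},
        "flag_rate": {f: c / n for f, c in flag_totals.items()},
    }


def diff(baseline, candidate):
    """Return candidate-minus-baseline deltas for every signal in either fingerprint."""
    out = {
        "mean_latency_ms": candidate["mean_latency_ms"] - baseline["mean_latency_ms"]
    }
    for key in ("tool_calls_per_run", "flag_rate"):
        names = set(baseline[key]) | set(candidate[key])
        out[key] = {
            name: candidate[key].get(name, 0.0) - baseline[key].get(name, 0.0)
            for name in sorted(names)
        }
    return out
```

Both functions operate only on the non-content signals, so nothing in a fingerprint or a diff depends on the raw text. An alerting step could then, for example, compare the deltas against thresholds chosen by the operator.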