Privacy & data architecture
Driftbase is designed so that no raw user content or PII is stored. This doc explains what is captured, what is sent (if anything), and how the pipeline works.
What we do not store
- Raw user inputs or prompts
- Model outputs or responses
- Conversation text or logs
By design, the system cannot reconstruct what users said or what the agent replied.
What we capture (and how)
At capture time, inputs and outputs are passed through a one-way hash (e.g. SHA-256), and only the digest is kept. We also extract:
- Tool names and call counts
- Latency and timing
- Outcome flags (e.g. escalation, error)
Only these structural and behavioral signals are used to build fingerprints. In self-hosted mode, they never leave your VPC; in cloud mode, only these non-content signals are sent to our backend.
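As a rough illustration of the capture step described above, the sketch below hashes the input and output with SHA-256 and keeps only tool counts, latency, and outcome flags. The `RunSignals` record, the `capture` function, and all field names are hypothetical and chosen for this example; they are not Driftbase's actual API or schema.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RunSignals:
    """Non-content signals kept for one agent run (hypothetical schema)."""
    input_hash: str
    output_hash: str
    tool_counts: Dict[str, int]
    latency_ms: float
    flags: Dict[str, bool] = field(default_factory=dict)


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def capture(user_input: str, model_output: str, tool_calls: List[str],
            latency_ms: float, escalated: bool = False,
            errored: bool = False) -> RunSignals:
    """Reduce one run to hashes and structural signals.

    The raw strings are only used to compute digests; they are not
    copied into the returned record.
    """
    return RunSignals(
        input_hash=sha256_hex(user_input),
        output_hash=sha256_hex(model_output),
        tool_counts=dict(Counter(tool_calls)),  # tool name -> call count
        latency_ms=latency_ms,
        flags={"escalation": escalated, "error": errored},
    )


signals = capture(
    user_input="Where is my order?",
    model_output="Let me check that for you.",
    tool_calls=["order_lookup", "order_lookup", "send_email"],
    latency_ms=840.0,
)
print(signals.tool_counts)
```

In this sketch the record that leaves the capture function contains digests and counters only, which is the property the section relies on: anything downstream works from `RunSignals`, never from the original text.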
Pipeline (high level)
Capture → hash content, extract signals → Fingerprint (aggregate over N runs) → Diff (compare two fingerprints) → Alert / audit.
No step stores or transmits raw content. For a visual diagram, see the PDF one-pager linked from the GDPR overview.
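The fingerprint and diff stages can be pictured with a minimal sketch like the one below. It assumes each run has already been reduced to a plain dict of non-content signals; the dict keys, the choice of mean-based aggregates, and the function names `fingerprint`, `flatten`, and `diff` are all assumptions made for illustration, not the statistics or interfaces Driftbase actually uses.

```python
from statistics import mean


def fingerprint(runs):
    """Aggregate N per-run signal dicts into one fingerprint (hypothetical shape)."""
    n = len(runs)
    tools = sorted({tool for run in runs for tool in run["tool_counts"]})
    return {
        "runs": n,
        "mean_latency_ms": mean(run["latency_ms"] for run in runs),
        "error_rate": sum(run["error"] for run in runs) / n,
        "escalation_rate": sum(run["escalation"] for run in runs) / n,
        # Average number of calls per run for each tool seen in this window.
        "tool_calls_per_run": {
            tool: sum(run["tool_counts"].get(tool, 0) for run in runs) / n
            for tool in tools
        },
    }


def flatten(fp):
    """Turn a fingerprint into a flat name -> number mapping for comparison."""
    flat = {k: v for k, v in fp.items() if k not in ("runs", "tool_calls_per_run")}
    for tool, rate in fp["tool_calls_per_run"].items():
        flat[f"tool:{tool}"] = rate
    return flat


def diff(baseline, candidate):
    """Return candidate minus baseline for every signal in either fingerprint."""
    a, b = flatten(baseline), flatten(candidate)
    return {key: b.get(key, 0.0) - a.get(key, 0.0) for key in sorted(set(a) | set(b))}


before = fingerprint([
    {"tool_counts": {"search": 1}, "latency_ms": 100.0, "error": False, "escalation": False},
    {"tool_counts": {"search": 3}, "latency_ms": 200.0, "error": False, "escalation": False},
])
after = fingerprint([
    {"tool_counts": {"search": 1, "refund": 1}, "latency_ms": 300.0, "error": True, "escalation": False},
    {"tool_counts": {}, "latency_ms": 100.0, "error": False, "escalation": False},
])
for name, delta in diff(before, after).items():
    print(f"{name}: {delta:+.2f}")
```

An alerting step could then compare each delta against a per-signal threshold. Note that nothing in either function takes text as input, so the aggregate and the diff cannot carry content even by accident.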