Drift score reference

The drift score is a number from 0 to 1 (inclusive) that indicates how much observed behavior changed between two fingerprints (e.g. before vs. after a deploy).

Scale

0 — No meaningful change.
0.2–0.3 — Common threshold for “something changed; review before shipping.”
0.5+ — Substantial change; investigate before releasing.
0.7+ — Large change (e.g. 0.73); high likelihood of user-visible impact.
1 — Maximum divergence.
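
As a rough illustration, the bands above can be turned into a short label for dashboards or log lines. The function name describe_drift and the label strings are hypothetical. The scale does not define the ranges between the listed values, so this sketch assumes a score is reported under the highest listed band it has reached, and that anything under 0.2 counts as below the common review threshold.

```python
def describe_drift(score: float) -> str:
    """Return a coarse label for a drift score, based on the bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"drift score must be between 0 and 1, got {score}")
    # Check the highest band first so 0.73 reports "large", not "substantial".
    if score >= 0.7:
        return "large change"
    if score >= 0.5:
        return "substantial change"
    if score >= 0.2:
        return "review before shipping"
    if score == 0:
        return "no meaningful change"
    # Assumption: the scale says nothing about scores between 0 and 0.2.
    return "below common review threshold"
```

Checking the bands from the top down keeps the nested ranges (0.5+ and 0.7+) from shadowing each other.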

What a 0.73 means

A score of 0.73 falls in the 0.7+ band: the two fingerprints differ widely across the dimensions we measure (tool distribution, latency, decision rates, errors). The score does not tell you the cause; use the optional root-cause summary or your own logs to investigate. We recommend blocking or flagging any deploy whose score exceeds your chosen threshold (e.g. 0.25) until the change is understood.
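
A deploy gate along these lines might look like the following sketch. How the score is obtained is not covered here, so the function takes it as a plain number. The names gate_deploy and GateResult and the message wording are hypothetical, and 0.25 is only the example threshold from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    message: str


def gate_deploy(drift_score: float, threshold: float = 0.25) -> GateResult:
    """Fail the gate when the drift score is strictly above the threshold."""
    if drift_score > threshold:
        return GateResult(
            passed=False,
            message=f"drift {drift_score:.2f} exceeds threshold {threshold:.2f}",
        )
    return GateResult(
        passed=True,
        message=f"drift {drift_score:.2f} is within threshold {threshold:.2f}",
    )


result = gate_deploy(0.73)
print(result.passed, result.message)
```

In a CI job, a failed result would typically translate into a non-zero exit code or a flagged PR check, depending on whether you want to block or only warn.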

Setting thresholds

Start with a threshold of 0.2–0.3 for alerts or PR gates. Tighten it (e.g. to 0.15) for critical flows; loosen it (e.g. to 0.4) if you get too many false positives. You can also set different thresholds per dimension (e.g. stricter on decision_drift, looser on latency_drift).
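
Per-dimension thresholds could be checked as in the sketch below. It assumes the per-dimension scores are available as a mapping from dimension name to score, which this reference does not specify. Only decision_drift and latency_drift are names taken from the text; the function name, the default, and the threshold values are illustrative.

```python
# Assumption: dimensions without an explicit entry fall back to this default.
DEFAULT_THRESHOLD = 0.25

# Stricter on decisions, looser on latency; the values are illustrative only.
DIMENSION_THRESHOLDS = {
    "decision_drift": 0.15,
    "latency_drift": 0.4,
}


def dimensions_over_threshold(scores: dict[str, float]) -> list[str]:
    """Return the dimensions whose score exceeds their threshold, sorted by name."""
    return sorted(
        name
        for name, score in scores.items()
        if score > DIMENSION_THRESHOLDS.get(name, DEFAULT_THRESHOLD)
    )


flagged = dimensions_over_threshold({"decision_drift": 0.18, "latency_drift": 0.3})
print(flagged)  # ['decision_drift']
```

Here a decision_drift of 0.18 is flagged because it is over the stricter 0.15 limit, while a latency_drift of 0.3 passes under the looser 0.4 limit.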