Axis I · D3
System lifecycle and degradation
Thesis. Degradation is not an accident to be avoided but an operational regime to be governed. Every AI system in production drifts; the operational question is the latency between drift and revalidation, not whether drift will occur.
The distinction that cuts
Model availability vs system reliability. A model can be available 99.99% of the time and still have been structurally unreliable for the past three months. The SLO for the first does not cover the second.
Typical market error
Importing SRE doctrine without translating it. MTTR, error budgets and post-mortems work for deterministic systems; they capture neither distributional drift, nor concept drift, nor population drift. A dashboard reporting 200 OK while the model predicts wrongly passes the technical audit and fails the clinical one.
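The gap between the two audits can be made concrete with a minimal sketch. It assumes a hypothetical log of served requests in which each record carries the HTTP status, the predicted probability, and the outcome observed later; the Record type and its field names are illustrative and not taken from any particular system. The Brier score stands in here for a decision-reliability metric; any proper scoring rule would serve.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical log entry; field names are assumptions for illustration.
    http_status: int       # what the infrastructure dashboard sees
    predicted_prob: float  # model's probability for the positive class
    outcome: int           # ground truth observed later (0 or 1)


def availability(records):
    """Share of requests answered with HTTP 200 (infrastructure view)."""
    return sum(r.http_status == 200 for r in records) / len(records)


def brier_score(records):
    """Mean squared error between predicted probability and outcome.

    0.0 is perfect; a constant prediction of 0.5 scores 0.25, so values
    above 0.25 are worse than an uninformative model.
    """
    return sum((r.predicted_prob - r.outcome) ** 2 for r in records) / len(records)
```

On a window where every request returns 200 but confident predictions are systematically wrong, availability stays at 1.0 while the Brier score exceeds 0.25. Only the second number says anything about the decision.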
Failure signals
No statistical baseline for production inputs (population stability index (PSI), Kolmogorov-Smirnov (KS) statistic and Wasserstein distance not computed). No revalidation procedure that can be triggered outside scheduled releases. No long-term history of calibrated predictions: at least six months are needed for concept drift to become legible. No documented withhold policy for threshold breaches. Infrastructure alerts and decisional reliability alerts mixed in the same on-call channel.
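As an illustration of what a baseline and a withhold policy might look like, here is a minimal standard-library sketch that compares one production feature against its baseline with PSI and the two-sample KS statistic. The function names, the ten quantile bins, and the 0.10 / 0.25 PSI thresholds are assumptions (the thresholds are a common rule of thumb, not a regulatory value); the Wasserstein distance is omitted for brevity.

```python
import math
from bisect import bisect_right


def psi(baseline, production, bins=10, eps=1e-6):
    """Population Stability Index over quantile bins of the baseline."""
    ref = sorted(baseline)
    # Interior bin edges taken at baseline quantiles.
    edges = [ref[int(len(ref) * k / bins)] for k in range(1, bins)]

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            counts[bisect_right(edges, x)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(sample), eps) for c in counts]

    p, q = shares(baseline), shares(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in points
    )


def decide(psi_value, warn=0.10, withhold=0.25):
    """Hypothetical withhold policy keyed on PSI alone (assumed thresholds)."""
    if psi_value >= withhold:
        return "withhold"    # stop serving predictions until revalidated
    if psi_value >= warn:
        return "revalidate"  # trigger revalidation outside the release cycle
    return "serve"
```

A real policy would be documented per feature and per intended use. The point of the sketch is that the decision is computed from input statistics and can fire between scheduled releases.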
References
ISO/IEC 25059 (quality of AI systems, extension of 25010); FDA Predetermined Change Control Plans, Final Guidance, December 2024; EMA Reflection paper on the use of AI in the medicinal product lifecycle (2024); concept drift literature, notably Gama et al., A survey on concept drift adaptation, ACM Computing Surveys, 2014; MDCG 2019-11 for SaMD qualification.
Ground of implementation
ToxTwin V2.4 implements isotonic regression calibration, a monitored applicability domain, and a frozen holdout (SHA256 published) that serves as a stability reference across versions. The trajectory from V1.3 through V2.3 to V2.4, including the correction of the GINConv/GINEConv bug and the hexagonal refactor, constitutes a documented chronicle of degradation and revalidation. The instance illustrates a versioning-validation discipline; it does not prove that this discipline is sufficient in regulated environments without a formal PCCP procedure on the regulator side.
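A frozen holdout with a published hash can be checked mechanically before each revalidation run. The sketch below is not ToxTwin's code; it assumes the holdout is a single file and that the published digest is available as a hex string, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large holdouts fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def assert_holdout_frozen(path, published_sha256):
    """Refuse to evaluate against a holdout that differs from the published one."""
    actual = sha256_of_file(path)
    if actual != published_sha256.strip().lower():
        raise RuntimeError(
            f"holdout hash mismatch: expected {published_sha256}, got {actual}"
        )
    return actual
```

Running this check first means that a metric change between versions can be attributed to the model rather than to a silently modified evaluation set.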
Articulation
Logical continuum with D1: an architecture governable at deployment must remain governable through time, otherwise it never was. Continuum with D7, whose metrology feeds the detection system.