ToxTwin V1.3 GINEConv OGB Analysis of a Reference Anthracycline
Doxorubicin (adriamycin) is among the best-characterized anthracyclines in terms of toxicological profile. Dose-dependent cardiotoxicity, documented mutagenicity, endocrine disruption: each axis of toxicity has been the subject of decades of clinical and preclinical research. This compound therefore constitutes a demanding validation ground for a predictive model — not because it is difficult to classify, but because mechanistic concordance between predictions and experimental data can be evaluated endpoint by endpoint.
The report presented here submits doxorubicin (canonical SMILES: COc1cccc2c1C(=O)c1c(O)c3c(c(O)c1C2=O)CC(O)(C(=O)CO)CC3OC1CC(N)C(O)C(C)O1) to the ToxTwin V1.3 pipeline, built on a GINEConv OGB architecture trained on 12 Tox21 endpoints, supplemented by dedicated Ames (ICH S2(R1)) and hERG (ICH S7B) models.
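For readers unfamiliar with the layer type named above, the following is a minimal pure-Python sketch of the standard GINE message-passing update, in which each node sums ReLU(neighbor features + edge features) over its incoming edges and adds its own features scaled by (1 + eps). It assumes ToxTwin uses the layer in its usual form; the function name `gine_aggregate`, the toy graph, and the feature values are illustrative, and the learned MLP that normally follows the aggregation is omitted.

```python
def relu(vec):
    """Element-wise ReLU on a plain list of floats."""
    return [max(0.0, v) for v in vec]


def gine_aggregate(x, edges, edge_attr, eps=0.0):
    """One GINE-style aggregation step (illustrative, MLP omitted).

    x         : list of node feature vectors (one per atom)
    edges     : list of (source, target) index pairs (one per directed bond)
    edge_attr : list of edge feature vectors, same length as node features
    eps       : self-loop weighting term
    """
    # Start from the node's own features scaled by (1 + eps).
    out = [[(1.0 + eps) * v for v in xi] for xi in x]
    # Add ReLU(x_source + e_edge) to each target node.
    for (src, dst), e in zip(edges, edge_attr):
        msg = relu([a + b for a, b in zip(x[src], e)])
        out[dst] = [o + m for o, m in zip(out[dst], msg)]
    return out


# Toy 3-atom chain 0-1-2, with each bond listed in both directions.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
edge_attr = [[0.5, -2.0]] * 4
h = gine_aggregate(x, edges, edge_attr)
```

The point of the sketch is that bond features enter every message, which is what distinguishes this layer from a plain GIN layer that aggregates neighbor features alone.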
The most significant result is not the detection of toxicity, which is expected for a cytotoxic molecule, but the structure of the predicted profile. The critical triplet SR-ARE (1.000) / SR-MMP (0.957) / Ames (0.809) constitutes a toxicological signature consistent with the anthracycline class. Oxidative stress via the quinone redox cycle, mitochondrial dysfunction through opening of the mitochondrial permeability transition pore (mPTP), and genotoxicity through DNA intercalation and topoisomerase II inhibition form a triad that the model reproduces, with the three endpoints ranked in an order consistent with experimental data.
The discrimination between mechanistic cardiotoxicity (mitochondrial/oxidative) and electrophysiological cardiotoxicity (hERG = 0.227) is clinically relevant. Doxorubicin cardiotoxicity is not primarily driven by long QT syndrome through hERG channel inhibition; it operates through accumulation of oxidative damage at the cardiomyocyte level. The model does not conflate these two pathways, which is an indicator of architectural coherence more informative than an aggregate AUC score.
The high scores on NR-AhR (0.750) and NR-PPAR-γ (0.750) warrant careful reading. AhR activation by doxorubicin is documented and involves CYP1A1/1B1 induction capable of modulating co-administered xenobiotic metabolism. The PPAR-γ signal aligns with recent work on cardiac lipid metabolism dysregulation as a contributing factor to anthracycline cardiomyopathy — a research axis not yet consolidated but which the model identifies autonomously.
Moderate scores on endocrine receptors (NR-AR, NR-ER, aromatase) are compatible with known clinical effects: amenorrhea, infertility, risk of hormone-dependent secondary neoplasms. The high NR-AR-LBD score (0.667) raises a pertinent question for protocols combining doxorubicin with antiandrogen therapy.
The moderate p53 score (0.395) is the most visible limitation. The p53 pathway is central to both the antitumor mechanism of action and the genotoxicity of doxorubicin. This likely underestimation reflects a structural limitation: Tox21 endpoints record in vitro reporter activity in specific cell lines, reduced to active/inactive labels in the training data, which may not capture the magnitude of the p53 response observed in vivo.
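The scores quoted so far can be gathered into a single ranked view. The sketch below is illustrative only: the score values are those stated in this report, endpoints without a quoted value are left out, and the tier cutoffs (0.80, 0.60, 0.30) are hypothetical, chosen solely so that the labels match the wording used here ("critical", "high", "moderate"). They are not ToxTwin's own thresholds.

```python
# Scores as quoted in this report; endpoints with no quoted value are omitted.
REPORTED_SCORES = {
    "SR-ARE": 1.000,
    "SR-MMP": 0.957,
    "Ames": 0.809,
    "NR-AhR": 0.750,
    "NR-PPAR-gamma": 0.750,
    "NR-AR-LBD": 0.667,
    "p53": 0.395,
    "hERG": 0.227,
}

# Hypothetical cutoffs (an assumption, not part of ToxTwin).
TIERS = [(0.80, "critical"), (0.60, "high"), (0.30, "moderate")]


def tier(score):
    """Map a probability-like score to a qualitative tier."""
    for cutoff, label in TIERS:
        if score >= cutoff:
            return label
    return "low"


def ranked_profile(scores):
    """Return (endpoint, score, tier) tuples sorted by descending score."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, value, tier(value)) for name, value in ordered]


if __name__ == "__main__":
    for name, value, label in ranked_profile(REPORTED_SCORES):
        print(f"{name:<14} {value:.3f}  {label}")
```

Laid out this way, the separation discussed above is visible at a glance: the oxidative, mitochondrial, and genotoxic endpoints sit at the top of the ranking, while hERG falls in the lowest tier.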
More fundamentally, the V1.3 model operates on the parent compound. Doxorubicin is a complex metabolic substrate, giving rise to doxorubicinol, the 7-deoxyaglycone, and a semiquinone radical. Each of these species has a distinct toxicological profile that the model does not capture. This limitation is intrinsic to the SMILES-in / prediction-out approach and will only be addressed with the integration of an upstream metabolic prediction module (V3.0 horizon).
This report does not constitute a toxicological discovery: the doxorubicin profile is known. It constitutes a validation case, namely the demonstration that the ToxTwin V1.3 pipeline, on a GINEConv OGB architecture with 2D descriptors, reconstructs a mechanistically coherent profile for a reference compound. The model’s value does not reside in its ability to predict the known, but in the structure of its predictions, a structure that can then be applied to compounds whose toxicological profile is not yet characterized.