Hospital-wide sepsis detection: A machine learning model based on prospectively expert-validated cohort

dc.contributor.author: Borges-Sa, Marcio
dc.contributor.author: Giglio, Andrés
dc.contributor.author: Aranda, Maria
dc.contributor.author: Socias, Antonia
dc.contributor.author: del Castillo, Alberto
dc.contributor.author: Pruenza, Cristina
dc.contributor.author: Hernández, Gonzalo
dc.contributor.author: Cerdá, Sofía
dc.contributor.author: Socias, Lorenzo
dc.contributor.author: Estrada, Victor
dc.contributor.author: de la Rica, Roberto
dc.contributor.author: Martin, Elisa
dc.contributor.author: Martin-Loeches, Ignacio
dc.coverage.spatial: Switzerland
dc.date.accessioned: 2026-01-26T15:35:53Z
dc.date.available: 2026-01-26T15:35:53Z
dc.date.issued: 2026-01-21
dc.description.abstract: Background/Objectives: Sepsis detection remains challenging because of clinical heterogeneity and the limitations of traditional scoring systems. This study developed and validated a hospital-wide machine learning model for sepsis detection using retrospective data from cases that had been prospectively validated by experts, aiming to improve diagnostic accuracy beyond conventional approaches. Methods: This retrospective cohort study analysed 218,715 hospital episodes (2014–2018) at a tertiary care centre. Sepsis cases (n = 11,864, 5.42%) were prospectively validated in real time by a Multidisciplinary Sepsis Unit using modified Sepsis-2 criteria with organ dysfunction. The model integrated structured data (26.95%) and unstructured clinical notes (73.04%), the latter extracted via natural language processing; 230 relevant predictors were selected from 2829 variables. Thirty models, including random forests, support vector machines, neural networks, and gradient boosting, were developed and evaluated. The dataset was randomly split (5/7 training, 2/7 testing) while preserving patient-level independence. Results: The BiAlert Sepsis model (random forest + Sepsis-2 ensemble) achieved an AUC-ROC of 0.95, a sensitivity of 0.93, and a specificity of 0.84, significantly outperforming traditional approaches. Compared with the best rule-based method (Sepsis-2 + qSOFA, AUC-ROC 0.90), BiAlert reduced false positives by 39.6% in relative terms (13.10% vs. 21.70%, p < 0.01). Novel predictors included eosinopenia and hypoalbuminemia, while traditional variables (MAP, GCS, platelets) showed minimal univariate association. The model received European Medicines Agency approval as a medical device in June 2024. Conclusions: This hospital-wide machine learning model, trained on prospectively expert-validated cases and integrating extensive NLP-derived features, demonstrates superior sepsis detection performance compared with conventional scoring systems. External validation and prospective clinical impact studies are needed before widespread implementation.
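The abstract mentions a random 5/7 vs. 2/7 split that preserves patient-level independence and a 39.6% reduction in false positives. The following is a minimal Python sketch of what those two steps could look like, not the authors' implementation. The function names, the `(patient_id, episode_id)` record layout, and the toy data are all assumptions made for illustration; only the split fraction and the two false-positive rates come from the abstract.

```python
import random
from collections import defaultdict


def split_by_patient(episodes, train_fraction=5 / 7, seed=0):
    """Split episodes so that all episodes of a patient land on one side.

    `episodes` is a list of (patient_id, episode_id) pairs. Both field
    names are hypothetical, not taken from the study's data dictionary.
    The fraction applies to patients, so the episode-level proportions
    are only approximately 5/7 and 2/7.
    """
    by_patient = defaultdict(list)
    for patient_id, episode_id in episodes:
        by_patient[patient_id].append(episode_id)

    patients = sorted(by_patient)  # fixed order before the seeded shuffle
    random.Random(seed).shuffle(patients)

    n_train = round(len(patients) * train_fraction)
    train_patients = set(patients[:n_train])

    train = [(p, e) for p, e in episodes if p in train_patients]
    test = [(p, e) for p, e in episodes if p not in train_patients]
    return train, test


def relative_reduction(baseline_rate, new_rate):
    """Relative reduction of `new_rate` with respect to `baseline_rate`."""
    return (baseline_rate - new_rate) / baseline_rate


# Toy data (assumption): 7 patients with 2 episodes each.
episodes = [(p, f"{p}-{k}") for p in range(7) for k in range(2)]
train, test = split_by_patient(episodes)

# False-positive rates quoted in the abstract: 21.70% (rule-based) vs. 13.10%.
reduction = relative_reduction(21.70, 13.10)
```

With the rates quoted in the abstract, `relative_reduction(21.70, 13.10)` gives about 0.396, which matches the reported 39.6% and indicates that the figure is a relative reduction, not a difference in percentage points.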
dc.identifier.citation: Journal of Clinical Medicine, Vol. 15, No. 2 (2026) p. 1-17
dc.identifier.doi: https://doi.org/10.3390/jcm15020855
dc.identifier.issn: e2077-0383
dc.identifier.orcid: https://orcid.org/0000-0002-0533-4531
dc.identifier.uri: https://hdl.handle.net/20.500.12254/7455
dc.language.iso: en
dc.publisher: MDPI
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Chile (CC BY-NC-SA 3.0 CL)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/cl/
dc.title: Hospital-wide sepsis detection: A machine learning model based on prospectively expert-validated cohort
Files
Original bundle
Name: jcm-15-00855.pdf
Size: 782.85 KB
Format: Adobe Portable Document Format