
Sepsis is a word that evokes dread in most people, yet many do not fully understand its meaning and implications. It is a life-threatening condition that arises when the body’s response to infection results in organ dysfunction or failure. It is one of the leading causes of death, accounting for 6 to 9 million deaths per year worldwide and 250,000 deaths per year in the United States. To make matters worse, sepsis is also one of the leading causes of preventable death worldwide. The key to prevention is early detection, since every hour of delay increases the odds of death by 20%. However, sepsis can be very hard to detect, especially in its early stages when it is reversible. The symptoms often point to other conditions, and patients can deteriorate rapidly. Traditional risk models for detecting sepsis, such as the SIRS criteria, generate many false positives, leading to inefficient or ineffective care and nursing fatigue.

Leveraging Machine Learning

Through PCCI’s close collaboration with Parkland Health & Hospital System, we recognized sepsis detection and prevention as an area that could benefit from an advanced machine learning-based algorithm. To be effective, the model needed to fulfill the following criteria:

A high enough positive predictive value (PPV) to ensure that false positives do not end up causing frustration for clinicians. It is very challenging to have a high PPV for use cases with low prevalence rates in the target population. Despite the high mortality numbers, the overall prevalence rate for sepsis in an inpatient setting (our target population) is 3% to 4%.
A prediction interval (how far into the future the model can predict) that gives clinicians a sufficient heads-up to intervene.
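
The difficulty of reaching a high PPV at low prevalence follows directly from Bayes’ rule. The sketch below is only an illustration: the sensitivity and specificity values are hypothetical and are not the characteristics of PCCI’s model. It shows how the same test yields a much lower PPV in a population where 3% to 4% of patients are septic than in one where the condition is common.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value computed from Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test characteristics, chosen only for illustration.
SENSITIVITY = 0.80
SPECIFICITY = 0.90

# The first two prevalences match the inpatient range quoted above;
# the last two are arbitrary higher values for comparison.
for prevalence in (0.03, 0.04, 0.20, 0.50):
    value = ppv(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: PPV {value:.1%}")
```

With these assumed values, only about one in five alerts would be a true positive at 3% prevalence, even though the test itself is unchanged.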

PCCI developed a predictive model to meet these criteria. This real-time model is designed to predict an individual’s risk of becoming septic within the next 12 hours.

Instead of relying on the traditional approach of detecting sepsis risk through a handful of physiologic variables, we cast our net wide and started out with approximately 120 variables as potential predictors, including socio-demographic factors, vital signs, co-morbidities, hospital utilization, medical conditions, clinical history, and lab results.

Testing the Lasso Logistic Regression Model

We trained and tested a Lasso Logistic Regression model, a decision tree, and a neural net on 27 months’ worth of inpatient encounters from Parkland. The initial data sets included 54,629 encounters and 30,922 patients. We chose the Lasso Logistic Regression model as the final model because of its high performance and its interpretability for clinical users. The resulting model, which compared favorably to other community models, achieved:

95% Accuracy
30% PPV
55% Sensitivity
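
To illustrate why a lasso penalty supports interpretability when starting from many candidate variables, here is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not PCCI’s implementation: the data are synthetic, and the function names, penalty, step size, and iteration count are assumptions made for the example. The point is the soft-thresholding step, which can set the weights of uninformative predictors to exactly zero so that only a short list of variables remains in the model.

```python
import math
import random


def soft_threshold(value: float, amount: float) -> float:
    """Shrink value toward zero; anything within `amount` of zero becomes 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(rows, labels, penalty, step=0.1, iterations=500):
    """Proximal gradient descent; returns (intercept, weights).

    The intercept is left unpenalized, as is conventional.
    """
    n, d = len(rows), len(rows[0])
    intercept, weights = 0.0, [0.0] * d
    for _ in range(iterations):
        # Prediction errors under the current parameters.
        errors = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - y
            for row, y in zip(rows, labels)
        ]
        intercept -= step * sum(errors) / n
        for j in range(d):
            gradient = sum(e * row[j] for e, row in zip(errors, rows)) / n
            # Gradient step followed by the L1 proximal (shrinkage) step.
            weights[j] = soft_threshold(weights[j] - step * gradient, step * penalty)
    return intercept, weights


# Synthetic data: one informative feature and two pure-noise features.
random.seed(0)
rows, labels = [], []
for _ in range(400):
    signal = random.uniform(-1, 1)
    rows.append([signal, random.uniform(-1, 1), random.uniform(-1, 1)])
    labels.append(1 if random.random() < sigmoid(4.0 * signal) else 0)

intercept, weights = fit_lasso_logistic(rows, labels, penalty=0.05)
print("weights (signal, noise, noise):", weights)
```

On data like this, one would typically expect the noise weights to be shrunk to zero or very near it while the signal weight stays positive. In practice a library implementation and a penalty chosen by cross-validation would be the usual route.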

Incorporating the Model into Clinical Workflows

A model is useful only if it is actionable and available at the point of care. PCCI’s sepsis model is incorporated into clinical workflows through industry-standard APIs. The model accesses EHR data in real time, every five minutes, and alerts the clinician if the risk is above a certain threshold. The risk threshold can be tailored to the local population and to an organization’s clinical best practices. The alerting mechanism varies with EHR functionality, but it can include a simple alert, queuing up order sets based on the risk threshold, and sending a real-time alert to a secure mobile device or pager for a member of the rapid response team.
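
The threshold-based alerting described above can be sketched as follows. Everything here is hypothetical: the `RiskScore` type, the `select_alerts` function, the threshold value, and the encounter identifiers are made up for illustration, and suppressing repeat alerts for the same encounter is an added assumption rather than a documented behavior of PCCI’s system. Real scores would come from the EHR integration rather than from hard-coded lists.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class RiskScore:
    encounter_id: str
    risk: float  # predicted probability of sepsis within the prediction window


def select_alerts(
    scores: Iterable[RiskScore], threshold: float, already_alerted: Set[str]
) -> List[RiskScore]:
    """Return scores above the threshold for encounters not yet alerted on."""
    alerts = []
    for score in scores:
        if score.risk > threshold and score.encounter_id not in already_alerted:
            alerts.append(score)
            already_alerted.add(score.encounter_id)
    return alerts


POLL_INTERVAL_SECONDS = 5 * 60  # from the text: data is read every five minutes
RISK_THRESHOLD = 0.30           # hypothetical; tuned per organization in practice

# Two simulated scoring cycles standing in for successive five-minute reads.
cycles = [
    [RiskScore("enc-1", 0.12), RiskScore("enc-2", 0.41)],
    [RiskScore("enc-1", 0.35), RiskScore("enc-2", 0.44)],
]

alerted: Set[str] = set()
for scores in cycles:
    for alert in select_alerts(scores, RISK_THRESHOLD, alerted):
        print(f"ALERT {alert.encounter_id}: risk {alert.risk:.2f}")
```

Keeping the threshold as a parameter reflects the point that it can be adjusted per organization; the delivery channel (in-EHR alert, order set, pager) would be a separate step downstream of this selection.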
