Severity of Illness Scores and Prognostication

David M. Maslove

INTRODUCTION

The ability to quickly and accurately assess a patient’s clinical status is essential to effective triage. This is especially true for patients with critical illness or injury. Severity of illness (SOI) scores help estimate the likelihood of impending clinical deterioration, identify appropriate services for consultation and admission, and enable practitioners to determine which patients will require frequent reassessment; this, in turn, helps guide time management and resource allocation.

In addition to their role in clinical assessment, disease-specific diagnostic and treatment algorithms frequently make use of SOI scores. Likewise, research trials in critical care almost always involve SOI scoring, as a means of both stratifying patients and comparing the results of one trial to another. Finally, some semblance of prognosis, even when imprecise and tentative, can be helpful in addressing the anxiety experienced by patients and families facing the uncertainty of critical illness.

To be useful in a busy emergency department (ED), an SOI score must be easy to use, and its parameters should be reliable, objective, unambiguous, limited in number, and available at the time of initial assessment. This poses a challenge: easily obtained clinical parameters like vital signs are prone to disagreement between observers, especially in dynamic situations when these signs fluctuate, while more objective laboratory values require additional time and resources to collect and analyze.

The simplest scoring systems use binary variables that are each assigned a specific cutoff value and then marked as either “present” or “absent.” Points assigned to each variable are tallied into an overall integer score that corresponds to a risk category. Traditionally, the most useful SOIs employed a small number of easily remembered parameters, allowing for rapid calculation at the point of care. Increasing adoption of smartphones and other mobile devices in the hospital setting has lessened the importance of simplicity in the scoring system, and SOIs are evolving in response to this technology.

The clinical variables included in SOI scores are determined in numerous ways, ranging from expert opinion to logistic regression. Ideally, scoring systems are derived from data describing one cohort of patients and then validated in a second, independent cohort. Additional studies are often carried out to assess a score’s validity under a range of circumstances, such as geographic location or model of health care delivery. In order to maintain score performance, updates are required as practice patterns and case mix evolve.¹

A score’s discrimination refers to its ability to distinguish patients who experience the outcome of interest from those who do not. Discrimination is often expressed in terms of sensitivity and specificity, or by a receiver operating characteristic (ROC) curve that relates these terms over a range of cutoff values. Scores are said to be well calibrated if their predicted risks agree with observed outcomes across a range of conditions, including low- and high-risk disease, different diagnoses, and different geographical regions.²
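The relationship between a cutoff value and the resulting sensitivity and specificity can be illustrated with a minimal sketch. The cohort below is hypothetical toy data, and the function names are illustrative only; a score at or above the cutoff is treated as a positive test.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff counts as positive.

    scores   -- integer SOI scores, one per patient
    outcomes -- True if the patient experienced the outcome of interest
    """
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, o in pairs if s >= cutoff and o)
    fn = sum(1 for s, o in pairs if s < cutoff and o)
    tn = sum(1 for s, o in pairs if s < cutoff and not o)
    fp = sum(1 for s, o in pairs if s >= cutoff and not o)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("cohort must contain patients with and without the outcome")
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, outcomes):
    """One (cutoff, sensitivity, specificity) triple per distinct score value."""
    return [
        (c, *sensitivity_specificity(scores, outcomes, c))
        for c in sorted(set(scores))
    ]


# Hypothetical toy cohort: ten patients, five of whom had the outcome.
toy_scores = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
toy_outcomes = [False, False, False, False, True, False, True, True, True, True]

for cutoff, sens, spec in roc_points(toy_scores, toy_outcomes):
    print(f"cutoff >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Plotting sensitivity against (1 - specificity) for each cutoff would trace the ROC curve; raising the cutoff trades sensitivity for specificity.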

Some SOI scores are intended for use with specific clinical presentations and diagnoses, while others are more general. In all cases, prognostic indices and SOI scores must be interpreted with caution; such tools are derived based on population averages and therefore provide only a probabilistic estimate for any given patient. For the most part, SOI scores are meant to help inform clinical decision making, which typically involves many more demographic, physiologic, and psychosocial parameters than can be distilled to a single number.

SYSTEM-SPECIFIC SOI SCORES

Pulmonary

The pneumonia severity index (PSI) for community-acquired pneumonia (CAP) is one of the most familiar disease-specific SOI scores. Also known as the PORT score (for the Pneumonia Patient Outcomes Research Team, the cohort in which it was validated), this SOI was published in 1997 and subsequently validated in several independent studies.³ Created to standardize admission practices and to identify low-risk patients suitable for home treatment, the PSI generates a score using age and 19 clinical variables recorded as either “present” or “absent.” The score, in turn, corresponds to one of five categories predicting risk of death at 30 days (Table 62.1).

TABLE 62.1 Risk Categories in the Pneumonia Severity Index

ᵃCategory I is assigned to patients <50 years of age, with none of the specified coexisting conditions or physical exam findings.

Fine MJ, Auble TE, Yealy DM, et al. A prediction rule to identify low-risk patients with community-acquired pneumonia. N Engl J Med. 1997;336:243–250.

Age and comorbidities weigh heavily in the PSI, predisposing the score to overestimate severity in elderly patients with chronic illness and to underestimate severity in young and otherwise healthy patients.⁴ In one validation study, only 20% of patients in the highest-risk class (V) were admitted to the ICU, suggesting that the PSI is less useful for predicting the need for ICU admission than for hospital admission.⁵ Patients with HIV were excluded from the initial PSI study, and the index was shown to markedly underestimate disease severity in patients with pandemic influenza A(H1N1) during the 2009 outbreak.⁶ In a meta-analysis involving 16,519 patients, the PSI was found to be sensitive (pooled sensitivity 90%), but lacked specificity (pooled specificity 53%).⁷

With 20 variables to account for, the PSI can be cumbersome to use. A simpler score developed by the British Thoracic Society known as CURB-65 uses only five clinical parameters: confusion, blood urea nitrogen (BUN) level, respiratory rate (RR), blood pressure, and age.⁸ One point is assigned for each variable, depending on whether it is present or absent according to a specified cutoff value (Table 62.2). As in the PSI, the total score is then used to assign a risk category that predicts mortality at 30 days. The CURB-65 score is less sensitive than is the PSI (pooled sensitivity 62%), but is more specific (pooled specificity 79%).⁷ Other versions of the CURB-65 score include CURB, in which age is omitted, and CRB-65, which does not require the laboratory value of BUN. The exclusion of BUN leads to a decrement in sensitivity (pooled sensitivity 33%), but improves specificity (pooled specificity 92%).⁷ Importantly, the original CURB cohorts excluded nursing home residents as well as immunocompromised patients including those with malignancy, HIV, and tuberculosis.
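One way to compare the pooled figures quoted above is to convert each sensitivity and specificity pair into likelihood ratios. The following is a worked-arithmetic sketch only: the percentages are those cited in the text, while the likelihood-ratio framing and the function name are additions made here for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled sensitivity and specificity as quoted in the text (reference 7).
pooled = {
    "PSI": (0.90, 0.53),
    "CURB-65": (0.62, 0.79),
    "CRB-65": (0.33, 0.92),
}

for name, (sens, spec) in pooled.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

A lower negative likelihood ratio indicates that a negative result does more to rule out the outcome, which fits the use of the more sensitive PSI to identify low-risk patients; a higher positive likelihood ratio reflects the greater specificity of the shorter scores.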

TABLE 62.2 CURB-65 Score

ᵃMental Test Score of 8 or less or new disorientation in person, place, or time.

Lim WS, Van der Eerden MM, Laing R, et al. Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study. Thorax. 2003;58:377–382.
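Because CURB-65 reduces to five yes/no checks, the tally is straightforward to sketch in code. The cutoffs used below (BUN > 19 mg/dL, respiratory rate ≥ 30/min, systolic pressure < 90 mm Hg or diastolic pressure ≤ 60 mm Hg, age ≥ 65 years) and the grouping of totals into three risk groups are the commonly published values and are supplied here as assumptions, since they are not spelled out in the text; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Curb65Inputs:
    confusion: bool          # new disorientation or low Mental Test Score
    bun_mg_dl: float         # blood urea nitrogen, mg/dL
    respiratory_rate: int    # breaths per minute
    systolic_bp: int         # mm Hg
    diastolic_bp: int        # mm Hg
    age_years: int


def curb65_score(p: Curb65Inputs) -> int:
    """One point per criterion met (assumed standard cutoffs)."""
    criteria = [
        p.confusion,
        p.bun_mg_dl > 19,
        p.respiratory_rate >= 30,
        p.systolic_bp < 90 or p.diastolic_bp <= 60,
        p.age_years >= 65,
    ]
    return sum(criteria)


def curb65_risk_group(score: int) -> int:
    """Map a total of 0-5 to risk group 1, 2, or 3 (assumed grouping)."""
    if not 0 <= score <= 5:
        raise ValueError("CURB-65 total must be between 0 and 5")
    if score <= 1:
        return 1
    if score == 2:
        return 2
    return 3


# Hypothetical patient: 72 years old, tachypneic, elevated BUN, normotensive.
patient = Curb65Inputs(
    confusion=False, bun_mg_dl=25, respiratory_rate=32,
    systolic_bp=110, diastolic_bp=70, age_years=72,
)
total = curb65_score(patient)
print(total, curb65_risk_group(total))
```

Dropping the BUN criterion from the list would give the CRB-65 variant, which needs no laboratory value.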

Like the PSI, the CURB-65 score performs poorly in predicting the need for ICU admission. Because delayed ICU admission increases mortality risk in patients with severe CAP, the SMART-COP score was designed to address this issue specifically. This score combines eight clinical characteristics to estimate the risk of requiring intensive respiratory support (either invasive or noninvasive mechanical ventilation) or infusions of vasopressors and can therefore be useful in assigning patients to the appropriate level of care (Table 62.3).⁹ A SMART-COP score of ≥3 was found to be more sensitive for the need for ICU-level support than were PSI classes IV and V or CURB-65 risk category 3 (92.3% vs. 73.6% vs. 38.5%, respectively). ATS/IDSA guidelines on CAP management also offer ICU admission criteria, including the need for invasive mechanical ventilation, septic shock with the need for vasopressors, or any three of a set of minor criteria similar to those used in the aforementioned CAP scores.¹⁰

TABLE 62.3 SMART-COP Score

Total score used to predict risk of needing intensive respiratory or vasopressor support.
0–2 points = Low risk.
3–4 points = Moderate risk (1 in 8).
5–6 points = High risk (1 in 3).
≥7 points = Very high risk (2 in 3).

Charles PGP, Wolfe R, Whitby M, et al. SMART-COP: a tool for predicting the need for intensive respiratory or vasopressor support in community-acquired pneumonia. Clin Infect Dis. 2008;47:375–384.
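The risk bands listed under Table 62.3 can be expressed as a simple lookup. This sketch covers only the mapping from a total score to a risk band, using the bands quoted above; the point values of the individual SMART-COP items are not reproduced here, and the function name is illustrative.

```python
def smart_cop_risk(total_points: int) -> str:
    """Map a SMART-COP total to the risk band for needing intensive
    respiratory or vasopressor support, per the bands quoted in the text."""
    if total_points < 0:
        raise ValueError("total score cannot be negative")
    if total_points <= 2:
        return "low risk"
    if total_points <= 4:
        return "moderate risk (1 in 8)"
    if total_points <= 6:
        return "high risk (1 in 3)"
    return "very high risk (2 in 3)"


for total in (1, 3, 5, 8):
    print(total, smart_cop_risk(total))
```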

Neurologic

In critical neurologic conditions such as subarachnoid hemorrhage (SAH), ischemic stroke, and traumatic brain injury, SOI scores—based on both clinical and imaging characteristics—can be used to estimate prognosis and, in some cases, inform treatment decisions.

Coma

First published in the mid-1970s, the Glasgow Coma Scale (GCS) was initially developed to standardize descriptions of coma. Later, it was modified specifically to evaluate level of consciousness following traumatic brain injury.¹¹ To calculate the score, points are added for the patient’s eye, verbal, and motor responses. Scores range from 3 to 15, with lower scores indicating greater severity of injury (Table 62.4).

TABLE 62.4 Glasgow Coma Scale

Sternbach GL. The Glasgow coma scale. J Emerg Med. 2000;19:67–71.

Although GCS can be reported as a single sum, this may be less informative than an explicit breakdown of the constituent parts.¹¹ Common confounders include sedation, analgesia, neuromuscular blockade, delirium, orbital trauma, and intubation, each of which can make it impossible to calculate one or more of the subscores.¹² In intubated patients, for example, the verbal score is often represented by the letter “T,” which provides information, but precludes calculation of a total score.¹³ Alternative scoring systems, such as the Full Outline of UnResponsiveness (FOUR) score, may be more appropriate in critically ill intubated patients.¹⁴

In the prehospital setting, GCS is predictive of both death and hospitalization. A GCS of ≤13 in the field is an indication for immediate transport to a specialized trauma center.¹⁵ GCS calculated at ED admission is an independent predictor of mortality¹⁶ as well as of functional status at 6 months.¹⁷ In some studies of the GCS, the motor component alone has been shown to correlate with mortality.¹⁶

In the GCS system, traumatic brain injury is classified as mild (GCS 13 to 15), moderate (GCS 9 to 12), or severe (GCS < 9).¹² A GCS score of 8 or less is often cited as an indication for intubation. Current guidelines from the Eastern Association for the Surgery of Trauma recommend endotracheal intubation for patients with GCS ≤ 8, but note that patients with altered mental status and a GCS > 8 often require intubation as well.¹⁸ Airway obstruction, persistent hypoxemia, and hypoventilation should trigger prompt intubation regardless of mental status.
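The GCS arithmetic and the severity bands described above can be sketched as follows. The component ranges (eye 1 to 4, verbal 1 to 5, motor 1 to 6) are the standard ones and are consistent with the 3-to-15 total given in the text; representing an intubated patient by passing no verbal score, and all function names, are illustrative assumptions.

```python
from typing import Optional


def gcs_total(eye: int, verbal: Optional[int], motor: int) -> Optional[int]:
    """Sum the GCS components; return None when the verbal score
    cannot be assessed (for example, in an intubated patient)."""
    if not 1 <= eye <= 4:
        raise ValueError("eye response must be 1-4")
    if verbal is not None and not 1 <= verbal <= 5:
        raise ValueError("verbal response must be 1-5")
    if not 1 <= motor <= 6:
        raise ValueError("motor response must be 1-6")
    if verbal is None:
        return None
    return eye + verbal + motor


def gcs_report(eye: int, verbal: Optional[int], motor: int) -> str:
    """Report the component breakdown, using 'T' for an untestable verbal score."""
    total = gcs_total(eye, verbal, motor)
    verbal_text = "T" if verbal is None else str(verbal)
    breakdown = f"E{eye}V{verbal_text}M{motor}"
    return breakdown if total is None else f"{breakdown} = {total}"


def tbi_severity(total: int) -> str:
    """Classify traumatic brain injury by total GCS, per the bands in the text."""
    if not 3 <= total <= 15:
        raise ValueError("GCS total must be 3-15")
    if total >= 13:
        return "mild"
    if total >= 9:
        return "moderate"
    return "severe"


print(gcs_report(3, 4, 5), tbi_severity(gcs_total(3, 4, 5)))
print(gcs_report(2, None, 4))  # intubated: breakdown only, no total
```

Reporting the breakdown alongside the sum preserves the component information that a single total discards.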

The GCS is likely the most widely used mental status score in the ICU.¹¹,¹⁹ It is easily calculated at the bedside and can be repeatedly measured as a means of tracking the progression of injury and recovery. Interrater agreement depends on provider type and level of experience and is highest when scores are high.¹¹ The GCS has become integral to other more recently developed SOI scoring systems, including the Acute Physiology and Chronic Health Evaluation (APACHE) and Simplified Acute Physiology Score (SAPS) systems discussed below.

Subarachnoid Hemorrhage

Numerous SOI scores exist for SAH, although most are derived from expert opinion and have only been validated in small cohorts.²⁰ The most frequently used are the Hunt and Hess scale and the World Federation of Neurological Surgeons (WFNS) scale, which are based on clinical parameters, as well as the Fisher scale, based on computerized tomography (CT) imaging (Table 62.5).

TABLE 62.5 Common Scales Used in SAH²⁵

Ferro JM, Canhão P, Peralta R. Update on subarachnoid haemorrhage. J Neurol. 2008;255:465–479.

The Hunt and Hess grading system can be difficult to apply consistently; some of its terms are ambiguous, and clinical findings have the potential to span multiple categories. Interrater agreement in applying the score is moderate (κ = 0.48).²¹ The score defines five classes, with a sixth (Hunt and Hess 0) sometimes included for patients with unruptured aneurysms. The Hunt and Hess scale is poorly powered to predict distinct outcomes for each individual class, and as such, classes are sometimes aggregated: Patients are often grouped into low scores (classes 0 to III) versus high scores (classes IV and V) or into “alert” (classes I and II), “drowsy” (classes III and IV), and “comatose” (class V).²⁰,²² The WFNS comprises a condensed version of the GCS and an additional binary measure for the presence or absence of a focal motor deficit. Its prognostic value is unclear; some studies suggest it correlates with outcome, while others do not.²⁰,²² The Fisher grading system uses CT findings and was initially established to predict the risk of vasospasm; it also has been shown to correlate with outcomes at 1 year and beyond. Patients in Fisher class 3 and 4 have an increased risk of poor outcome or death (relative risk 3.2 to 14.8).²³ Fisher class does not, however, accurately predict long-term health-related quality of life.²⁴ The GCS has also been shown to correlate with outcomes in SAH.²⁰

Ischemic Stroke

The National Institutes of Health Stroke Scale (NIHSS) is an 11-part evaluation of neurologic signs and is used for triage and prognostication of ischemic stroke. It incorporates measures of level of consciousness, gaze, visual fields, motor function, ataxia, sensation, speech, language, and neglect. The NIHSS has been shown to correlate with survival, length of stay, discharge destination, and functional status at 1 year.²⁶ It has been used to identify patients who are appropriate candidates for thrombolytic therapy, with both very high-scoring and very low-scoring patients deemed not suitable for treatment. Patients with profound deficits isolated to a single component of the scale, such as severe aphasia, may score low but should be considered for thrombolysis nonetheless.²⁷

Gastrointestinal

Devised in the 1970s to predict complications of acute pancreatitis, the Ranson score is an early example of a disease-specific SOI score.³² Its use has largely been supplanted by more generalized scoring systems such as APACHE and the Sequential Organ Failure Assessment (SOFA), reflecting the propensity of severe pancreatitis to result in multiple organ dysfunction.³³

Establishing risk in acute gastrointestinal bleeding can be useful in determining which patients require hospital admission and urgent endoscopy. The Rockall score incorporates age, comorbidities, and the presence of shock to stratify patients according to risk of rebleeding and death.³⁴ The Glasgow-Blatchford score (GBS) incorporates features of the presentation (melena, syncope), along with heart rate, blood pressure, hemoglobin, BUN, and the presence of cardiac or hepatic disease to derive an integer score.³⁵ The GBS is predictive of a composite endpoint that includes death; rebleeding; and the need for blood transfusion, endoscopy, or surgery. It has been shown to outperform the Rockall score in a number of prospective evaluations, with an area under the ROC curve of approximately 0.9.³⁵–³⁷

In patients with acute liver failure (ALF), SOI scoring has been used to estimate the risk of death, so that referral for transplant can be initiated if indicated. The King’s College criteria (Table 62.6), developed in the United Kingdom, distinguish between ALF resulting from acetaminophen toxicity and ALF resulting from other causes, many of which portend a worse prognosis.³⁸ In general, the King’s College criteria predict mortality with specificity of approximately 90%, but sensitivity of only approximately 60%.³⁹,⁴⁰ This limits the utility of the score somewhat, as many patients who do not meet criteria should still be considered for transplant.⁴¹ The Model for End-Stage Liver Disease (MELD) score is a mathematical combination of the serum bilirubin, creatinine, and INR and is used to evaluate 3-month mortality risk in chronic liver disease. MELD has also been applied to patients with ALF, with a recent prospective analysis showing it to be a better predictor of death than the King’s College criteria.⁴² In particular, the MELD score improved upon the poor negative predictive value of the King’s College criteria, as 20 of the 22 patients who survived without transplantation had a MELD score ≤ 30.
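The "mathematical combination" behind MELD can be sketched in code. The coefficients below are those of the widely cited original (pre-sodium) MELD formula and are supplied here as an assumption, since the text does not state them; the lower bound of 1.0 on each input and the creatinine cap of 4.0 mg/dL follow common allocation practice and are likewise assumptions. The study cited above may have used a different variant.

```python
import math


def meld_score(bilirubin_mg_dl: float, creatinine_mg_dl: float, inr: float) -> int:
    """Original MELD formula (assumed coefficients), rounded to an integer.

    Inputs below 1.0 are raised to 1.0 so that no logarithm is negative,
    and creatinine is capped at 4.0 mg/dL (both assumed conventions).
    """
    bilirubin = max(bilirubin_mg_dl, 1.0)
    creatinine = min(max(creatinine_mg_dl, 1.0), 4.0)
    inr = max(inr, 1.0)
    raw = (
        3.78 * math.log(bilirubin)
        + 11.2 * math.log(inr)
        + 9.57 * math.log(creatinine)
        + 6.43
    )
    return round(raw)


# Hypothetical laboratory values for illustration only.
score = meld_score(bilirubin_mg_dl=10.0, creatinine_mg_dl=2.0, inr=3.0)
print(score, "at or below 30" if score <= 30 else "above 30")
```

The final comparison mirrors the threshold of 30 mentioned in the text; it illustrates the arithmetic and is not a decision rule.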

TABLE 62.6 King’s College Criteria for Liver Transplantation in Acute Liver Failure

Gotthardt D, Riediger C, Weiss KH, et al. Fulminant hepatic failure: etiology and indications for liver transplantation. Nephrol Dial Transplant. 2007;22:viii5–viii8.