The measurement of health is central to the evaluation of health care. Until the first part of the 20th century, health was defined as the absence of disease and was measured in terms of morbidity and mortality. This simple approach to health status was rejected in 1948 with the expansion of the concept of health by the World Health Organization (WHO), which defined health as “A state of complete physical, mental and social wellbeing and not merely the absence of disease or infirmity.”1 This definition reflected the multidimensionality of health and considered not only biologic markers but also the ability to perform physically, psychologically, and socially in the everyday environment.
This change in the definition of health gave rise to the current outcomes movement. Other factors, such as the reversal of the proportion of care rendered for acute illnesses versus chronic diseases, technological advancements in health care, rising health care costs, the emerging concept of quality of care, and increased recognition of the importance of patients’ views about their care and health, have further fostered the growth of this movement. Conscientious health care providers use both individual clinical expertise and the best available objective, external evidence for treatment. The best available external evidence for treatment is defined as clinically relevant research, often from the basic sciences of medicine but especially from patient-based clinical research into the accuracy and precision of diagnostic tests (including the clinical examination); the power of prognostic markers; and the efficacy and safety of therapeutic, rehabilitative, and preventive regimens.2 The integration of clinical expertise and best available external evidence for treatment is the practice of evidence-based medicine. With the plethora of current and relevant literature, it is impossible for most clinical care providers to keep abreast of the latest developments in their field. Because of this problem, structured approaches to literature synthesis, such as that organized by the Cochrane Collaboration, have arisen to summarize, with the least possible bias, the best available research on a specific topic (see later discussion). The goal is to make relevant information widely available and evidence-based practice unencumbered for all health care providers.
This chapter focuses on common terminology used in outcomes assessment and on outcomes measurement tools used in research on and treatment of patients with chronic noncancer pain because they are part of the foundation of what will become evidence-based practice and perhaps eventually guidelines for treatment. Although the comprehensiveness and validity of outcome measures for the treatment of all types of pain lag behind those of equally high-impact conditions that affect the public’s health, this lag is even more pronounced for cancer pain than for noncancer pain.3 Much more research has addressed functional assessment, and how pain management influences function, in patients with acute or chronic noncancer pain than in those whose pain results from malignancy.4
As health care providers, we treat patients to make them “better.” How is “better” defined, and by whom? “Better” from the point of view of the practitioner, the patient, or society? Does “better” equate to less pain, increased physical functioning, decreased disability (as judged by a physical therapist), improved quality of life (as judged by the patient), or decreased cost of worker’s compensation charges and fewer health care visits (as judged by payers)? Does the same intervention that benefits one patient benefit a group of patients with similar conditions? How do we know whether it does? These are questions that the outcomes assessment movement is trying to address.
Outcomes research studies the results of medical care.5 It involves “the rigorous determination of what works in medical care and what does not” and states that “outcomes research, by informing the content of policy positions, payment rules, and practice guidelines, presumably solves both the problems of quality and cost that beset health care and does so by scientific rather than political means” (p. 1268).6 Outcomes research is the foundation of evaluation of the quality and costs of health care delivery. Adoption of an evidence-based approach to health care, exemplified by the Cochrane Collaboration,2,7,8 has been accompanied by a shift toward an emphasis on patient-centered health outcomes.9 This broadened perspective has heightened the need for tools to monitor and adjust treatment and to approach clinical decision making from a viewpoint that is evidence based and patient centered.10 The pressing need to know which treatments reduce chronic pain; which improve functional status (including return to work and social activities); whether they change pain intensity; and, in particular, which treatments are worth paying for has fueled the development of a number of instruments. These instruments are intended to capture in a simple, speedy, and robust fashion the health status of patients.11
This section discusses common terminology used in outcomes assessment and provides several examples of assessment tools used in measuring outcomes during the treatment of patients with chronic noncancer pain. Health assessments focus on three broad categories of measures: traditional biologic, general (or generic), and disease specific.12 Traditional biologic measures may be primary, such as morbidity and mortality, or surrogate, such as a decrease in blood pressure in patients given an antihypertensive drug. Measures used for patient-centered outcomes generally estimate persons’ health-related quality of life (HRQOL) and their ability to function and to do the things they want to do. These measures may be generic, evaluating overall health status, or disease specific, focusing on the effect of a given condition on a person’s life.
HRQOL assessment is the measurement or evaluation of the health of an individual or a patient. HRQOL may include biologic markers, but it emphasizes indicators of physical functioning; mental health; social functioning; and other health-related concepts, such as pain, fatigue, and perceived well-being.13 Concepts included in some commonly applied HRQOL instruments are presented in Table 106-1.
Domains Used in Health-Related Quality of Life Measurements
Domains | QWB | SIP | NHP | QLI | COOP | EQ-5D | DUKE | MOS SF-36 |
Physical functioning | X | X | X | X | X | X | X | X |
Social functioning | X | X | X | X | X | X | X | X |
Role functioning | X | X | X | X | X | X | X | X |
Psychological distress | | X | X | X | X | X | X | X |
Health perceptions (general) | | | X | X | X | X | X | X |
Pain (bodily) | | X | X | | X | X | X | X |
Energy/fatigue | X | | X | | | | X | X |
Psychological well-being | | | | | | | X | X |
Sleep | | X | X | | | | X | |
Cognitive functioning | | X | | | | X | | |
Quality of life | | | | | X | | | |
Reported health transition | | | | | X | | | |
Quality of life includes HRQOL but is a broader term that includes nonmedical aspects of life that reflect the aggregate impact of food, shelter, safety, living standards, and social and physical environmental factors.13
Patient-based outcome measures are indicators of patients’ evaluations of both changes in patient health status, including HRQOL and mortality, and the quality of health care. The importance of patients’ views has been increasingly recognized in health care.14 One might even argue that the increased interest in palliative care and pain control in recent years is the direct result of a power shift in which patients and their families—the consumers of health care—have much greater autonomy and power than under the previous disease-centered model of care.15 Clinicians’ taking patients’ views into account is associated with greater patient satisfaction with care,16 better compliance with treatment programs,17 and an increased likelihood of maintaining a continuous relationship during health care.18
The distinction between disease-based clinical investigation and patient-centered outcomes research is analogous to that between measures of efficacy and measures of effectiveness. In an ideal setting, such as a randomized, controlled clinical trial, the efficacy of a treatment may be derived from the dose-response relationship for a given physiologic effect assessed under well-controlled conditions. In controlled trials, the end points of interest are usually biologic measures, such as changes in blood glucose levels or blood pressure. However, equally important to practitioners and patients is the effectiveness of a treatment, which refers to the outcomes of this treatment when applied in typical practice settings, measured over the course of disease, and including measures that matter most to patients (i.e., patient-centered outcomes).19 Outcomes research is more likely to be generalizable to typical medical practice than are controlled clinical trials. Terminology commonly used in outcomes research is presented in Table 106-2.
Terms and Definitions Commonly Used in Outcomes Assessment
Term | Definition |
Item | A single question (e.g., “In general, how would you say your health is?”) |
Scale | A range of available responses to an item. Can be categorical (e.g., excellent, very good, good, fair, poor), be numerical, or consist of a visual analog scale |
Domain | Identifies a particular focus of attention (e.g., physical functioning, mental or general health, patient satisfaction with care) and may consist of the response to a single item or responses to several related items. May consist of one scale (a collection of related items) or multiple scales |
Instrument | A group of items used for the collection of desired data. May contain a single item or multiple items that may or may not be divided into domains |
Domain- or dimension-specific instrument | A one-scale instrument (e.g., the McGill Pain Questionnaire) |
Ceiling or floor effect | Indicates an instrument's inability to discriminate differences at the upper (ceiling) or lower (floor) end of its scale (e.g., a ceiling effect occurs when a patient who previously rated pain intensity as 10/10 now reports it as 12/10) |
Disease- or condition-specific tools | Instruments used exclusively for assessment of the health status of populations with a specific disease or condition (e.g., back pain, postherpetic neuralgia) |
Generic HRQOL tools | Instruments that estimate an individual’s overall health status that can be used to compare HRQOL among groups of patients with different diseases |
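The ceiling and floor effects defined in the table can be made concrete by counting how many respondents score at the extremes of a scale. The following is a minimal sketch only: the function name `floor_ceiling_proportions` and the sample ratings are hypothetical, and no particular cutoff for an "unacceptable" proportion is implied.

```python
def floor_ceiling_proportions(scores, scale_min, scale_max):
    """Return the fractions of scores at the bottom and at the top of a scale."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s <= scale_min) / n
    at_ceiling = sum(1 for s in scores if s >= scale_max) / n
    return at_floor, at_ceiling


# Hypothetical 0-to-10 pain intensity ratings from ten patients.
ratings = [10, 10, 9, 10, 8, 10, 7, 10, 10, 6]
floor, ceiling = floor_ceiling_proportions(ratings, scale_min=0, scale_max=10)
print(f"at floor: {floor:.0%}, at ceiling: {ceiling:.0%}")
```

When a large share of a sample already sits at the top of the scale, as in this invented example, the instrument cannot register any further worsening in those patients.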
Because the purpose of this overview is to present a few widely applied outcomes measurement tools and the context in which they are used, we next describe the criteria used to select one from among available instruments rather than how to create a new questionnaire.
Selection of a specific outcomes tool will depend on the population of interest and the ability of the measurement tool to detect changes within the domain of interest. The selection of an instrument consists of two phases. The first has to do with the condition(s) for which this instrument will be used; the second has to do with the psychometric properties of the instrument.
Choosing a domain-specific, condition-specific, or generic instrument depends on the aim of the study. If one specific domain is of interest, such as pain intensity or depression, a domain-specific instrument can be used (e.g., the McGill Pain Questionnaire [MPQ] or the Beck Depression Inventory). In general, a condition- (or disease-) specific instrument will have a narrow focus but will provide considerable detail in the area of interest. If the interest is in general HRQOL, comparison with different conditions, or comparison with healthy people, a generic instrument can be used. Generic and condition-specific HRQOL instruments can be used together to supplement the information collected.20 Using a condition-specific survey or module together with a generic scale may provide more insight into aspects of health that are not well measured by either type of instrument.21–23 Comparison of the impact of pain on health status with the impact of other chronic illnesses on general health status, for example, allows researchers to conduct trials of various treatments so as to make clinical decisions in medical practice and inform health care policy.12
Test–retest reliability: the extent to which the measure generates consistent results. How closely do the results of repeated applications agree with each other?
Internal reliability (quantified by Cronbach α): the consistency with which the items that make up the measure assess the same construct; it depends on the number of items and the degree of intercorrelation between them. A Cronbach α of 0.9 or higher is generally preferred for measurement in a single person, whereas a Cronbach α of 0.7 or higher is preferred for group measurement.27
Validity: the extent to which the instrument actually measures what it claims (i.e., the correspondence between what the instrument reports and reality)
Responsiveness: the ability of an instrument to detect changes, particularly clinically important changes, over time in individuals or in groups of subjects
Applicability: the appropriateness of the instrument’s use in the specific study population
Practicality: the likelihood that an instrument can be applied readily, without excessive burden to patient or investigator, and will produce data that can be easily analyzed and applied
Cronbach α: a coefficient of reliability (or consistency) used to measure how well a set of items (or variables) measures a single unidimensional latent construct. It typically ranges from 0 to 1. (For details, see http://www.ats.ucla.edu/stat/spss/fqq/alpha.html.)
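As an illustration of how Cronbach α is obtained, the sketch below applies the standard formula α = [k/(k − 1)] × (1 − Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical and are not drawn from any instrument discussed in this chapter.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Compute Cronbach alpha.

    item_scores holds one list per respondent, each containing that
    respondent's scores on the same k items.
    """
    if len(item_scores) < 2 or len(item_scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five respondents, three items on a 1-to-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # about 0.93 for this invented sample
```

By the thresholds given above, a value of this size would be acceptable for both group and individual measurement, although a sample this small is used here only to keep the arithmetic easy to follow.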
Of the many generic instruments available to assess HRQOL, four validated, widely used questionnaires stand out. Brief descriptions of these instruments are given in Table 106-3.
Generic Outcomes Assessment Tools
Name of Instrument | Internal Reliability (Cronbach α) | Cross-Validation Instruments | Number of Items | Number of Domains | Time to Complete (min) | References |
Nottingham Health Profile (NHP) | Cronbach α was reported as 0.77–0.85 for the first section and 0.44–0.86 for the second section in a sample of patients with OA | SIP, SF-36, COOP WONCA, EQ-5D | 37 | Six: physical abilities, pain, sleep, social isolation, emotional reactions, and energy level. A second section includes optional questions about work, social and sex life, interests and hobbies, and holidays | 10–15 | Hunt and Mc |
Medical Outcomes Study 36-item Short-Form Health Survey (SF-36) | Cronbach α in both general and chronic disease populations ranges from 0.78 to 0.93 | Oswestry Disability Index, SIP, NHP, EQ-5D, Social Maladjustment Schedule, WOMAC OA Index, Chronic Pain Grade Questionnaire | 36 | Eight scales of general health and functioning: physical functioning, role-physical (limitations in physical roles caused by health problems), bodily pain, general health, vitality, social functioning, role-emotional (limitations in emotional roles caused by health problems), and mental health | 10–15 | Tarlov et al.,32 Stewart,33 Ware,34 McHorney et al.,35 Stansfeld et al.,36 Grevitt et al.37 |
Sickness Impact Profile (SIP) | Cronbach α = 0.94; test–retest reliability r = 0.92 | NHP, SF-36, EQ-5D, MMPI | 136 | 12 | 20–30 | |
European Quality of Life (EQ-5D, Euro-QoL) | Test–retest reliability in stroke patients: κ = 0.63–0.80 | SF-36, NHP, COOP WONCA | 15 | Five dimensions: mobility, self care, usual activities, pain/discomfort, and depression/anxiety. The sixth item is a global evaluation of one’s own health using a VAS of 0–100 (worst imaginable health to best imaginable health) | A few | See also http://www.euroqol.org |
Pain, in general, and chronic and persistent pain, specifically, is a unique challenge to outcomes research because of the importance of subjective information. Unlike the majority of other medical conditions, chronic pain may not involve a distinct organ system, pathophysiologic process, or specific discipline. Although pain is characterized as a symptom, it is, in fact, a subjective experience, a perception.41 This perception not only depends on nociceptive transmission and modulation within the central nervous system but also is integrated with psychological, social, and other environmental factors.42 Physical functioning, work, family, and social relationships are usually impaired by chronic pain. Comorbid conditions, such as depression and anxiety, often accompany chronic pain.43 For these reasons, it is argued that the assessment of patients with chronic pain should be accomplished within a multidimensional framework.44 Assessment of chronic pain should provide clinicians with relevant information to formulate a treatment plan and allow for measurement of the outcome of treatment interventions. The generic HRQOL instruments discussed earlier are mostly epidemiologic tools and as such are able to measure change in large samples of patients. By design, they are not intended, nor are they sufficiently sensitive, to measure changes in a single subject. Furthermore, these instruments do not provide information on items frequently assessed in pain management, such as solicitous responses,45,46 coping ability,47,48 fear avoidance,49–51 and the extent of disablement from pain.
Many instruments are used to assess the impact of pain on patients’ lives. Ideally, the instrument should provide relevant information to all clinicians within an inter- or multidisciplinary team, have a low respondent burden, and be sensitive enough to detect changes at both group and individual levels. Widely used methods to assess pain and its influence range from domain to condition specific. Some of the most frequently used tools are presented next.
The three most commonly used methods to assess pain intensity are verbal rating scales, visual analog scales, and numerical rating scales (Table 106-4). Von Korff et al.52 caution that multiple factors influence patients’ pain reports, including time of day. Aggregated pain measures, which are scores created from multiple measures, have been shown to be more reliable and more sensitive to treatment effects than single items.53 For instance, one can average three concurrent 100-mm visual analog scale ratings of current, average, and best pain.54 A composite measure shown in cancer pain patients to have high internal consistency (Cronbach α >0.8) consists of an average of ratings on a 0-to-10 scale of current, least, and average pain.55 Jensen and colleagues56 report that individual 0-to-10 pain intensity ratings have sufficient psychometric strength to be used in chronic pain research, especially in studies with large sample sizes, but composites of 0-to-10 ratings may be more useful when maximal reliability is necessary (i.e., in studies with small sample sizes or in the monitoring of an individual patient).
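The composite described above, an average of 0-to-10 ratings of current, least, and average pain, amounts to a simple mean. The sketch below shows the arithmetic only; the function name `composite_pain_score` and the example ratings are hypothetical.

```python
def composite_pain_score(current, least, average):
    """Average three 0-to-10 ratings into one composite intensity score."""
    ratings = (current, least, average)
    for rating in ratings:
        if not 0 <= rating <= 10:
            raise ValueError(f"rating {rating} is outside the 0-to-10 scale")
    return sum(ratings) / len(ratings)


# Hypothetical ratings reported by one patient at a single visit.
score = composite_pain_score(current=6, least=3, average=6)
print(score)  # 5.0
```

Because the composite pools several ratings, a single unusually high or low report moves the score less than it would move any one item, which is consistent with the greater reliability attributed to aggregated measures.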
Pain Intensity Scales
Scale | Description |
Verbal rating scales (VRSs) | Lists of adjectives describing different levels of pain intensity (e.g., 0 = no pain, 1 = slight pain, 2 = moderate pain, 3 = severe pain) |
Visual analog scales (VASs) | Lines that are usually 100 mm long and represent the continuum of the symptom being rated, with labels at either end to represent the extremes of the symptom (e.g., 0 = “no pain,” 100 = “pain as bad as it could be”) |
Numerical rating scales (NRSs) | Ascending sequences of numbers, each representing increasing levels of pain intensity (e.g., 11-point scale in which 0 = no pain, 10 = worst possible pain) |
Verbal rating scales (VRSs) are positively and significantly related to other measures of pain intensity.26 Jensen and colleagues57 reported on the potential clinical utility of classifying pain as mild, moderate, or severe based on the impact of pain on quality of life. There is a nonlinear relationship between pain intensity and pain interference. Pain intensity begins to have a serious impact on functioning when it reaches a specific threshold: about 5 on a 0-to-10 scale in patients with cancer pain.55 To explore in greater detail the relationship between pain severity and interference in patients with cancer pain, Serlin and colleagues55 administered the Brief Pain Inventory (BPI) to a total of 1897 patients from numerous sites in the United States, France, China, and the Philippines. In this classic study, they gathered self-reported data on pain severity as well as interference by pain with enjoyment of life, activity, walking, mood, sleep, work, and relations with others. These four diverse populations had “fairly consistent patterns relating pain severity to pain interference.” Statistical analyses showed that pain severity on a 0-to-10 verbal numerical rating scale could be stratified according to the degree of interference it produced as mild (1–4), moderate (5–6), and severe (7–10).
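The stratification reported by Serlin and colleagues can be written as a small lookup. This is a sketch under stated assumptions: the function name `pain_severity_category` is hypothetical, a rating of 0 is labeled "none" following the numerical rating scale anchor in Table 106-4, and the cutoffs were derived in patients with cancer pain, so they may not transfer unchanged to other populations.

```python
def pain_severity_category(rating):
    """Map a 0-to-10 pain rating to the interference-based strata above."""
    if not isinstance(rating, int) or not 0 <= rating <= 10:
        raise ValueError("rating must be an integer from 0 to 10")
    if rating == 0:
        return "none"  # assumption: 0 = no pain, as on the NRS
    if rating <= 4:
        return "mild"  # 1-4
    if rating <= 6:
        return "moderate"  # 5-6
    return "severe"  # 7-10


for rating in (0, 4, 5, 7):
    print(rating, pain_severity_category(rating))
```

The boundary between 4 and 5 corresponds to the threshold noted above at which pain intensity begins to have a serious impact on functioning.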
Visual analog scales (VASs) are simple tools to assess intensity and other dimensions of pain, such as anxiety, efficacy of treatment, and emotional responses26,58,59 (Fig. 106-1). Patients mark the scale at a point that represents the severity of their pain at a specified time point or within a well-defined interval (e.g., the past 24 hours). Variations of these techniques request that patients circle a number from 0 to 10 or place a mark through one of these numbers. VASs are more sensitive and precise than descriptive scales. They are also easy to use and interpret; however, they are limited to expressing only one dimension of the complex experience of pain. Patients may find it difficult to conceive of the worst pain imaginable, or they may report their pain as being outside the 0-to-10 limits, saying that it is a 20, for example.
The validity of VASs is supported by their positive relations to other measures of pain intensity.24,60 They are sensitive to treatment effect and are distinct from measures of other subjective components of pain.52
Numerical rating scales (NRSs) were demonstrated to provide sufficient levels of discrimination for patients with chronic pain to describe their pain intensity.61 Similar to VRSs and VASs, NRSs demonstrate positive and significant correlations with other measures of pain intensity.26,60
The McGill Pain Questionnaire (MPQ) provides estimates of the sensory, affective, and evaluative dimensions of pain.62 It is one of the most frequently used instruments for pain measurement and is considered useful for evaluating pain treatments and as a diagnostic aid.26,63–66 In addition to collecting information about diagnosis, drug therapy, pain and medical history, and other symptoms and modifying features, the MPQ contains a list of words that describe pain, divided into groups pertaining to the sensory, affective, and evaluative dimensions of the pain experience.