Interpreting the Medical Literature





Key Points





  • The medical literature can be accessed in many ways, ranging from the primary literature as indexed in sources like PubMed, to sources catering to medical professionals, to the lay press and social media.



  • The quality of medical literature varies widely, as does the interpretation or distillation into secondary sources of information.



  • The traditional “evidence pyramid” no longer effectively reflects the diversity and impact on clinical care of the rapidly evolving body of medical knowledge.



  • Understanding the design of a research study is crucial to understanding the strength and implications of its conclusions.



  • Different research study designs have assessment tools, maintained by the EQUATOR Network, to assist in understanding the quality of the study.



  • Statistical analysis is commonplace in original research manuscripts, but the interpretation of statistical results, particularly the probability or “P-value,” is widely misunderstood and misused.



  • Although the hallmark of the primary medical literature is a rigorous peer review process, there are serious pitfalls that may be difficult to detect, including predatory journals and reviewer misconduct.





Introduction


Reading and interpreting the medical literature is a critical skill for any practicing anesthesiologist. The numerous scientific advances made over the last few decades make it imperative for anesthesiologists to understand how to read, interpret, and apply the medical literature to the clinical scenarios that arise in daily practice. Critical appraisal, a concept that first arose in the early 1980s, is “the process of carefully and systematically examining research to judge its trustworthiness, and its value and relevance in a particular context.” The idea that clinicians “should practice medicine in a way that combines research evidence with clinical skills and patient values and preferences” led to the creation of a new approach to caring for patients called “evidence-based medicine,” a term first coined in the early 1990s by Gordon Guyatt at McMaster University.


In response to this new approach to patient care, where the most recent published evidence could be used to support clinical decisions in lieu of the usual approach of deferring to a physician’s seniority, intuition, or prior experiences with similar patients, the Journal of the American Medical Association published a series of articles entitled “Users’ Guides to the Medical Literature” in the early 1990s that has since been formatted into a textbook by the same name. This text is an excellent resource for those who are interested in effectively incorporating the systematic evaluation of the medical evidence into their daily clinical practice. We do not purport to write as complete a chapter on interpreting the medical literature as the existing resources that are already available to the lay reader.


This chapter is, however, intended to provide some basic information about how medical evidence is created and published, to offer some useful tools for evaluating the medical literature, and to highlight some of the pitfalls to avoid when sorting through the profusion of information available to both physicians and the lay public. The previous chapter provided an introduction to the different study designs used in clinical research and their relative strengths and weaknesses. This chapter aims to place the key points raised in the previous chapter within the context of how research can be interpreted and used by a clinical anesthesiologist.


Intent of the Chapter


The goals of this chapter are: (1) to provide a brief overview of how a research manuscript is treated from the moment of submission to its publication date; (2) to provide a practical guide for accessing, interpreting the quality of, and implementing the knowledge gained from reading the medical literature; and (3) to identify and avoid the pitfalls of indiscriminate use or the misuse of the published evidence.




The Publication Process


Types of Journals


With the advent of the Internet, researchers and consumers of research have an overwhelming array of journals to consider for submitting and reading original work. A journal’s focus and target audience can vary. Some journals are known as general medical journals because they include work from many different fields within medicine. Examples include The New England Journal of Medicine (NEJM), Journal of the American Medical Association (JAMA), The BMJ (formerly known as the British Medical Journal), and The Lancet. Most medical journals, however, focus on a particular medical specialty, or they may highlight particular topics such as research methods or health policy.


Journals can be accessed as printed and bound periodicals requiring an active print subscription, through a designated website offering a selection of free articles or articles requiring payment to view, or often by using a combination of the two delivery methods. The quality and reputation of journals can vary greatly: some require little to no peer review and an upfront payment in exchange for quick online publication of a manuscript of questionable quality (see Predatory Journals, later). Others maintain extremely high standards for manuscript acceptance and editorial review. Publication in a medical journal with an established history as a traditional print journal, even while it maintains a distinct web presence, is usually an indicator that a published manuscript has been subjected to peer review and represents a worthwhile contribution to the medical literature.


A number of professional societies in anesthesiology publish peer-reviewed journals that are among the most reputable in the field. These include the American Society of Anesthesiologists (Anesthesiology), the Royal College of Anaesthetists/The College of Anaesthetists of Ireland/the Hong Kong College of Anaesthesiologists (British Journal of Anaesthesia), and the International Anesthesia Research Society (Anesthesia & Analgesia).


Types of Journal Articles


Although each journal’s priorities for content may vary, for the most part, journals publish similar types of articles. These article types can be grouped into the following broad categories: original research, review articles, brief reports or letters, case reports, and editorials.


Original research is the most familiar and most common type of article that gets published. Original research usually consists of a scientific manuscript that reports the full results from a research study and can represent any of the study designs that were described in Chapter 89.


Review articles summarize the existing scientific research on a given topic and are a good way for readers to quickly familiarize themselves with the current evidence within an area of research. Review articles are comprehensive, typically written by experts in the field, and are often solicited by the editors of a journal. The authors will usually frame published research within the context of other contemporaneous works and the current and future directions of the research topic being reviewed.


Brief reports and letters provide concise research reports that address a timely issue or spur further research if published before a full original research manuscript has been submitted. Letters are also an opportunity for readers to submit arguments extending or rebutting articles that were previously published in that journal.


Case reports allow researchers or clinicians to share specific examples of unusual or unexpected clinical findings in a single patient that may be instructive to a broader readership. A case series is similar to a case report but describes similar clinical phenomena across multiple patients.


Editorials are essays that encompass the opinion of the author about a subject, usually of topical relevance, or highlight the important scientific contributions of an original research article that was published in the same journal issue. Similar to review articles, editorials are usually solicited by the journal editor, written by experts in the field, and provide important context with which to frame the original research article.


The Peer Review Process


The peer review process is an important component of publishing original research. Most respected journals, whether web-based or traditional print journals, will have a robust peer review process in place. Once an original research manuscript has been submitted by the authors, the journal editors will usually make a rapid decision about whether the study topic is appropriate for their readership. Authors will either receive notification that their submission was rejected without review, or that the manuscript has been sent out for review to at least two other experts in the field. Reviewers are asked to rate various aspects of the manuscript, including its readability, novelty, methods, validity of results, and potential impact on the field. They will often provide constructive feedback to the authors on ways the study can be substantially improved. Reviewers will then make a recommendation to the journal editors whether the study should be accepted, revised and resubmitted, or rejected. The editors will consider the reviewers’ comments in making the final decision about the manuscript’s disposition.


Although different journals may have different terminologies for these final decisions, in general, they fall into the following categories: accept as written, conditional acceptance (i.e., accept with minor revisions or accept with major revisions), revise and resubmit, or reject. Acceptance of a manuscript without any changes is extremely rare. Conditional acceptance, though frequently considered to be a positive outcome, is not a guarantee of acceptance. The editors still retain the right to reject the manuscript unless the manuscript’s authors have satisfactorily addressed any concerns that are deemed important enough to be raised in the outcome letter. More common is a decision to revise and resubmit. The editor may require substantial changes to the original manuscript and usually requests a new manuscript submission that clearly highlights within the text what changes have been made in response to the outcome letter. Most of the time, the authors are also required to submit an accompanying document that responds to each of the points that may have been raised by the editor and reviewers during the peer review process. Finally, the editors can still decide to reject a manuscript after sending it out for review.


Once a manuscript has been through the peer review process and deemed acceptable for publication, the journal will then format the entire manuscript to fit the journal’s style. This usually includes recreating or reformatting the tables and figures that were submitted with the manuscript, as well as detailed copyediting for grammar, punctuation, and clarity. Proofs of the formatted article, which show how the article will actually appear when it is finally printed, are sent to the author for final approval. Simultaneously, the journal will select the print issue that will feature the accepted article and decide whether the manuscript would benefit from an accompanying editorial. Although print journals usually require a lead time of several months after an original article has been accepted before it is actually published, a journal will often set an earlier date for online publication, typically referred to as “e-pub ahead of print.” Expediting the dissemination of original research by publishing electronically on a journal’s website allows readers to obtain the earliest access to interesting and timely research.




Accessing the Medical Literature


Primary Literature


In the past, readers needed to have a journal subscription or access to a medical library to read published articles. In the academic setting, it was very common for attending physicians to photocopy important articles and distribute them to their trainees. As in so many other arenas, access to research articles has been transformed with the arrival of the Internet, and most articles are now easily accessible online, either directly through the journal’s website, or through indexed search engines. Primary literature refers to these original research articles that are authored by the researchers who performed the study and are published in peer-reviewed journals.


Most readers are probably familiar with PubMed, a free resource maintained by the National Center for Biotechnology Information at the US National Library of Medicine, which is located at the National Institutes of Health. PubMed provides access to MEDLINE, an online database that contains more than 28 million references to journal articles in life sciences, with a concentration on biomedicine. The database includes medical literature published from 1966 to the present day, with citations from more than 5200 journals in approximately 40 languages worldwide, with new articles being added daily. Search results include a list of article citations with links to the electronic full-text article if available. Although PubMed is most frequently used to access the primary literature, this resource is useful for accessing the secondary literature as well, which is described in the following section. Searching PubMed is free of charge, and many articles are available for viewing without a subscription or an academic institutional affiliation, including through the archival of full-text articles in PubMed Central. PubMed provides various mechanisms to access articles that are not freely available on the Internet, although fees may apply.
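As a rough illustration of the search described above, PubMed can also be queried programmatically through the NCBI E-utilities service. The sketch below only builds a search address and makes no network request. The function name, the default result limit, and the query string are assumptions made for illustration, and the parameter names should be checked against the current NCBI documentation before use.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities "esearch" service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Return an esearch URL that asks PubMed for matching article IDs (PMIDs)."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # query text, same syntax as the PubMed search box
        "retmax": max_results,  # maximum number of PMIDs to return
        "retmode": "json",      # request a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical query, chosen only for illustration.
example_url = build_pubmed_search_url("balanced crystalloids AND saline", max_results=5)
print(example_url)
```

Opening the printed address in a browser should return a list of matching identifiers, which can then be looked up individually on the PubMed website.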


Secondary Literature


In addition to primary sources, physicians may rely on other resources to remain informed about recent advances in their field. The term secondary literature is used to refer to written summaries of the primary literature that help synthesize or evaluate primary sources for the purpose of dissemination and incorporation of evidence-based medicine into practice. These articles summarize the primary literature to varying degrees and with variable quality, depending on their purpose. Systematic reviews and metaanalyses are, in and of themselves, valued as high-quality research and important contributions to the medical literature. For example, the Cochrane Database of Systematic Reviews is a well-known and highly regarded resource for systematic reviews in health care. Readers should be aware, however, that narrative reviews differ from systematic reviews in that they are not obliged to provide unbiased information reflecting the totality of available knowledge. Narrative reviews can be useful, and an efficient source of information particularly when written by authors who might be expected to have an expert command of the available literature, but fundamentally differ from systematic reviews because they are not as rigorous or comprehensive in their approach to manuscript selection, which must be considered in their interpretation. Types of reviews and metaanalyses are further discussed in Chapter 89.


Clinical practice guidelines fall into the category of secondary literature as well. These are typically written by professional groups or government agencies to help guide clinician decision making and will often indicate the level of evidence supporting the practice recommendations that are being made. Other more filtered (yet still evidence-based) distillations of the existing research exist to facilitate clinical decision making by practicing clinicians at the bedside. Examples of the latter include websites like UpToDate and WebMD.


Traditional and Social Media


Finally, the medical literature can be indirectly accessed via both traditional and social media. Traditional media may include press releases issued by the author’s institution or coordinated by the print journal to maximize the impact and newsworthiness of an important scientific advance. These press releases can often lead to articles in leading newspapers or newsmagazines that refer to the original article if they are considered of interest to the general public. However, traditional media can also inadvertently or deliberately present an inaccurate or sensationalized version of the article’s conclusions, potentially not tempered by the study’s limitations. Media reports of peer-reviewed articles are not necessarily written in collaboration with the authors of the article, and even though quotations may be excerpted from press releases and other sources, conclusions can be misrepresented. If a layperson’s interpretation of a scientific finding is to be used for clinical practice, the validity of the conclusions presented must be verified against the original article itself.


Many scientists and clinicians are now keeping current with the latest evidence through social media sites such as Twitter and Facebook, or individual blogs, where leading scientists, researchers, and clinicians can link to original articles and provide their own spin or commentary on the relative merits or shortcomings of the most recently published research. Other less reliable, but easily accessible, resources include crowd-sourced websites such as Wikipedia. However, the quality of the information presented on individual blogs or crowd-sourced sites depends significantly on the credentials of the individuals contributing to the website. Although the democratization of the medical literature has accelerated the dissemination of new research to both scientific audiences and to the public, it is still important for individual clinicians to understand how to approach the literature independently to separate the propaganda surrounding the latest research from the actual strengths and weaknesses of the study itself.




Assessing the Methodology of a Study


Understanding how and why a study was done is essential to understanding how its findings fit in with the progress of medical science. The quality of clinical research is highly dependent on many design choices that were made long before recruitment of the first patient, or collection of the first data record, and a familiarity with clinical research methodology will help all medical professionals critically evaluate the applicability of published findings to their own practice decisions.


As a first pass, it is easiest to separate research designs into two bins: observational and interventional. Within those designations are subcategories and variants with important implications for study quality. Furthermore, some studies such as metaanalyses incorporate aspects of both. Contrary to older notions of research interpretation, a randomized trial does not always produce better evidence than an observational (cohort) study, and a metaanalysis, far from being the “pinnacle,” is only as good as the evidence on which it rests.


The “Evidence Pyramid” and its Evolution


Historically, those interested in evaluating study quality have been referred to an “evidence pyramid,” where each step up the pyramid represents a step closer to truth (or quality, or best evidence) (Fig. 90.1).




Fig. 90.1


The Evidence Pyramid.


This pyramid produces a striking visual representation emphasizing the weaknesses—and relative abundance—of expert opinion and observational case studies, the importance of randomized controlled experimental trials, and the primacy of more recently developed summative methods, including systematic reviews and metaanalyses, which comprise the apex of evidentiary quality. But this is overly simplistic in today’s world of complex patients with multiple intersecting conditions: the pyramid, with its unlabeled y-axis that may be thought of as “risk of bias” or “internal validity,” rather than a progress toward “truth,” is due for modification.


Two attempts to refine this pyramid are shown in Figs. 90.2A and B. Fig. 90.2A highlights the variability in quality within a single type of study design, and the dependence of summative methods like systematic reviews upon the quality of existing evidence, whereas Fig. 90.2B avoids the appearance of a hierarchy altogether, emphasizing that data derived from different methods are necessary to provide a strong foundation for scientific knowledge. Another graphical reconceptualization of the way scientific evidence is generated was proposed by Walach and colleagues in 2006 (see Fig. 90.2C), and recently updated toward a “matrix” concept: their “Circle of Methods” provides a more granular categorization that differentiates between efficacy and effectiveness, a crucial concept in the evolution of medical knowledge toward changing care for broad populations of patients.




Fig. 90.2


Proposed refinements of the “Evidence Pyramid,” reflecting the lack of consensus in the scientific community about how to visually encapsulate the relationships among methods of generating evidence. (A) A refinement proposed by Murad and colleagues. (B) The Greek Temple model. (C) Circle of Methods.

[A] Redrawn from Murad MH, Asi N, Alsawas M, Alahdab F. New evidence pyramid. Evid Based Med. 2016;21(4):125–127. [B] Redrawn from Salvador-Carulla L, Lukersmith S, Sullivan W. From the EBM pyramid to the Greek temple: a new conceptual approach to guidelines as implementation tools in mental health. Epidemiol Psychiatr Sci. 2017;26(2):105–114. [C] Redrawn from Tugwell P, Knottnerus JA. Is the evidence pyramid now dead? J Clin Epidemiol. 2015;68(11):1247–1250.


Perhaps the issue is that no single graphic can capture the unique strengths and limitations of the common experimental designs upon which the advancement of medical science relies. The perfect study would have no bias and a high level of external validity, reflecting some scientific truth that is universally applicable. This is probably impossible to do. Instead, when choosing among different clinical research designs, researchers are faced with many trade-offs between internal and external validity, feasibility (including costs), and risk of bias (Fig. 90.3).




Fig. 90.3


The relationships among sample size (often implying higher costs), external validity, and risk of bias produce trade-offs in study design. Here, bubble size is proportional to the typical number of participants in a particular type of study (larger bubble indicates more participants).


Basic Research Designs


Trade-offs between observational and interventional designs are discussed more fully in Chapter 89. Briefly, traditional observational studies include cohort studies, case-control studies, and cross-sectional studies, which are unified by their non-interventional nature. This results in risk of bias, which can be mitigated (but not eliminated) by sensible experimental design and analytic choices. Interventional trials are the most commonly used method of generating data to demonstrate causal relationships. While observational studies are subject to confounding by observed or unobserved variables, interventional trials take variable advantage of features meant to minimize bias: use of controls, randomization, and blinding. Understanding the implications of experimental design choices in these three areas, which are discussed in Chapter 89, underpins the assessment of trial quality.


As mentioned earlier in this chapter, summative studies like systematic reviews and metaanalyses offer another method of objectively summarizing available evidence and are discussed further in Chapter 89 . Just as a summary of medical evidence presented on a layperson’s news website must be approached with caution, a summary of evidence via metaanalysis or systematic review nonetheless glosses over what may be important differences in the quality and methodology of the underlying studies, and is highly reliant on the skill and thoroughness of its authors. Far from being an easy way to understand “scientific truth” in the literature, summative methods of generating evidence hold as many pitfalls as their underlying trials. However, in skilled hands and with the right underlying materials, systematic reviews and metaanalyses can provide important evidence, which could not feasibly be generated any other way.


Big Data and Pragmatic Clinical Trials


“Big data” studies have a unique and evolving place in the continuum of evidence. With the advent and widespread adoption of electronic medical records, enormous quantities of data are recorded every day during the routine provision of clinical care. This has led to two major evolutions in clinical research: the large secondary data cohort study and its interventional correlate, the pragmatic clinical trial.


Traditionally, data collection decisions for prospective cohort studies were based on the research query itself: for example, a study of lung function over time would obtain annual formal pulmonary function tests. All the anticipated necessary variables would be collected—height, weight, pulmonary function test results, chest x-rays, medication lists, exercise habits, dust mite exposure, and a detailed tobacco use history. From these variables, a (potentially extensive) list of confounding variables would be selected and, in mathematical modeling, accounted for, thus producing an estimate of the “independent effect” of the primary predictor, for example, air pollution exposure on the outcome of lung function.
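To make the idea of accounting for a confounder concrete, the following sketch compares a crude risk ratio with one adjusted for smoking status. It substitutes simple Mantel-Haenszel stratification for the regression modeling a real cohort study would more likely use, and every count in it is invented for illustration.

```python
# Invented counts for illustration only: (events, participants) by exposure
# to high air pollution, split by a single confounder (smoking status).
strata = {
    "smokers":     {"exposed": (40, 80), "unexposed": (10, 20)},
    "non_smokers": {"exposed": (2, 20),  "unexposed": (8, 80)},
}


def crude_risk_ratio(strata):
    """Risk ratio that ignores the confounder by pooling all strata."""
    exp_events = sum(s["exposed"][0] for s in strata.values())
    exp_total = sum(s["exposed"][1] for s in strata.values())
    unexp_events = sum(s["unexposed"][0] for s in strata.values())
    unexp_total = sum(s["unexposed"][1] for s in strata.values())
    return (exp_events / exp_total) / (unexp_events / unexp_total)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio adjusted for the confounder by Mantel-Haenszel weighting."""
    numerator = 0.0
    denominator = 0.0
    for s in strata.values():
        exp_events, exp_total = s["exposed"]
        unexp_events, unexp_total = s["unexposed"]
        stratum_total = exp_total + unexp_total
        numerator += exp_events * unexp_total / stratum_total
        denominator += unexp_events * exp_total / stratum_total
    return numerator / denominator


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude risk ratio:    {crude:.2f}")     # 2.33
print(f"adjusted risk ratio: {adjusted:.2f}")  # 1.00
```

In these invented data the exposure has no effect within either smoking stratum, so the apparent doubling of risk in the crude estimate comes entirely from the confounder.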


Collecting extensive data is expensive, though, and requires time and resources. What if dust mite exposure and exercise habits are not hypothesized to be particularly important—could those be eliminated? What if, for no cost and only the time of an analyst to extract the data from an electronic medical record, we could use current tobacco use as a surrogate for a thorough tobacco use history, instead of hiring a clinical research nurse to collect the information? And if we felt that the height, weight, pulmonary function tests, and smoking history collected in the course of normal clinical care would be of sufficient quality, perhaps we could estimate air pollution exposure levels based on proximity to major roads—and, by eliminating the need to directly collect any specific data on our population, expand our population target to all those with valid values of these variables in our electronic medical records system? Thus, the “big data” study is born.


A more generic term for “big data” is “secondary data”: data collected for purposes other than the study of a researcher’s intended question. There are important fundamental differences between a traditional cohort study and the approach described here. Inherent in the compromise is a move from detailed and specific characterization of each participant to a more diffuse idea of individual participant characteristics, and a move from a small (and likely specific) population, compromising external validity, to one that is large and generalizable. Some phenomena simply cannot be studied any other way because they involve a subtle effect or rare outcome requiring large populations to show an association that exceeds our currently acceptable margins of uncertainty in strength, hence the term “big data.” Some phenomena are particularly suited to study using this approach, because (acceptably) high-fidelity data are collected in the course of routine clinical care. One such example might be studies of association between postoperative respiratory adverse events and intraoperative mechanical ventilation parameters. Studies on the epidemiology of medical care, or its financial costs, are highly dependent on the availability and quality of secondary data.
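The claim that subtle effects and rare outcomes demand large populations can be made tangible with a standard sample size approximation. The sketch below uses the common normal-approximation formula for comparing two proportions; the function name, the two-sided significance level of 0.05, the power of 0.80, and both risk scenarios are assumptions chosen for illustration rather than values from any study.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(risk_a, risk_b, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect risk_a vs risk_b.

    Uses the normal-approximation formula for two independent proportions
    with a two-sided test. The alpha and power defaults are assumed
    conventions, not values taken from the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = risk_a * (1 - risk_a) + risk_b * (1 - risk_b)
    return ceil((z_alpha + z_power) ** 2 * variance / (risk_a - risk_b) ** 2)


# Hypothetical scenarios, chosen only to contrast a common and a rare outcome.
common_outcome = participants_per_group(0.20, 0.10)   # large absolute difference
rare_outcome = participants_per_group(0.020, 0.015)   # subtle difference, rare event
print(common_outcome, rare_outcome)
```

Under these assumptions, the rare-outcome scenario requires many times more participants per group than the common-outcome scenario, which is the practical reason such questions are often approachable only with secondary data.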


There is a growing appreciation that the data collection infrastructure that supports the provision of daily medical care can also be used to support large-scale pragmatic clinical trials. Although the philosophy of a pragmatic trial was described before the widespread implementation of electronic medical records, these studies have been greatly facilitated by the increase in availability of secondary data. In contrast to the typical randomized controlled trial, a pragmatic trial seeks to avoid restrictive inclusion criteria and highly protocolized care, more closely recapitulating the way medical care is provided to individual patients. Pragmatic trials are typically categorized as effectiveness, rather than efficacy, trials. This means that findings may have excellent external validity (applicability to a broad population of patients and settings). But since the lack of protocolization may introduce random or nonrandom variability in care, these trials are typically very large: thousands or tens of thousands of participants. Pragmatic trials may also be more vulnerable to risk of bias than traditional randomized trials, since it may not be practical or possible to blind clinicians and/or participants, in addition to challenges introduced by using secondary data (described earlier).


One recent pragmatic trial example relevant to anesthesiology is the SMART trial, published in NEJM, which randomized 15,802 critically ill patients to receive normal saline (0.9% sodium chloride) or a balanced salt solution. The pragmatic features in this trial included: a cluster-randomized design, where the ICU in which care was provided determined which solution the patient received (instead of individual randomization, which introduces additional protocol adherence challenges); use of the electronic medical record to prompt ordering providers to consider relative contraindications and, if none were present, follow the protocol (rather than having study interventions delivered only by study personnel); and outcome and adjustment variable collection by the electronic medical record (i.e., use of secondary data). Importantly, both interventions (the use of saline versus a balanced solution when intravenous crystalloid solution is required) are standard of care, and the requirement for informed participant consent was, in this trial, waived. The target sample size was achieved in fewer than 2 years of enrollment, with less than 1 year between completion of enrollment and publication. Concurrently, a second complementary trial, of saline versus balanced crystalloid in noncritically ill adults receiving intravenous fluids in the emergency department (SALT-ED trial), was also run, enrolling over 13,000 patients with a similar randomization scheme and data collection methodology. These two trials substantially contribute to a long-running debate in medicine, and are excellent examples of pragmatic clinical trials.
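The difference between cluster and individual randomization can be sketched in a few lines. In the example below, treatment is assigned once per unit and every patient inherits the assignment of the unit providing care. The unit names, arm labels, seed, and function names are hypothetical, and the sketch is not meant to reproduce the allocation scheme actually used in the SMART trial.

```python
import random

# Hypothetical units and arms, named only for illustration.
ICUS = ["ICU_A", "ICU_B", "ICU_C", "ICU_D"]
ARMS = ["saline", "balanced"]


def assign_clusters(clusters, arms, seed):
    """Randomly assign each cluster to one arm, keeping arm sizes balanced."""
    rng = random.Random(seed)   # seeded so the allocation is reproducible
    shuffled = list(clusters)
    rng.shuffle(shuffled)
    # Deal the shuffled clusters out to the arms in turn.
    return {cluster: arms[i % len(arms)] for i, cluster in enumerate(shuffled)}


def fluid_for_patient(patient_icu, assignment):
    """A patient's fluid is determined by the unit, not by the patient."""
    return assignment[patient_icu]


assignment = assign_clusters(ICUS, ARMS, seed=1)
print(assignment)
print(fluid_for_patient("ICU_A", assignment))
```

Because no per-patient randomization step exists, clinicians in a given unit follow a single default, which is the protocol adherence advantage the paragraph above describes.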


Evaluating the quality of a study based on data collected for other purposes is complex, challenging, and beyond the scope of this chapter. Fundamental differences in bias, generalizability, and even primary findings can be wrought by small and potentially undetectable design or analytic choices, such as geographically or socioeconomically limited populations; systematic problems in data quality resulting from misaligned incentives; inclusion or exclusion (by choice, or because of unavailability) of key confounding variables; presence of missing data and its treatment; statistical coding errors; and so on. Nevertheless, secondary data research has supported preliminary hypotheses that have launched innumerable research inquiries generating measurable improvements in human health, has allowed us to define the scope and costs of the medical care provided today, offers the opportunity to study research questions that would otherwise be completely inaccessible because of cost or ethical barriers, and holds enormous promise in the burgeoning era of “personalized medicine.”
