OMICS
Novel large-scale and data-rich technologies have been developed to examine various fields of biology including genomics, transcriptomics, proteomics, and metabolomics. These new approaches have had a dramatic impact on our understanding of various biological processes and have revealed complex interactions among them. Integrating the analyses of these datasets, to gain insights that may not be revealed by analyzing each one individually, has gained widespread interest and is beginning to impact a number of medical disciplines including critical care.
Omics refers to approaches that globally assess a specific class of biological molecules in a cell, tissue, or biological fluid. For example, genomics refers to the study of the structure and function of the entire genome, which comprises all of the DNA within a cell. Likewise, study of the transcriptome (RNA transcripts in cells or tissues), proteome (proteins in cells or tissues), and metabolome (small metabolites in cells, tissues, or bodily fluids) gives rise to the fields of transcriptomics, proteomics, and metabolomics. In addition, epigenomics is the study of global epigenetic changes in cells or tissues. High-throughput, high-dimensional study of the genome, transcriptome, proteome, metabolome, and epigenome has produced a massive amount of interconnected data from many biological systems and processes. However, the sheer volume and complexity of these data make analysis and interpretation extremely challenging. Increasingly sophisticated bioinformatics tools and computational analyses are required to perform studies using these technologies. These “omic” approaches may provide better insight into highly complex diseases, such as sepsis, trauma, and other illnesses with multiorgan dysfunction, and this insight could result in improved diagnostic tests and novel therapies. In the following sections, these various “omics” disciplines will be discussed in relation to critical illness and injury.
GENOMICS
Genomics is the study of the structure and function of DNA within cells. Genomics includes efforts to determine nucleotide sequences, to conduct fine-scale genetic mapping, and to analyze interactions between loci that occur within the genome. The key technology driving genomics is high-throughput DNA sequencing combined with bioinformatics to analyze the large volumes of data. High-throughput DNA sequencing technology facilitated the sequencing of the entire human genome, in addition to the genomes of many other organisms. These data revealed that many sites in the human genome are variable and there are numerous differences in DNA sequences between individuals.
The most frequent type of genetic variation is a single nucleotide polymorphism (SNP) that results from a nucleotide substitution. Variations may also arise from insertions or deletions of DNA fragments or from the presence of a variable number of tandem repeats (VNTRs) of short, repetitive DNA sequences. In some individuals, the differences in sequence are large (>1 kilobase), resulting in alterations in DNA copy number. Such variants are called copy number variants (CNVs). CNVs are relatively common in human genomes and contribute significantly to human genetic variation. Given that individuals have two copies of each gene, an individual is either heterozygous or homozygous for one or the other variant whether it is an SNP, VNTR, or CNV.
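The variant classes and zygosity states described above can be illustrated with a minimal Python sketch. The function names and the toy alleles are hypothetical, chosen only for illustration; the 1-kilobase cutoff is the one quoted in the text, and VNTRs are not handled here.

```python
# Illustrative sketch only: function names and example alleles are hypothetical.
CNV_MIN_BP = 1000  # size cutoff quoted in the text (>1 kilobase)


def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant by comparing a reference allele with an alternate allele."""
    size_change = len(alt) - len(ref)
    if size_change == 0:
        # Same length: a single-base change is an SNP.
        return "SNP" if len(ref) == 1 and ref != alt else "other"
    kind = "insertion" if size_change > 0 else "deletion"
    if abs(size_change) > CNV_MIN_BP:
        # Large gains or losses of sequence alter DNA copy number.
        return "copy number variant (" + kind + ")"
    return kind


def zygosity(allele_a: str, allele_b: str) -> str:
    """An individual carries two copies of each locus, one per chromosome."""
    return "homozygous" if allele_a == allele_b else "heterozygous"


print(classify_variant("A", "G"))   # SNP
print(classify_variant("A", "ATG")) # insertion
print(zygosity("A", "G"))           # heterozygous
```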
Variants do not necessarily affect the expression or the function of the gene product, particularly when they occur in a noncoding region of the gene that is not involved in regulating messenger RNA (mRNA) transcription from a DNA template or in mRNA processing or stability. Variants resulting from large changes in coding regions are likely to affect the protein product; however, many SNPs in the coding region do not affect gene function or stability if the encoded amino acid remains the same (a silent substitution) or if the amino acid substitution does not affect protein stability or function. There are instances where genetic variants, including SNPs, affect protein expression (by altering noncoding regulatory regions of the gene) or function (by altering the amino acid sequence), but not all such changes are necessarily deleterious. In many instances, genetic variants explain the variation in protein levels observed in the general population. Variants that alter protein levels or function are partially responsible for genetically determined variation in our physical characteristics, physiology, and personality traits. Genetic variability also explains some of the variability in disease susceptibility, disease severity, and response to treatment that is observed in patient populations.
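To make the idea of a silent substitution concrete, the following Python sketch translates a codon before and after a single-base change using the standard genetic code. It is only an illustration: the function names and example codons are hypothetical, codons are assumed to be written as coding-strand DNA, and whether an amino acid change actually alters protein stability or function cannot be judged from the sequence alone.

```python
# Illustrative sketch only: function names and example codons are hypothetical.
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Compare the amino acids encoded by a codon before and after an SNP."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"    # encoded amino acid is unchanged
    if alt_aa == "*":
        return "nonsense"  # substitution creates a stop codon
    return "missense"      # a different amino acid is encoded


print(classify_substitution("GAA", "GAG"))  # silent (Glu -> Glu)
print(classify_substitution("GAG", "GTG"))  # missense (Glu -> Val)
```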
Genotyping to identify specific variants in a particular gene is commonplace for diagnosing genetic disorders in the clinical setting. Nearly all genotyping techniques utilize polymerase chain reaction (PCR) to amplify a DNA fragment that contains the site of interest. For amplification, PCR uses small pieces of DNA, termed primers, which are complementary to the regions that flank the site of interest. Early techniques identified genotypes based on the size of the PCR product (insertions or deletions, VNTRs, SNPs present in restriction enzyme recognition sites) or by using allele-specific hybridization approaches, such as allele-specific PCR or hybridization with labeled allele-specific oligonucleotide probes. Genotyping is now routinely performed as a single-site assay or by genotyping hundreds to millions of SNPs using custom-made arrays or arrays that probe for SNPs from across the entire genome (genome-wide SNP arrays). These techniques are more amenable to high-throughput technology. Genome-wide SNP arrays are used for genome-wide association studies (GWASs), which examine millions of SNPs simultaneously to determine whether any are associated with specific diseases.
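The size-based genotyping described above can be sketched as a simple in silico PCR: locate the two primer-binding sites that flank the site of interest and measure the length of the product between them. The primer and template sequences below are made up for illustration, and the helper names are hypothetical; real primer design also considers melting temperature, specificity, and both DNA strands.

```python
# Illustrative sketch only: sequences and helper names are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon_length(template: str, fwd_primer: str, rev_primer: str):
    """Return the PCR product size on the given strand, or None if a primer site is missing."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on this strand
    # its binding site appears as the reverse complement of the primer.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return end + len(rev_site) - start


fwd = "ATGCCGTA"
rev = reverse_complement("GGATCCAT")

reference_allele = "CC" + fwd + "ACGTAC" + "GGATCCAT" + "TT"
insertion_allele = "CC" + fwd + "ACGTACGAGA" + "GGATCCAT" + "TT"

print(amplicon_length(reference_allele, fwd, rev))  # 22
print(amplicon_length(insertion_allele, fwd, rev))  # 26
```

In this toy example the allele carrying a 4-base insertion yields a product 4 bases longer than the reference allele, which is the size difference a gel-based assay would resolve.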
Next-generation sequencing technologies are commonplace (1), partly due to a reduction in the cost of DNA sequencing. Using this technique, DNA is randomly fragmented and ligated to common adaptor sequences to form a library. The library is hybridized to an array platform of millions of spatially fixed PCR fragments that are complementary to the DNA fragments. Given that the array contains multiple PCR fragments that are complementary to a DNA library fragment, millions of short DNA reads are produced in a highly efficient manner. Following enzyme-driven biochemistry and imaging-based data processing, the short reads are mapped to a reference genome to reconstruct the sequence. Recent advances in DNA sequencing technologies, combined with reduced cost, make sequencing the human genome much less expensive and make it realistic to use this technology to identify novel genetic variants.
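The read-mapping step can be illustrated with a deliberately naive Python sketch that slides each short read along a reference sequence and reports the best-matching position together with any mismatches, which are candidate variants. This is an assumption-laden toy: the function name and sequences are hypothetical, and production aligners use indexed data structures and base-quality information rather than a brute-force scan.

```python
# Illustrative sketch only: a brute-force scan, not how production aligners work.
def map_read(reference: str, read: str, max_mismatches: int = 1):
    """Return (position, mismatches) for the best placement of a read, or None.

    Each mismatch is reported as (reference_position, reference_base, read_base).
    """
    best = None
    for pos in range(len(reference) - len(read) + 1):
        mismatches = [
            (pos + i, reference[pos + i], base)
            for i, base in enumerate(read)
            if reference[pos + i] != base
        ]
        if len(mismatches) > max_mismatches:
            continue
        if best is None or len(mismatches) < len(best[1]):
            best = (pos, mismatches)
    return best


reference = "ACGTTGCAAGCTTAGGCTA"  # made-up reference sequence

print(map_read(reference, "TGCAAG"))   # exact match
print(map_read(reference, "GCTTCGG"))  # one mismatch: a candidate SNP
```

A read that differs from the reference at a single position is still placed, and the reported mismatch marks where the sequenced individual may carry a variant.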
The technologies described above have been used to examine the influence of genetic variation on the susceptibility to, and outcomes of, various diseases relevant to critical care. This information may allow physicians to identify children who are at greatest risk for poor outcomes, allowing for modified monitoring strategies in the intensive care unit or novel therapies. Several genes that harbor genetic variations associated with the severity of sepsis or acute lung injury (ALI) in critically ill populations have been described (Tables 18.1 and 18.2). One example is the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which codes for a chloride channel protein expressed on epithelial cells in bronchi, bronchioles, and alveoli (2, 3, 4, 5). Influx of fluid into the alveoli following increased permeability of the alveolar-capillary barrier is one of the hallmarks of ARDS (6), and the ability to clear fluid rapidly is associated with improved outcome (7). The clearance of alveolar fluid occurs through active ion transport (8), and CFTR has a role in both cyclic adenosine monophosphate-stimulated fluid absorption and modulation of the epithelial sodium channel (2, 3, 9). CFTR contains 27 exons that are spliced together to give mature CFTR mRNA. Alternatively spliced transcripts are relatively common, and levels of CFTR protein and activity vary between individuals (10). Mutations in the CFTR gene cause cystic fibrosis (CF), a disease characterized by progressive injury to the lungs (11). Interestingly, in vitro and ex vivo studies suggest that CFTR deficiency results in a dysregulated inflammatory response (12, 13, 14, 15, 16, 17) and promotes lipopolysaccharide (LPS)-induced lung injury in mice (15, 18), suggesting that CFTR has immune modulatory activity.
In addition to the relatively rare variants that affect CFTR function, two common polymorphisms affect the function of CFTR. One polymorphism is the (TG)mTn variable repeat region located in intron 8. Both in vitro and in vivo studies show that a higher number of TG repeats and/or a lower number of Ts is associated with an increased proportion of mRNA transcripts deficient in exon 9 (19, 20, 21, 22, 23). Mechanistic studies also reveal that different alleles at the (TG)mTn site affect exon 9 skipping owing to differences in the binding affinity of splicing regulatory proteins (24, 25). Exon 9 is essential for CFTR function given that, together with exons 10-12, it encodes the first nucleotide-binding domain, and mRNA transcripts without exon 9 do not produce functional CFTR (26, 27, 28). In healthy individuals, 5%-90% of CFTR transcripts are missing exon 9 (29), suggesting that CFTR activity in healthy individuals varies greatly. Although CFTR activity may be reduced to <5% of normal in CF, other variants that have less profound effects on CFTR may still increase the risk of other lung diseases (11). We examined the (TG)mTn alleles in a cohort of children with community-acquired pneumonia (CAP). African American children with CAP who have (TG)mTn alleles associated with increased exon 9 skipping are more likely to require mechanical ventilation and to develop ARDS (30). These data suggest that less functional CFTR may contribute to more severe lung injury and that the genetic makeup of the host may contribute to an increased risk for ARDS.
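As a simple illustration of how a (TG)mTn allele could be read from sequence data, the following Python sketch finds the longest run of TG repeats followed directly by a run of Ts and reports the repeat counts m and n. The function name and the flanking bases in the example are hypothetical and do not represent the actual CFTR intron 8 sequence; the sketch does not attempt to predict exon 9 skipping from these counts.

```python
# Illustrative sketch only: function name and flanking sequence are hypothetical.
import re

TG_T_TRACT = re.compile(r"((?:TG)+)(T+)")


def parse_tg_t_tract(sequence: str):
    """Return (m, n) for the longest (TG)m followed by Tn tract, or None if absent."""
    best = None
    for match in TG_T_TRACT.finditer(sequence.upper()):
        if best is None or len(match.group(0)) > len(best.group(0)):
            best = match
    if best is None:
        return None
    m = len(best.group(1)) // 2  # number of TG dinucleotide repeats
    n = len(best.group(2))       # number of consecutive Ts
    return m, n


# Toy allele with 12 TG repeats and 5 Ts between made-up flanking bases.
allele = "AACA" + "TG" * 12 + "T" * 5 + "AACAG"
print(parse_tg_t_tract(allele))  # (12, 5)
```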