How to Analyze Survival Data

How to Analyze Survival Data: Introduction

All the methods that we have discussed so far require “complete” observations, in the sense that we know the outcome of the treatment or intervention we are studying. For example, in Chapter 5 we considered a study that compared the rate of filing advance directives in people who received in-person counseling or written instructions (Table 5-1). We compared these two groups of people by computing the expected pattern of thrombus formation in each of the two comparison groups under the null hypothesis that there was no difference in the rate of thrombus formation in the two treatment groups, then used the chi-square test statistic to examine how closely the observed pattern in the data matched the expected pattern under the null hypothesis of no treatment effect. The resulting value of χ² was “big,” so we rejected the null hypothesis of no treatment effect and concluded that aspirin reduced the risk of thrombus formation. In this study we knew the outcome in all the people in the study after a fixed length of time following treatment. Indeed, in all the methods we have considered in this book so far, we knew the outcome of the variable under study for all the individuals in the study being analyzed. There are, however, situations in which we do not know the ultimate outcome for all the individuals in the study because the study ended before the final outcome had been observed in all the study subjects or because the outcome in some of the individuals is not known.* In addition, it would be desirable to take into account the varying lengths of follow-up of the people enrolled in the study, because the more time that passes after treatment, the more likely it is that the outcome of interest will have occurred. We now turn our attention to developing procedures for such data.

The most common studies in which we have incomplete knowledge of the outcome are clinical trials or survival studies in which individuals enter the study and are followed up over time until some event (typically death or development of a disease) occurs. Since such studies do not go on forever, it is possible that the study will end before the event of interest has occurred in all the study subjects. In such cases, we have incomplete information about the outcomes in these individuals. In clinical trials it is also common to lose track of patients who are being observed over time. Thus, we would know that the patient was free of disease up until the last time that we observed them, but we do not know what happened later. In both cases, we know that the individuals in the study were event free for some length of time, but not the actual time to an event. These people are lost to follow-up; such data are known as censored data.** Censored data are most common in clinical trials or survival studies.

* Another reason for not having all the data would be the case of missing data, in which samples are lost because of experimental problems or errors. Missing data are analyzed using the same statistical techniques as complete data sets, with appropriate adjustments in the calculations to account for the missing data. For a complete discussion of the analysis of studies with missing data, see Glantz S, Slinker B. Primer of Applied Regression and Analysis of Variance, 2nd ed. New York: McGraw-Hill; 2001.

** More precisely, these observations are right censored because we know the time the subjects entered the study, but not when they died (or experienced the event we are monitoring). It is also possible to have left censored data, when the actual survival time is larger than that observed, such as when patients are studied following surgery, and the precise dates at which some patients had surgery before the beginning of the study are not known. Other types of censoring can occur when studies are designed to observe subjects until some specified fraction (say, half) die. We will concentrate on right censored data, since that is what generally comes up in biomedical studies.

Censoring on Pluto

The tobacco industry, having been driven farther and farther from Earth by protectors of the public health, invades Pluto and starts to promote smoking in bars. Since it is very cold on Pluto, Plutonians spend most of their time indoors and begin dropping dead from the secondhand tobacco smoke in bars. Since it would be unethical to purposely expose Plutonians to secondhand smoke, we will simply observe how long it takes Plutonians to drop dead after they begin to be exposed to secondhand smoke in bars.

Figure 11-1A shows the observations for 10 nonsmoking Plutonians selected at random and observed over the course of a study lasting for 15 Pluto months. Subjects entered the study when they started hanging out at smoky bars, and they were followed up until they dropped dead or the study ended. As with many survival studies, individuals were recruited into the study at various times as the study progressed. Of the 10 subjects, 7 died during the period of the study (A, B, C, F, G, H, and J). As a result, we know the exact length of time that they lived after their exposure to secondhand smoke in bars began. These observations are uncensored. In contrast, two of the Plutonians were still alive at the end of the study (D and I); we know that they lived at least until the end of the study, but do not know how long they lived after being exposed to secondhand smoke. In addition, Plutonian E was vaporized in a freak accident while on vacation before the study was completed, and so was lost to follow-up. We do know, however, that these individuals lived at least as long as we observed them. These observations are censored.

Figure 11-1.

(A) This graph shows the observations in our study of the effect of hanging out in a smoky bar on Plutonians. The horizontal axis represents calendar time, with Plutonians entering the study at various times, when tobacco smoke invades their bars. Solid points indicate known times. Lighter points indicate the time at which observations are censored. Seven of the Plutonians die during the study (A, B, C, F, G, H, and J), so we know how long they were breathing secondhand smoke when they expired. Two of the Plutonians were still alive when the study ended at time 15 (D and I), and one (E) was lost to observation during the study, so we know that they lived at least as long as we were able to observe them, but do not know their actual time of death. (B) This graph shows the same data as panel A, except that the horizontal axis is the length of time each subject was observed after they entered the study, rather than calendar time.

Figure 11-1B shows the data in another format, where the horizontal axis is the length of time that each subject is observed after starting exposure to secondhand smoke, as opposed to calendar time. The Plutonians who died by the end of the study have a solid point at the end of the line; those that were still alive at the end of the observation period are indicated with a lighter point. Thus, we know that Plutonian A lived exactly 7 months after starting to go to a smoky bar (an uncensored observation), whereas Plutonian D lived at least 12 months after hanging out in a smoky bar (a censored observation).
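One convenient way to hold observations like those in Figure 11-1B is as a follow-up time paired with a flag that records whether death was actually observed. The sketch below encodes the ten Plutonians this way, using the follow-up times listed in Table 11-1; the names `Observation`, `months`, `died`, and `label` are illustrative choices, not notation from the text.

```python
from typing import NamedTuple

class Observation(NamedTuple):
    subject: str
    months: int   # follow-up time after starting to go to a smoky bar
    died: bool    # True = death observed (uncensored), False = censored

observations = [
    Observation("A", 7, True), Observation("B", 12, True),
    Observation("C", 7, True), Observation("D", 12, False),
    Observation("E", 11, False), Observation("F", 8, True),
    Observation("G", 9, True), Observation("H", 6, True),
    Observation("I", 7, False), Observation("J", 2, True),
]

def label(obs):
    """Write a follow-up time the way survival tables do, e.g. '12+' if censored."""
    return f"{obs.months}" if obs.died else f"{obs.months}+"

uncensored = [o.subject for o in observations if o.died]
censored = [o.subject for o in observations if not o.died]
print("uncensored:", uncensored)
print("censored:  ", censored)
```

Keeping the censoring flag alongside the time, rather than discarding censored subjects, is what lets later calculations use the information that these subjects survived at least as long as they were observed.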

This study has the necessary features of a clinical follow-up study:

There is a well-defined starting time for each subject (the date exposure to secondhand smoke started in this example, or the date of diagnosis or medical intervention in a clinical study).
There is a well-defined end point (death in this example or relapse in many clinical studies).
The subjects in the study are selected at random from a larger population of interest.

If all subjects were studied for the same length of time or until they reached a common end point (such as death), we could use the methods of Chapters 5 or 10 to analyze the results. These methods require researchers to assess the outcomes at a fixed time following the intervention, then classify each subject as either having or not having the outcome of interest. Unfortunately, in clinical studies these situations often do not exist. The fact that the study period often ends before all the subjects have reached the end point makes it impossible to know the actual time that all the subjects reach the common end point. In addition, because subjects are recruited throughout the duration of the study, the follow-up time often varies for different subjects. These two facts require that we develop new approaches to analyzing these data that explicitly take into account the length of follow-up when assessing outcomes. The first step is to characterize the pattern of the occurrence of end points (such as death). This pattern is quantified with a survival curve. We will now examine how to characterize survival curves and test hypotheses about them.

Estimating the Survival Curve

When discussing survival curves, one often considers death the end point—hence, the name survival curves—but any well-defined end point can be used. Other common end points include relapse of a disease, need for additional treatment, or failure of a mechanical component of a machine. Survival curves can also be used to study the length of time to desirable events as well, such as time to pregnancy in couples having fertility problems. We will generally talk in terms of the death end point, recognizing that these other end points are also possible.

The parameter of the underlying population we seek to estimate is the survival function, which is the fraction of individuals who are alive at time 0 who are surviving at any given time. Specifically,

the survival function, S(t), is the probability of an individual in the population surviving beyond time t.

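In mathematical terms, the survival function is

$$S(t) = \frac{\text{Number of individuals surviving longer than time } t}{\text{Total number of individuals in the population}}$$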

Figure 11-2 shows a hypothetical survival function for a population. Note that it starts at 1 (or 100% alive) at time t = 0 and falls to 0% over time, as members of the population die off. The time at which half the population is alive and half is dead is called the median survival time.

Figure 11-2.

All population survival curves begin at 1 (100%) at time 0, when all the individuals in the study are alive, and fall to 0 as individuals die over time. The time at which 50% of the population has died is the median survival time.

Our goal is to estimate the survival function from a sample. Note that it is only possible to estimate the entire survival curve if the study lasts long enough for all members of the sample to die. When we are able to follow every member of a sample until all of them die, estimating the survival curve is easy: Simply compute the fraction of surviving individuals at each time someone dies. In this case, the estimate of the survival function from the data would simply be

$$\hat{S}(t) = \frac{\text{Number of individuals surviving longer than time } t}{\text{Total number of individuals in the sample}}$$

where $\hat{S}(t)$ is the estimate of the population survival function computed from the observations in the sample.
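With complete follow-up, this estimate is nothing more than a count. A minimal sketch, using made-up death times (the list `death_times` and the function name `naive_survival` are hypothetical, not data or notation from the study):

```python
def naive_survival(death_times, t):
    """Fraction of the sample surviving longer than time t (no censoring)."""
    survivors = sum(1 for d in death_times if d > t)
    return survivors / len(death_times)

# Hypothetical sample in which every death was observed
death_times = [2, 3, 5, 5, 8, 9, 12, 14]

for t in (0, 5, 14):
    print(t, naive_survival(death_times, t))
```

This only works because every subject's time of death is known; it has no way to use a subject who was last seen alive at some time and then left the study.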

Unfortunately, as we have already seen on Pluto, we often do not know the length of time every individual in the sample lives, so we cannot use this approach. In particular, we need a method to estimate the survival curve from real data in the presence of censoring, when we do not know the precise times of death of all the individuals in the sample. To estimate the survival function from censored data, we need to compute the probability of surviving at each time we observe a death, based on the number of individuals known to be surviving immediately before that death.

The first step in estimating the survival function is to list all the observations in the order of the time of death or the last available observation for each individual. Table 11-1 shows these results for the data in Figure 11-1, in the order that death or loss to follow-up occurred. When an uncensored observation (where the actual time of death is known) and a censored observation share the same time, the uncensored observation is listed first. Censored observations are marked with a “+” to indicate that the time of death is some unknown time after the last time at which the subject was observed. For example, the first death (Plutonian J) took place at time 2, and the second death (Plutonian H) took place at time 6. Two Plutonians (A and C) died at time 7, and one more observation (Plutonian I) was censored after time 7. Thus, we know that Plutonian I lived longer than J, H, A, and C, but we do not know how much longer.

Table 11-1. Pattern of Deaths over Time for Plutonians after Starting to Go to Smoky Bars

Plutonian	Survival Time, ti	Number Alive at Beginning of Interval, ni	Number of Deaths at End of Interval, di
J	2	10	1
H	6	9	1
A and C	7	8	2
I	7+
F	8	5	1
G	9	4	1
E	11+
B	12	2	1
D	12+
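The counts in Table 11-1 can be tallied mechanically. The sketch below assumes the data are held as (time, died) pairs and that a subject censored at a given time is still counted as alive at that time, which matches the table (Plutonian I is among the 8 alive at time 7). The function name `risk_table` is illustrative.

```python
from collections import Counter

# (follow-up time, died?) for Plutonians A through J
data = [(7, True), (12, True), (7, True), (12, False), (11, False),
        (8, True), (9, True), (6, True), (7, False), (2, True)]

def risk_table(data):
    """Return [(t_i, n_i, d_i)] for each time t_i at which a death occurs."""
    deaths = Counter(t for t, died in data if died)
    rows = []
    for t in sorted(deaths):
        # n_i: everyone whose follow-up lasted at least until t
        n_at_risk = sum(1 for time, _ in data if time >= t)
        rows.append((t, n_at_risk, deaths[t]))
    return rows

for t, n, d in risk_table(data):
    print(t, n, d)
```

Censored subjects never produce a row of their own; they only reduce the number alive at later death times.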

The second step is to estimate the probability of death within any time period, based on the number of subjects that survive to the beginning of each time period. Just before the first Plutonian (J) dies at time 2, there are 10 Plutonians alive. Since one dies at time 2, there are 10 − 1 = 9 survivors. Thus, our best estimate of the probability of surviving past time 2 if alive just before time 2 is

$$\frac{n_2 - d_2}{n_2} = \frac{10 - 1}{10} = 0.900$$

where $n_2$ is the number of individuals alive just before time 2 and $d_2$ is the number of deaths at time 2. At the beginning of the time interval ending at time 2, 100% of the Plutonians are alive, so the estimate of the cumulative survival rate at time 2, $\hat{S}(2)$, is 1.000 × 0.900 = 0.900.

Next, we move to the time of the next death, at time 6. One Plutonian dies at time 6 and there are 9 Plutonians alive immediately before time 6. The estimate of the probability of surviving past time 6 if one is alive just before time 6 is

$$\frac{n_6 - d_6}{n_6} = \frac{9 - 1}{9} = 0.889$$

At the beginning of the time interval ending at time 6, 90% of the Plutonians are alive, so the estimate of the cumulative survival rate at time 6, $\hat{S}(6)$, is 0.900 × 0.889 = 0.800. Table 11-2 summarizes these calculations.

Table 11-2. Estimation of Survival Curve for Plutonians after Starting to Go to Smoky Bars

Plutonian	Survival Time, ti	Number Alive at Beginning of Interval, ni	Number of Deaths at End of Interval, di	Fraction Surviving Interval, (ni − di)/ni	Cumulative Survival Rate, $\hat{S}(t)$
J	2	10	1	0.900	0.900
H	6	9	1	0.889	0.800
A and C	7	8	2	0.750	0.600
I	7+
F	8	5	1	0.800	0.480
G	9	4	1	0.750	0.360
E	11+
B	12	2	1	0.500	0.180
D	12+

Likewise, just before time 7 there are 8 Plutonians alive and 2 die at time 7. Thus,

$$\frac{n_7 - d_7}{n_7} = \frac{8 - 2}{8} = 0.750$$

At the beginning of the time interval ending at time 7, 80% of the Plutonians are alive, so the estimate of the cumulative survival rate at time 7, $\hat{S}(7)$, is 0.800 × 0.750 = 0.600.

Up to this point, the calculations probably seem unnecessarily complex. After all, at time 7 there are 6 survivors out of 10 original individuals in the study, so why not simply compute the survival estimate as 6/10 = 0.600? The answer to this question becomes clear after time 7, when we encounter our first censored observation. Because of censoring, we know that Plutonian I died sometime after time 7, but we do not know exactly when.
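To see why the two calculations agree through time 7, it helps to write out the product. As long as no one has been censored, the number alive just before each death equals the number of survivors of the previous death, so each numerator cancels the next denominator:

$$\frac{10 - 1}{10} \times \frac{9 - 1}{9} \times \frac{8 - 2}{8} = \frac{9}{10} \times \frac{8}{9} \times \frac{6}{8} = \frac{6}{10} = 0.600$$

This cancellation only holds while every subject who has left the study did so by dying. In Table 11-1, the number alive at time 8 is 5 rather than the 6 who survived time 7, so the next factor no longer cancels in the same way.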

The next known death occurs at time 8, when Plutonian F dies. Because of the censoring of Plutonian I, who was last observed alive at time 7, we do not know whether this individual is alive or dead at time 8. As a result, we must drop Plutonian I from the calculation of the survival function. Just before time 8, there are 5 Plutonians known to be alive when one dies at time 8, so, following the procedure outlined previously,

$$\frac{n_8 - d_8}{n_8} = \frac{5 - 1}{5} = 0.800$$

At the beginning of the time interval ending at time 8, 60% of the Plutonians are known to be alive, so the estimate of the cumulative survival rate at time 8, $\hat{S}(8)$, is 0.600 × 0.800 = 0.480. Because of the censoring, it would be impossible to estimate the survival function based on all the Plutonians who initially entered the study.

Table 11-2 presents the remainder of the computations to estimate the survival curve. This approach is known as the Kaplan–Meier product-limit estimate of the survival curve. The general formula for the Kaplan–Meier product-limit estimate of the survival curve is

$$\hat{S}(t_j) = \prod_{t_i \le t_j} \frac{n_i - d_i}{n_i}$$

where there are $n_i$ individuals alive just before time $t_i$ and $d_i$ deaths occur at time $t_i$. The Π symbol indicates the product* taken over all the times, $t_i$, at which deaths occurred up to and including time $t_j$. (Note that the survival curve is not estimated at the times of censored observations because no known deaths occur at those times.) For example,

$$\hat{S}(7) = \frac{10 - 1}{10} \times \frac{9 - 1}{9} \times \frac{8 - 2}{8} = 0.600$$

Figure 11-3 shows a plot of the results. By convention, the survival function is drawn as a series of step changes, with the steps occurring at the times of known deaths. The curve ends at the time of the last observation, whether censored or not. Note that the curve, as all survival curves, begins at 1.0 and falls toward 0 as individuals die. Because one individual is still alive at the end of the study period, the data are censored and the estimated survival curve does not reach 0 during the time observations were available.
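The product-limit calculation walked through above can be sketched in a few lines. This is a minimal illustration for the Plutonian data, assuming (time, died) pairs as input; the function name `kaplan_meier` is an illustrative choice, and the sketch makes no attempt to handle edge cases such as an empty data set.

```python
def kaplan_meier(data):
    """Product-limit estimate: [(t_j, S_hat(t_j))] at each observed death time."""
    death_times = sorted({t for t, died in data if died})
    s_hat = 1.0
    curve = []
    for t in death_times:
        n = sum(1 for time, _ in data if time >= t)              # alive just before t
        d = sum(1 for time, died in data if died and time == t)  # deaths at t
        s_hat *= (n - d) / n                                     # fraction surviving t
        curve.append((t, s_hat))
    return curve

# (follow-up time, died?) for Plutonians A through J
data = [(7, True), (12, True), (7, True), (12, False), (11, False),
        (8, True), (9, True), (6, True), (7, False), (2, True)]

for t, s in kaplan_meier(data):
    print(f"{t:>2}  {s:.3f}")
```

The estimate changes only at death times, which is why the plotted curve is a step function; between deaths it holds its previous value.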

Figure 11-3.

The survival curve for Plutonians hanging out in smoky bars, computed from the data in Table 11-1 as outlined in Table 11-2. Note that the curve is a series of horizontal lines, with the drops in survival at the times of known deaths. The curve ends at 12 months because that is the survival time of the last person known to be alive (Plutonian D).

* The Π symbol for multiplication is used similarly to the symbol Σ for sums.

Median Survival Time

It is often desirable to provide a statistic that summarizes a survival curve with a single number. Because the survival times tend to be positively skewed, the median survival time is generally used. After the survival curve has been estimated, it is simple to estimate the median survival time.

The median survival time is defined to be the smallest observed survival time for which the estimated survival function is less than .5.†

For example, in our study of the effect of secondhand smoke on Plutonians, the median survival time is 8 months, because that is the first time at which the survival function drops below .5. (It equals .480.) If fewer than half the individuals in the study die before the end of the study, it is not possible to estimate the median survival time. Other percentiles of the survival time are estimated analogously.
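The definition translates directly into a scan of the estimated curve. The sketch below takes the cumulative survival values from Table 11-2; the function name `percentile_survival_time` and its `fraction` parameter are illustrative, with `fraction=0.5` giving the median.

```python
# (t_j, S_hat(t_j)) from Table 11-2
curve = [(2, 0.900), (6, 0.800), (7, 0.600), (8, 0.480), (9, 0.360), (12, 0.180)]

def percentile_survival_time(curve, fraction=0.5):
    """Smallest observed death time at which S_hat is less than `fraction`.

    Returns None when the curve never falls below `fraction`, in which
    case the percentile cannot be estimated from the data.
    """
    for t, s in curve:
        if s < fraction:
            return t
    return None

print(percentile_survival_time(curve))        # median survival time
print(percentile_survival_time(curve, 0.75))  # time by which more than 25% have died
print(percentile_survival_time(curve, 0.10))  # not estimable from these data
```

The `None` result corresponds to the situation described above, in which too few deaths occur before the study ends for the percentile to be estimated.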

† An alternative approach is to connect the two observed values above and below .5 with a straight line and read the time that corresponds to $\hat{S}(t)$ = .5 off the resulting line.

Standard Errors and Confidence Limits for the Survival Curve

As with all statistics based on random samples drawn from underlying populations, the estimated survival function has a sampling distribution around the population parameter, in this case the true survival function, S(t). The standard deviation of the sampling distribution is estimated by the standard error of the survival function. The standard error of the estimate of the survival curve can be estimated with the following equation, known as Greenwood’s formula:‡

$$s_{\hat{S}(t_j)} = \hat{S}(t_j) \sqrt{\sum_{t_i \le t_j} \frac{d_i}{n_i (n_i - d_i)}}$$

where the summation (indicated by Σ) extends over all times, $t_i$, at which deaths occurred up to and including time $t_j$. As with estimates of the survival curve itself, the standard error is only computed using times at which actual deaths occur. For example, the standard error for the estimated value of the survival function for the Plutonians going to smoky bars at 7 months is (using the results from Table 11-2)

$$s_{\hat{S}(7)} = 0.600 \sqrt{\frac{1}{10(10 - 1)} + \frac{1}{9(9 - 1)} + \frac{2}{8(8 - 2)}} = 0.155$$

Table 11-3 shows all the computations for the standard errors of the survival curve using the data in Table 11-2.

Table 11-3. Estimation of Standard Error of Survival Curve and 95% Confidence Interval (CI) for Survival Curve for Plutonians after Starting to Go to Smoky Bars

Plutonian	Survival Time, ti	Number Alive at Beginning of Interval, ni	Number of Deaths at End of Interval, di	Fraction Surviving Interval, (ni − di)/ni	Cumulative Survival Rate, $\hat{S}(t)$	di/[ni(ni − di)]	Standard Error, $s_{\hat{S}(t)}$	Lower 95% CI	Upper 95% CI
J	2	10	1	0.900	0.900	0.011	0.095	0.714	1.000*
H	6	9	1	0.889	0.800	0.014	0.126	0.552	1.000*
A and C	7	8	2	0.750	0.600	0.042	0.155	0.296	0.904
I	7+
F	8	5	1	0.800	0.480	0.050	0.164	0.159	0.801
G	9	4	1	0.750	0.360	0.083	0.161	0.044	0.676
E	11+
B	12	2	1	0.500	0.180	0.500	0.151	0.000*	0.475
D	12+
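The standard-error column of Table 11-3 can be reproduced with a running sum. The sketch below assumes Greenwood's formula in its usual form, in which the standard error at a death time is the cumulative survival estimate multiplied by the square root of the running sum of d/[n(n − d)]; the function name `greenwood` is illustrative.

```python
import math

# (t_i, n_i, d_i) from Table 11-2
rows = [(2, 10, 1), (6, 9, 1), (7, 8, 2), (8, 5, 1), (9, 4, 1), (12, 2, 1)]

def greenwood(rows):
    """Return [(t_j, S_hat(t_j), standard error)] for each death time.

    Assumes n > d at every death time; if everyone still alive dies at
    once (n == d), the term d / (n * (n - d)) is undefined.
    """
    s_hat = 1.0
    running_sum = 0.0
    out = []
    for t, n, d in rows:
        s_hat *= (n - d) / n
        running_sum += d / (n * (n - d))
        out.append((t, s_hat, s_hat * math.sqrt(running_sum)))
    return out

for t, s, se in greenwood(rows):
    print(f"{t:>2}  {s:.3f}  {se:.3f}")
```

Because the sum only grows while the survival estimate shrinks, the standard error does not have to increase steadily over time, as the table shows.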

The standard error can be used to compute a confidence interval for the survival function, just as we used the standard error to compute a confidence interval for rates and proportions in Chapter 7. Recall that we defined the 100(1 − α) percent confidence interval for a proportion to be

$$\hat{p} - z_\alpha s_{\hat{p}} < p < \hat{p} + z_\alpha s_{\hat{p}}$$

where $z_\alpha$ is the two-tail critical value of the standard normal distribution that defines the α most extreme values, $\hat{p}$ is the observed proportion with the characteristic of interest, and $s_{\hat{p}}$ is its standard error. Analogously, we define the 100(1 − α) percent confidence interval for the survival curve at time $t_j$ to be

$$\hat{S}(t_j) - z_\alpha s_{\hat{S}(t_j)} < S(t_j) < \hat{S}(t_j) + z_\alpha s_{\hat{S}(t_j)}$$

To obtain the 95% confidence intervals, α = 0.05, and zα = 1.960. Table 11-3 and Figure 11-4 show the estimated survival curve for Plutonians exposed to secondhand smoke in bars. Note the confidence interval widens as time progresses because the number of individuals remaining in the study that form the basis for the estimate of S(t) falls as people die.
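The interval itself is a one-line calculation once the standard error is known. The sketch below applies the normal approximation to the values in Table 11-3 and clips the result to the range 0 to 1, since a survival probability cannot lie outside it; the function name `normal_ci` is illustrative. Because it starts from standard errors already rounded to three decimals, its results can differ from the table in the last digit.

```python
def normal_ci(s_hat, se, z=1.960):
    """Normal-approximation confidence interval for S(t), clipped to [0, 1]."""
    lower = max(0.0, s_hat - z * se)
    upper = min(1.0, s_hat + z * se)
    return lower, upper

# (t_j, S_hat(t_j), standard error) from Table 11-3
table = [(2, 0.900, 0.095), (6, 0.800, 0.126), (7, 0.600, 0.155),
         (8, 0.480, 0.164), (9, 0.360, 0.161), (12, 0.180, 0.151)]

for t, s, se in table:
    lo, hi = normal_ci(s, se)
    print(f"{t:>2}  {lo:.3f}  {hi:.3f}")
```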

Figure 11-4.

Survival curve for Plutonians hanging out in smoky bars, together with the 95% confidence interval (computed in Table 11-3). The upper and lower bounds of the 95% confidence interval are shown as light lines.

As with computation of the confidence intervals for rates and proportions, this normal approximation works well as long as the observed values of the survival function are not near 1 or 0; near those limits the confidence interval is no longer symmetric (see Fig. 7-4 and the associated discussion). As a result, applying the previous formula for values of $\hat{S}(t)$ near 1 or 0 will yield confidence intervals that extend above 1 or below 0, which cannot be correct. From a pragmatic point of view, one can often simply truncate the intervals at 1 and 0 without introducing serious errors.*
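The footnoted alternative, based on the ln[−ln] transformation of the survival estimate, can be sketched as follows. The formulas in the code are the standard ones for this transformation and are stated here as an assumption: the standard error of the transformed value is the square root of the Greenwood sum divided by the absolute value of the log of the survival estimate, and the limits are the survival estimate raised to the powers exp(+z·se) and exp(−z·se). The function name `loglog_ci` is illustrative.

```python
import math

def loglog_ci(s_hat, greenwood_sum, z=1.960):
    """Confidence interval for S(t) via the ln(-ln S) transformation.

    greenwood_sum is the sum of d_i / (n_i * (n_i - d_i)) over death times
    up to and including t. Requires 0 < s_hat < 1.
    """
    se_transformed = math.sqrt(greenwood_sum) / abs(math.log(s_hat))
    # A larger exponent gives a smaller value because s_hat < 1.
    lower = s_hat ** math.exp(+z * se_transformed)
    upper = s_hat ** math.exp(-z * se_transformed)
    return lower, upper

# Time 2 for the Plutonians: S_hat = 0.900, one death among 10 alive
lower, upper = loglog_ci(0.900, 1 / (10 * 9))
print(f"{lower:.3f}  {upper:.3f}")
```

Unlike the truncated normal interval, these limits always fall strictly between 0 and 1 and are not symmetric around the estimate.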

‡ For a derivation of Greenwood’s formula, see Collett D. Modelling Survival Data in Medical Research. London: Chapman and Hall; 1994, 22–26.

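* A better way to deal with this problem is to transform the observed survival curve according to $\ln[-\ln \hat{S}(t)]$, which is not bounded by 0 and 1, compute the standard error of the transformed variable, then transform the result back into the survival function. The standard error of the transformed survival function is

$$s_{\ln[-\ln \hat{S}(t)]} = \sqrt{\frac{1}{[\ln \hat{S}(t)]^2} \sum_{t_i \le t} \frac{d_i}{n_i (n_i - d_i)}}$$

The 100(1 − α) percent confidence interval for S(t) is

$$\hat{S}(t)^{\exp\left(+z_\alpha s_{\ln[-\ln \hat{S}(t)]}\right)} < S(t) < \hat{S}(t)^{\exp\left(-z_\alpha s_{\ln[-\ln \hat{S}(t)]}\right)}$$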

Comparing Two Survival Curves

†The end goal of much of medical practice is to prolong life, so the need to compare survival curves for groups of people receiving different treatments naturally arises in many clinical studies. We discuss how to compare the survival curves for two groups of patients receiving different treatments. The null hypothesis we will test is that the treatments have the same effect on the pattern of survival, that is, that the two groups of people are drawn from the same population. If all individuals in the study are followed up for the same length of time and there are no censored observations, we could simply analyze the data using contingency tables as described in Chapter 5. If all individuals are followed up until death (or whatever the defining event is), we could compare the times to death observed in the different groups using nonparametric methods, such as the Mann-Whitney rank-sum test or Kruskal-Wallis analysis of variance based on ranks, described in Chapter 10. Unfortunately, in clinical studies of different treatments, these situations rarely hold. People are often lost to follow-up and the study often ends while many of the people in the study are still alive (or event free). As a result, some of the observations are censored and we need to develop appropriate statistical hypothesis testing procedures that will account for the censored data. We will use the log rank test.

There are three assumptions that underlie the log rank test.

The two samples are independent random samples.
The censoring patterns for the observations are the same in both samples.
The two population survival curves exhibit proportional hazards, so that they are related to each other according to $S_2(t) = [S_1(t)]^{\psi}$, where ψ is a constant called the hazard ratio.
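The third assumption can be made concrete with a small numerical sketch. It takes a hypothetical set of values for the first survival curve and a hypothetical hazard ratio ψ = 2, raises the first curve to the power ψ to get the second, and checks that the ratio of the logs of the two curves is the same at every time. All numbers and the function name `proportional_hazards_partner` are made up for illustration.

```python
import math

def proportional_hazards_partner(s1, psi):
    """Given values of S1(t), return S2(t) = S1(t) ** psi at the same times."""
    return [s ** psi for s in s1]

s1 = [1.0, 0.9, 0.8, 0.6, 0.48]   # hypothetical S1(t) at successive times
psi = 2.0                          # hypothetical hazard ratio
s2 = proportional_hazards_partner(s1, psi)

# Skip t = 0, where both curves equal 1 and both logs are 0.
for a, b in zip(s1[1:], s2[1:]):
    ratio = math.log(b) / math.log(a)
    print(f"S1 = {a:.2f}   S2 = {b:.3f}   ln S2 / ln S1 = {ratio:.1f}")
```

A constant ratio of the logs at every time is what the relationship between the two curves requires; survival curves that cross cannot satisfy it.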