Logistic regression makes it possible to investigate the relationship between multiple independent variables and a dichotomous dependent variable. As illustrated by the examples in Chapter 12, such an analysis is often done to investigate the determinants of the presence of some disease or the effectiveness of some therapy, such as whether bone cancer patients respond to chemotherapy. Many such questions are answered using clinical trials in which patients are recruited into a study, randomized to one treatment or another, and then followed for some time to observe the outcome. Because the chances of something happening (disease developing or therapy failing) generally increase as time passes, analyzing the results of such a study using logistic regression requires that all subjects be observed for the same length of time and that the outcomes in all subjects be known. Although there is nothing wrong with this approach in theory (and often in practice), it is frequently not practical to follow all the experimental subjects for the same length of time. Likewise, in longitudinal epidemiological studies, people are followed forward in time after exposure to some potential toxin (e.g., secondhand smoke) to see if they subsequently develop a disease (e.g., breast cancer). In both cases, a thorough analysis needs to take into account the length of time that each subject has been in the study.

In addition, there are situations in which we do not know the ultimate outcome for all the individuals in the study, either because the study ended before the final outcome had been observed in all the study subjects or because the outcome in some of the individuals is not known. The most common types of study in which we have incomplete knowledge of the outcome are clinical trials and survival studies in which individuals enter the study and are followed over time until some event (typically death or development of a disease) occurs. Because such studies do not go on forever, it is possible that the study will end before the event of interest has occurred in all the study subjects. In such cases, we have incomplete information about the outcomes in these individuals. In clinical trials, it is also common to lose track of patients who are being followed over time; such subjects are lost to follow-up. We would then know that the patient was free of disease up until the last time that we observed them but not what happened later. In both cases, we know that some individuals in the study were event-free for some length of time but not the actual time when an event occurred. Such data are known as censored data. Censored data are most common in clinical trials and longitudinal epidemiological studies.
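To make the idea of censoring concrete, the following minimal Python sketch shows one way such observations might be recorded: each subject contributes a follow-up time together with a flag indicating whether the event was actually observed. The function name `observe`, its parameters, and the example numbers are all hypothetical and chosen only for illustration; they do not come from the text.

```python
from typing import Optional, Tuple


def observe(entry: float,
            event: Optional[float],
            last_seen: Optional[float],
            study_end: float) -> Tuple[float, bool]:
    """Return (follow-up time, event_observed) for one subject.

    entry      -- time the subject entered the study
    event      -- time the event occurred, or None if unknown
    last_seen  -- time of last contact if the subject was lost to
                  follow-up, or None if never lost
    study_end  -- time the study closed
    (All names are illustrative assumptions.)
    """
    # Observation stops at loss to follow-up or at the end of the study,
    # whichever comes first.
    cutoff = study_end if last_seen is None else min(last_seen, study_end)

    if event is not None and event <= cutoff:
        # The event happened while the subject was under observation.
        return (event - entry, True)

    # Censored: we only know the subject was event-free up to the cutoff.
    return (cutoff - entry, False)


# Three hypothetical subjects in a study that closes at time 10:
# (entry, event, last_seen)
subjects = [
    (0, 5, None),     # event observed at time 5
    (2, 15, None),    # event would occur after the study ends
    (0, None, 4),     # lost to follow-up at time 4
]
records = [observe(e, ev, ls, study_end=10) for e, ev, ls in subjects]
```

In this sketch the second and third subjects are both censored: the first because the study ended before the event, the second because contact was lost. In both cases the recorded time is a lower bound on the time to the event rather than the event time itself.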

We now turn our attention to developing a technique for dealing with these two problems: the Cox proportional hazards model. This model ...
