People often wish to compare the effects of several different treatments on an outcome variable. Sometimes the different treatments are applied to different individuals, and sometimes all the experimental subjects receive all the treatments. Although such data are commonly analyzed with multiple t tests, they should be analyzed using an analysis of variance, followed by an appropriate multiple comparisons procedure to isolate treatment effects.* Analyzing a single set of data with multiple t tests both decreases the power of the individual tests and increases the risk of reporting a false positive. In this chapter, we will analyze so-called completely randomized experiments, in which one observation is collected on each experimental subject, selected at random from the population of interest.†
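To see why multiple t tests inflate the risk of a false positive, note that if each of m pairwise tests is conducted at significance level α, and the tests were independent (an approximation, since pairwise tests on the same groups share data), the chance of at least one false positive across the whole family of tests is 1 − (1 − α)^m. The following short sketch of this arithmetic uses illustrative group counts, not data from the text:

```python
# Approximate family-wise false-positive rate when all pairs of k group
# means are compared with separate t tests at alpha = 0.05.
# The independence assumption is only an approximation, because
# pairwise t tests on the same groups are correlated.
alpha = 0.05
for k in (3, 4, 5, 6):
    m = k * (k - 1) // 2            # number of pairwise comparisons
    fwer = 1 - (1 - alpha) ** m     # P(at least one false positive)
    print(f"k={k} groups: {m} t tests, family-wise error rate ~ {fwer:.2f}")
```

With six groups, the 15 pairwise tests carry roughly a 54 percent chance of at least one spurious "significant" difference, even when no treatment has any effect, which is why a single overall analysis of variance is preferred.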
Every treatment is applied to a different individual, so some of the variability in the data is due to the treatments and some of the variability is due to random variation among the experimental subjects. The purpose of an analysis of variance is to discern whether the differences associated with the treatments are large enough, compared to the underlying random variability, to warrant concluding that the treatments had an effect. This chapter develops one-way, or single-factor, analysis of variance when we wish to compare several different treatments, say the effects of several drugs on blood pressure.
We will generalize these ideas in Chapter 9 to include two-way, or two-factor, designs in which the experimental treatments are compared after taking into account some other factor, such as the effect of a drug on blood pressure, accounting for whether the people taking the drug are male or female.
We will approach analysis of variance as a special case of multiple regression. This approach has the advantage of allowing us to transfer all we know about what constitutes a good regression model to our evaluation of whether we have a good analysis of variance model. In addition, the multiple linear regression approach yields estimates of the size of the treatment effects—not just an indication of whether or not such effects are present—and directly provides many of the pairwise multiple comparisons. Finally, when we consider more complicated analysis of variance and analysis of covariance designs in Chapters 9, 10, and 11, a regression approach will make it easier to understand how these analyses work and how to analyze experiments in which some observations are missing. In fact, it is in these situations that the regression approach to analysis of variance is particularly useful.
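As a preview of this equivalence, the following sketch (using made-up blood pressure data, not an example from the text) analyzes the same one-way comparison two ways: once with a classical one-way analysis of variance and once as a regression of the outcome on 0/1 dummy variables encoding group membership. The two F statistics agree.

```python
import numpy as np
from scipy import stats

# Hypothetical data: blood pressure responses under three drugs
groups = [
    np.array([5.0, 7.0, 6.0, 8.0]),     # drug A
    np.array([9.0, 11.0, 10.0, 12.0]),  # drug B
    np.array([4.0, 3.0, 5.0, 6.0]),     # drug C
]

# Classical one-way analysis of variance
f_anova, p_anova = stats.f_oneway(*groups)

# The same analysis as a multiple regression: an intercept column plus
# one 0/1 dummy column per non-reference group (drug A is the reference)
y = np.concatenate(groups)
n = len(y)
X = np.column_stack([
    np.ones(n),
    np.repeat([0.0, 1.0, 0.0], [4, 4, 4]),  # 1 if drug B
    np.repeat([0.0, 0.0, 1.0], [4, 4, 4]),  # 1 if drug C
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# With dummy variables, the fitted values are the group means, so the
# regression and within-group sums of squares match the ANOVA partition
ss_res = np.sum((y - X @ beta) ** 2)        # = within-group SS
ss_reg = np.sum((y - y.mean()) ** 2) - ss_res  # = between-group SS
f_reg = (ss_reg / 2) / (ss_res / (n - 3))

print(f"ANOVA F = {f_anova:.2f}, regression F = {f_reg:.2f}")  # both 22.40
```

The regression coefficients are equally informative: the intercept estimates the reference group's mean, and each dummy coefficient estimates a treatment effect relative to that reference, which is exactly the kind of effect-size estimate discussed above.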
Because most introductory texts present analysis of variance and linear regression as distinct statistical methods, it may seem strange to assert that one can formulate analysis of variance problems as equivalent regression problems. After all, analysis of variance is presented as a technique for testing differences between mean values of a variable of interest in the presence of ...