The first statistical problem we posed in this book, in connection with Figure 1-2A, dealt with a drug that was thought to be a diuretic, but that experiment cannot be analyzed using our existing procedures. In it, we selected different people and gave them different doses of the diuretic, then measured their urine output. The people who received larger doses produced more urine. The statistical question is whether the resulting pattern of points relating urine production to drug dose provided sufficient evidence to conclude that the drug increased urine production in proportion to drug dose. This chapter develops the tools for analyzing such experiments. We will use a regression line to estimate how much one variable increases (or decreases) on the average as another variable changes, and a correlation coefficient to quantify the strength of the association.*
* Simple linear regression is a special case of the more general method of multiple regression, in which there are multiple independent variables. For a discussion of multiple regression and related procedures written in the same style as this book, see Glantz SA, Slinker BK. Primer of Applied Regression and Analysis of Variance, 2nd ed. New York: McGraw-Hill; 2001.
As in all other statistical procedures, we want to use a sample drawn at random from a population to make statements about the population. Chapters 3 and 4 discussed populations whose members are normally distributed with mean μ and standard deviation σ and used estimates of these parameters to design test statistics (such as F and t) that permitted us to examine whether some discrete treatment was likely to have affected the mean value of a variable of interest. Now, we add another parametric procedure, linear regression, to analyze experiments in which the samples were drawn from populations characterized by a mean response that varies continuously with the size of the treatment. To understand the nature of this population and the associated random samples, we return to Mars, where we can examine the entire population of 200 Martians.
Figure 2-1 showed that the heights of Martians are normally distributed with a mean of 40 cm and a standard deviation of 5 cm. In addition to measuring the height of each Martian, let us also weigh each one. Figure 8-1 shows a plot in which each point represents the height x and weight y of one Martian. Since we have observed the entire population, there is no question that tall Martians tend to be heavier than short Martians.
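To make this kind of population concrete, the following Python sketch simulates 200 height and weight pairs and computes the slope of the least-squares line and the correlation coefficient. The population size and the height distribution (mean 40 cm, standard deviation 5 cm) come from the text. Everything else is an assumption made only for illustration: the intercept, slope, and scatter about the line are hypothetical values, weights are in arbitrary units, and the function names are our own. It should be read as one possible sketch, not as the data behind Figure 8-1.

```python
import math
import random


def simulate_population(n=200, mean_height=40.0, sd_height=5.0,
                        intercept=-8.0, slope=0.5, sd_about_line=1.0,
                        seed=0):
    """Simulate (height, weight) pairs for a hypothetical population.

    Heights are drawn from a normal distribution (values from the text).
    The mean weight is ASSUMED to rise linearly with height; intercept,
    slope, and sd_about_line are hypothetical illustration values.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    heights = [rng.gauss(mean_height, sd_height) for _ in range(n)]
    # Each weight is the point on the assumed line plus normal scatter
    # whose standard deviation does not depend on height.
    weights = [intercept + slope * h + rng.gauss(0.0, sd_about_line)
               for h in heights]
    return heights, weights


def slope_and_correlation(x, y):
    """Return the least-squares slope of y on x and the correlation r."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


heights, weights = simulate_population()
slope_est, r = slope_and_correlation(heights, weights)
print(f"estimated slope: {slope_est:.3f}")
print(f"correlation coefficient: {r:.3f}")
```

Because the whole simulated population is observed, the computed slope and correlation describe that population directly. Passing a different `seed` produces a different hypothetical population with the same underlying structure.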
The relationship between height and weight in the population of 200 Martians, with each Martian represented by a circle. The weights at any given height follow a normal distribution. In addition, the mean weight of Martians at any given height increases linearly with height, and the variability in weight at any given ...