I have always thought of myself as something of an outsider and
troublemaker, so it is with some humility that I prepare the seventh edition
of this book, 30 years after the first edition appeared. Then, as now, the
book had an unusual perspective: that many papers in the medical literature
contained avoidable errors. At the time, the publisher, McGraw-Hill,
expressed concern that this “confrontational approach” would put
off readers and hurt sales. They also worried that the book was not
organized like a traditional statistics text.
Time
has shown that the biomedical community was ready for such an approach and
the book has achieved remarkable success.
The
nature of the problems with the medical literature, however, has evolved
over time and this new edition reflects that evolution. Many journals now
have formal statistical reviewers so the kinds of simple errors that used to
dominate have been replaced with more subtle problems of biased samples and
underpowered studies (although there are still more than enough
inappropriate t tests to go around). Over time, this book has evolved
to include more topics, such as power and sample size, more on multiple
comparison procedures, relative risks and odds ratios, and survival
analysis.
In this edition I actually pruned back
the discussion of multiple comparison testing to focus on Bonferroni, Holm,
and Holm-Sidak corrected tests for both parametric and nonparametric
methods.
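As a rough illustration of how these three corrections differ, the following Python sketch computes adjusted P values for a small family of comparisons. It is a minimal sketch under stated assumptions, not the book's own presentation: it works directly on P values, and the function names and the example P values are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni: multiply every P value by the number of comparisons."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def _step_down(pvals, adjust):
    """Shared step-down logic for the Holm and Holm-Sidak adjustments.

    P values are visited from smallest to largest; the k-th smallest is
    adjusted for the (m - k + 1) comparisons still under consideration,
    and a running maximum keeps the adjusted values in the same order
    as the raw ones.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        remaining = m - rank
        running_max = max(running_max, min(1.0, adjust(pvals[i], remaining)))
        adjusted[i] = running_max
    return adjusted


def holm(pvals):
    """Holm: step-down version of the Bonferroni adjustment."""
    return _step_down(pvals, lambda p, k: p * k)


def holm_sidak(pvals):
    """Holm-Sidak: step-down adjustment using 1 - (1 - p)^k."""
    return _step_down(pvals, lambda p, k: 1.0 - (1.0 - p) ** k)


# Hypothetical P values from three pairwise comparisons.
example = [0.01, 0.04, 0.03]
for name, fn in [("Bonferroni", bonferroni), ("Holm", holm), ("Holm-Sidak", holm_sidak)]:
    print(name, [round(p, 4) for p in fn(example)])
```

For any set of P values, the Holm-Sidak adjusted values are no larger than the Holm values, which in turn are no larger than the Bonferroni values, so the step-down procedures are less conservative.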
At the same time, this is the most
extensive revision done for a new edition since the book was first
published. The book is now published in a larger, more open text format with
more worked-out examples. There are new brief introductions to higher-order
analysis of variance, multiple regression and logistic regression,* as well
as expanded discussions of problems with study designs and more information
on how to combine information from many different studies. The examples and
problems have been extensively reworked, with almost all coming from studies
published in the twenty-first century.
This book
has its origins in 1973, when I was a postdoctoral fellow. Many friends and
colleagues came to me for advice and explanations about biostatistics. Since
most of them had even less knowledge of statistics than I did, I tried to
learn what I needed to help them. The need to develop quick and intuitive,
yet correct, explanations of the various tests and procedures slowly evolved
into a set of stock explanations and a two-hour slide show on common
statistical errors in the biomedical literature and how to cope with them.
The success of this slide show led many people to suggest that I expand it
into an introductory book on biostatistics, which led to the first edition
of Primer of Biostatistics in 1981.
As a
result, this book is oriented as much to the individual reader—whether
he or she is a student, postdoctoral research fellow, professor, or
practitioner—as to the student attending formal lectures.
This book can be used as a text at many levels. It has
been the required text for the biostatistics portion of the epidemiology and
biostatistics course required of medical students, covering the material in
the first eight chapters in eight one-hour lectures. The book has also been
used for a more abbreviated set of lectures on biostatistics (covering the
first three chapters) given to our dental students. In addition, it has
served me (and others) well in a one-quarter four-unit course in which we
cover the entire book in depth. This course meets for four lecture hours and
has a one-hour problem session. It is attended by a wide variety of
students, from undergraduates through graduate students and postdoctoral
fellows, as well as faculty members.
Because this
book includes the technical material covered in any introductory statistics
course, it is suitable as either the primary or the supplementary text for a
general undergraduate introductory statistics course (which is essentially
the level at which this material is taught in medical schools), especially
for a teacher seeking a way to make statistics relevant to students majoring
in the life sciences.
This book differs from other
introductory texts on biostatistics in several ways, and it is these
differences that seem to account for the book's enduring popularity.
First, inappropriate use of the t test to
analyze multigroup studies continues to be a common error, probably because
the t test is usually the first procedure presented in a statistics
book that will yield the highly prized P value. Analysis of variance,
if presented at all, is deferred to the end of the book to be ignored or
rushed through at the end of the term. Since so much is published that
probably should be analyzed with analysis of variance, and since analysis of
variance is really the paradigm of all parametric statistical tests, I
present it first, then discuss the t test as a special case.
Second, in keeping with the problems that I see in the
literature, there is a discussion of multiple comparison testing.
Third, the book is organized around hypothesis testing and
estimation of the size of treatment effects, as opposed to the more
traditional (and logical from a theory of statistics perspective)
organization that goes from one-sample to two-sample to general k-sample
estimation and hypothesis testing procedures. This approach goes directly to
the kinds of problems one most commonly encounters when reading about or
doing biomedical research.
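The point made above, that the t test is a special case of analysis of variance, can be checked numerically: for two groups, the F statistic from a one-way analysis of variance equals the square of the pooled-variance t statistic. The sketch below is only an illustration under stated assumptions; the function names and the two small samples are hypothetical and are not taken from the book.

```python
from math import isclose, sqrt


def mean(xs):
    return sum(xs) / len(xs)


def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / sqrt(pooled_var * (1 / na + 1 / nb))


def anova_f(*groups):
    """One-way analysis of variance F statistic for any number of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical measurements for a control group and a treatment group.
control = [5.1, 4.9, 6.0, 5.5]
treated = [6.8, 7.1, 6.5, 7.4]

t = pooled_t(control, treated)
f = anova_f(control, treated)
print("t squared and F agree:", isclose(t ** 2, f, rel_tol=1e-9))
```

The same `anova_f` function accepts three or more groups, which is the multigroup situation where repeated t tests are inappropriate.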
The examples are based
mostly on interesting studies from the literature and are reasonably true to
the original data. I have, however, taken some liberty in recreating the raw
data to simplify the statistical problems (for example, making the sample
sizes equal) so that I could focus on the important intuitive ideas behind
the statistical procedures rather than getting involved in the algebra and
arithmetic. There are still some topics common in introductory texts that I
leave out or treat implicitly. There is not an explicit discussion of
probability calculus and expected values and I still blur the distinction
between P and α.
As with any book,
there are many people who deserve thanks. Julien Hoffman gave me the first
really clear and practically oriented course in biostatistics, which allowed
me to stay one step ahead of the people who came to me for expert help. Over
the years, Virginia Ernster, Susan Sacks, Philip Wilkinson, Marion Nestle,
Mary Giammona, Bryan Slinker, Jim Lightwood, Kristina Thayer, Joaquin
Barnoya, Jennifer Ibrahim, and Sara Shain helped me find good examples to
use in the text and as problems. Bart Harvey and Evelyn Schlenker were
particularly gracious in offering suggestions and detailed feedback on the
new material in this edition. I thank them all. Finally, I thank the many
others who have used the book, both as students and as teachers of
biostatistics, who took the time to write me questions, comments, and
suggestions on how to improve it. I have done my best to heed their advice
in preparing this seventh edition.
Many of the
pictures in this book are direct descendants of my original slides. In fact,
as you read this book, you would do best to think of it as a slide show that
has been set to print. Most people who attend my slide show leave more
critical of what they read in the biomedical literature, and people who have
read earlier editions have said that the book had a similar effect on them.
Nothing could be more flattering or satisfying to me. I hope that this book
will continue to make more people more critical and help improve the quality
of the biomedical literature and, ultimately, the care of people.
Stanton A. Glantz
* These
issues are treated in detail in a second book on the subject of multiple
regression and analysis of variance, written with the same approach as Primer
of Biostatistics. It is Glantz SA, Slinker BK. Primer of Applied
Regression and Analysis of Variance, 2nd ed. New York: McGraw-Hill;
2001.