At the end of the chapter, the reader will be able to:
Explain what a variable is and differentiate between an independent variable and a dependent variable
Describe different approaches to classifying variables
Distinguish between descriptive and inferential statistics
Discuss different methods to summarize data and to describe the relationships between two variables
Distinguish between point estimation and interval estimation
Utilize key concepts related to hypothesis testing to arrive at statistical decisions and describe the relationship between hypothesis testing and confidence interval estimation
Discuss various statistical tests that can be used to describe the significance of group differences and appreciate the factors that are important in choosing an appropriate test
Describe how linear regression, logistic regression, and survival analysis (e.g., Cox regression) are used in pharmacoepidemiology and state the nature of the dependent variable as well as the commonly reported measure of association for each technique
Differentiate the concepts of confounding, mediation, and effect modification (interaction)
Appreciate the issues involved in estimating the sample size required for a pharmacoepidemiologic study
In the conduct of a pharmacoepidemiologic study, large amounts of data are typically collected. Investigators are charged with appropriately summarizing these data to provide information and aid in decision making. Statistics provides a set of tools for performing these tasks, and the purpose of this chapter is to introduce the reader to the role of statistics, specifically biostatistics, in the analysis of data generated from pharmacoepidemiologic research. Many different statistical tests are available, and there are even different underlying philosophical schools of thought regarding statistical inference. Because it is not possible to cover the entirety of statistics in a single chapter, this chapter focuses on commonly used methods while occasionally commenting on other approaches. The intent is to enhance the statistical literacy of the reader; that is, the ability to understand statistics and to critically evaluate statistical issues in the pharmacoepidemiology literature. The focus is not on demonstrating how to conduct statistical analyses but on interpreting the results of such analyses. While some formulas will be presented, these are for illustrative purposes only, and computational approaches for more complex techniques are avoided entirely. Coverage of more advanced techniques will primarily be based on a series of case studies that describe the use of these techniques in published studies.
Biostatistics is the application of statistical methods to the medical and health sciences, including epidemiology. Although biostatistics reflects an application of statistics, biostatisticians have also advanced statistical theory and methods by addressing issues and concerns common in medicine and the health sciences. In defining the broader discipline of statistics, Barnett describes statistics as “the study of how information should be employed to reflect on, and give guidance for action in, a practical situation involving uncertainty.”1(p4) Although this definition makes a number of interesting points, perhaps the most meaningful word in this definition is uncertainty. Pharmacoepidemiology is concerned with arriving at estimates, which may be descriptive ...