

At the end of the chapter, the reader will be able to:


  1. Identify the two major types of data for pharmacoepidemiologic studies

  2. Discuss the relative advantages and disadvantages of secondary data

  3. Describe the various sources of data for pharmacoepidemiologic studies

  4. Describe various coding schemes for drugs, procedures, and diagnoses

  5. Describe methods for measuring exposure and outcomes

  6. Discuss some special considerations when using secondary data


After the research question and study design are identified, an appropriate source of data must be selected. Pharmacoepidemiologic studies may use either data collected prospectively for the purpose of the study (i.e., primary data) or data that were already collected for some other purpose (i.e., secondary data). Primary data are collected for a specific purpose and represent data not previously available in a consolidated manner. They may be collected through a variety of means, including questionnaires, interviews, or chart reviews. Compared with secondary data, primary data generally offer greater control over the type and amount of information available. If you need information about a specific medication-taking behavior, for example, how frequently a dose of medication is taken with a meal, you can ask a participant directly in a study that uses primary data collection. This information would most likely not be available in secondary data drawn from prescription dispensing records provided by a pharmacy. Although manual chart reviews can provide a greater level of detail, they can be extremely time consuming and can quickly become cost prohibitive as the sample size increases and additional chart reviewers are required. Similarly, conducting interviews can be extremely time consuming despite the rich data they can generate. This relatively high cost in both time and money is often considered one of the limitations of using primary data.1,2


Secondary data consist of preexisting data that were collected for some other purpose, such as to answer a previous research question (e.g., a randomized controlled trial [RCT]) or to facilitate some process (e.g., hospital discharge records). Compared with primary data, secondary data may offer a distinct advantage in efficiency because extended time need not be devoted to data collection. Depending on the particular data source, secondary data may also offer advantages in sample size and generalizability.3 These strengths have led to the use of secondary data to study a wide variety of topics, including physician prescribing, drug utilization, and medication adherence; unintended drug effects (both adverse and beneficial); and health policy issues.4 There are, however, some potential limitations to using secondary data for research purposes. Although some secondary data may have been originally collected for research purposes or with research in mind (e.g., RCT datasets or patient registries), the most frequently acknowledged limitation is that secondary data are typically not collected for the ...
