# Chapter 3. Summarizing Data & Presenting Data in Tables & Graphs

- All observations of subjects in a study are evaluated on a scale of measurement that determines how the observations should be summarized, displayed, and analyzed.
- Nominal scales are used to categorize discrete characteristics.
- Ordinal scales categorize characteristics that have an inherent order.
- Numerical scales measure the amount or quantity of something.
- Means measure the middle of the distribution of a numerical characteristic.
- Medians measure the middle of the distribution of an ordinal characteristic or a numerical characteristic that is skewed.
- The standard deviation is a measure of the spread of observations around the mean and is used in many statistical procedures.
- The coefficient of variation is a measure of relative spread that permits the comparison of observations measured on different scales.
- Percentiles are useful to compare an individual observation with a norm.
- Stem-and-leaf plots are a combination of frequency tables and histograms that are useful in exploring the distribution of a set of observations.
- Frequency tables show the number of observations having a specific characteristic.
- Histograms, box plots, and frequency polygons display distributions of numerical observations.
- Proportions and percentages are used to summarize nominal and ordinal data.
- Rates describe the number of events that occur in a given period.
- Prevalence and incidence are two important measures of morbidity.
- Rates must be adjusted when populations being compared differ in an important confounding factor.
- The relationship between two numerical characteristics is described by the correlation.
- The relationship between two nominal characteristics is described by the risk ratio, odds ratio, and event rates.
- Number needed to treat is a useful indication of the effectiveness of a given therapy or procedure.
- Scatterplots illustrate the relationship between two numerical characteristics.
- Poorly designed graphs and tables mislead in the information they provide.
- Computer programs are essential in today’s research environment, and skills to use and interpret them can be very useful.
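
Several of the summary measures listed above can be computed in a few lines. The following is a minimal Python sketch, not a prescribed method: the observations and the 2×2 counts are hypothetical values chosen only for illustration, and the function name `two_by_two_summary` is likewise an assumption. It uses the sample standard deviation (n − 1 denominator) and defines the number needed to treat as the reciprocal of the absolute risk reduction.

```python
import statistics

# Hypothetical numerical observations (illustrative only, not study data).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)      # arithmetic mean
median = statistics.median(values)  # middle value of the sorted observations
sd = statistics.stdev(values)       # sample standard deviation (n - 1 denominator)
cv = sd / mean * 100                # coefficient of variation, as a percentage


def two_by_two_summary(events_exp, n_exp, events_ctl, n_ctl):
    """Summarize two groups with a binary outcome (hypothetical helper).

    events_exp / n_exp: events and group size in the experimental group.
    events_ctl / n_ctl: events and group size in the control group.
    """
    eer = events_exp / n_exp   # experimental event rate
    cer = events_ctl / n_ctl   # control event rate
    risk_ratio = eer / cer
    odds_exp = events_exp / (n_exp - events_exp)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    odds_ratio = odds_exp / odds_ctl
    arr = cer - eer            # absolute risk reduction
    nnt = 1 / arr              # number needed to treat
    return {
        "eer": eer,
        "cer": cer,
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
        "arr": arr,
        "nnt": nnt,
    }


# Hypothetical counts: 10 of 100 treated and 20 of 100 control subjects
# experience the event.
summary = two_by_two_summary(10, 100, 20, 100)

print(f"mean={mean}, median={median}, sd={sd:.2f}, cv={cv:.1f}%")
print({name: round(value, 3) for name, value in summary.items()})
```

Because the coefficient of variation divides the standard deviation by the mean, it is unit-free, which is what allows comparison across measurement scales. The risk ratio and odds ratio differ noticeably here because the event is not rare in either hypothetical group.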

Pulmonary embolism (PE) is a leading cause of morbidity and mortality. Clinical features are nonspecific, and a definitive diagnosis is often difficult to make. Attempts to simplify and improve the diagnostic process in evaluating patients for possible PE have been made by the introduction of two components: determination of pretest probability and d-dimer testing. Pretest probability is established by developing explicit criteria for the clinical probability of PE. d-dimer assays measure the formation of d-dimer when cross-linked fibrin in thrombi is broken down by plasmin. Elevated levels of d-dimer can be used to detect deep venous thrombosis (DVT) and PE. Some d-dimer tests are very sensitive for DVT, and a normal result can be used to exclude venous thromboembolism.

Kline and colleagues (2002) wished to develop a set of clinical criteria that would define a subgroup of patients with a pretest probability of PE of greater than 40% (high-risk group). These patients would be at too great a risk of experiencing a PE to have the diagnosis excluded on the basis of d...