# Lesson 12: Inference for Two Means: Paired Data

## 1 Lesson Outcomes

By the end of this lesson, you should be able to:

- Confidence Intervals for the mean of differences with dependent samples:
    - Calculate and interpret a confidence interval for the mean of differences given a confidence level.
    - Identify a point estimate and margin of error for the confidence interval.
    - Show the appropriate connections between the numerical and graphical summaries that support the confidence interval.
    - Check the requirements for the confidence interval.
- Hypothesis Testing for the mean of differences with dependent samples:
    - State the null and alternative hypotheses.
    - Calculate the test statistic, degrees of freedom, and p-value of the hypothesis test.
    - Assess the statistical significance by comparing the p-value to the α-level.
    - Check the requirements for the hypothesis test.
    - Show the appropriate connections between the numerical and graphical summaries that support the hypothesis test.
    - Draw a correct conclusion for the hypothesis test.

## 2 Example of Paired Data: Pre- and Post-test Scores

In education, it is very common for researchers to conduct studies in which they administer a pre-test, provide some instruction, and then give a post-test. The difference between the post- and pre-test scores is a measure of the student's progress. In this case, it would not make much sense to only look at the mean score on the pre-test and compare it to the mean score on the post-test.

This is called a **matched-pairs** design, or we say we have **dependent samples**. Matched-pairs (or **paired-data**) designs typically involve only one population, and a pair of observations is drawn on each individual selected for the sample. In the context of the educational study, the two observations are a student's scores on (1) the pre-test and (2) the post-test. If a student is selected to participate in the pre-test (i.e., they are selected to be part of group 1), they are automatically selected to participate in the post-test (i.e., they are automatically chosen to be in group 2).

There is a lot of merit in subtracting the individual scores and looking at the mean *gain*.
The researchers are not really interested in the students' knowledge before the instruction. The pre-test score is used as a baseline to measure how much was gained during the instruction. Looking at the difference removes the effect of the individual students' ability and measures their learning during the unit.

To analyze the data, the researchers first find the difference in the post- and pre-test scores. At that point, the data have been reduced to a list of numbers (representing the increase in scores.) Now, the researchers can conduct inference on the mean of these values. In other words, they can do a hypothesis test for the mean of the difference in the post- and pre-test scores.

A hypothesis test for two means with paired data (dependent samples) is conducted in the same way as a hypothesis test for a single mean with $\sigma$ unknown. The only exception is that the pairs of data must be subtracted before you start any computations. From a practical perspective, after you subtract, you apply the one-sample procedures you have already learned. So, there is nothing new that you need to learn to compute a confidence interval for two means with paired data, except how to subtract.

We will first explore an application of pre- and post-testing in a weight loss study.

## 3 Hypothesis Tests

### 3.1 Mahon's Weight Loss Study

**Background**

Annie Mahon and other researchers in Wayne Campbell's nutrition lab studied the weight loss of $n=27$ middle-aged women who consumed a prescribed low-calorie diet. The women's weights were recorded (in kilograms) at the beginning of the study and after the nine-week diet period. The data are given in the file Mahon. An excerpt of the data is given below.

Subject | Pre (kg) | Post (kg) |
---|---|---|
1 | 62.5 | 56.1 |
2 | 88.8 | 80.2 |
3 | 74.7 | 70.8 |
$\vdots$ | $\vdots$ | $\vdots$ |
26 | 76.3 | 73.8 |
27 | 82.1 | 77.9 |

Notice the structure of the data. The weight of each subject was measured before the study and at the conclusion of the study. Each person provided a pre-study weight and a post-study weight. Stated differently, the pre-study weights and the post-study weights are paired. For each row of data, both of these numbers came from the same person. When we collect two observations of the same measurement on each subject, we call it **paired data**. Sometimes paired data are called **dependent samples**.

- 1. The researchers measured the initial weights of the women prior to the study, even though they were not particularly interested in this value. What was the purpose of measuring the pre-study weights?

- The goal of the study is to determine how much the women's weights change as a result of the diet. The researchers must measure the women's weights at the beginning of the study so that they can subtract the initial (pre-study) weight of each woman from her final (post-study) weight.

**Computing New Variables**

Annie Mahon and her research team are interested in the difference of the weights after the study compared with before: $$ \text{Difference} = \text{Post} - \text{Pre} $$

Appending the column of differences to the table above, we have:

Subject | Post (kg) | Pre (kg) | Difference (kg) |
---|---|---|---|
1 | 56.1 | 62.5 | 56.1 $-$ 62.5 = -6.4 |
2 | 80.2 | 88.8 | 80.2 $-$ 88.8 = -8.6 |
3 | 70.8 | 74.7 | 70.8 $-$ 74.7 = -3.9 |
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
26 | 73.8 | 76.3 | 73.8 $-$ 76.3 = -2.5 |
27 | 77.9 | 82.1 | 77.9 $-$ 82.1 = -4.2 |

- Here is how you can subtract two columns of data in Excel:

- If the data are in columns, you might want to give a label to the new column, such as "Differences"
- Within Excel, click on the cell where you want the difference to be calculated. Typically, this will be adjacent to the two values you want to subtract.
- Type an equal sign (=)
- Click on the cell containing the first number to be subtracted
- Then type the subtraction sign (-)
- Now, click on the cell containing the second value to be subtracted.

- The following image shows the subtraction of the pre-study weights from the post-study weights of Mahon's volunteers (The post-study will not always be on the left, so pay attention to how you subtract, post - pre):

- When you click elsewhere, the difference will be computed.
- If the data are in columns, you can easily compute the difference for the remaining data values.

- Select the cell containing the difference you just computed.
- Copy the value in the cell ([Ctrl]-c is the keyboard shortcut for PCs.)
- Then simultaneously select all the cells in which you want the data to be pasted.
- Finally, paste the formula into these cells ([Ctrl]-v is the keyboard shortcut for PCs.)

- You have now computed the column of differences. Your file should look like this when you are finished:

- If you want to remove the formulas to make it easier to paste the differences into the QuantitativeInferentialProcedures.xls file, do the following:

- Now, the cells contain the subtracted differences, rather than the equation for these differences. You are now ready to perform the calculations for a hypothesis test. The hypothesis test will be conducted using the file QuantitativeInferentialProcedures.xls.
- Copy the first 2 columns of data that you generated above
- Open the file QuantitativeInferentialProcedures.xls
- Click on the tab "Paired Sample t-test", which is located at the bottom of the Excel window
- Paste the data in columns A, B, and C, with the first data value in row 5

- Your file should look like this:

The researchers are not interested in the weights of the women themselves; they are more interested in the *change* in the women's weights. This will give them a measure of the effectiveness of the low-calorie diet. Notice that in this weight loss study, the change in the weights is negative. This indicates that the final weight was lower than the initial weight.
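For readers who prefer to check the subtraction outside a spreadsheet, the same step can be sketched in Python. This is only an illustration that uses the five rows shown in the excerpt above (not the full Mahon file); the variable names `excerpt` and `differences` are made up for this sketch.

```python
# Excerpt of the Mahon data shown in the table above: (subject, pre, post) in kg.
excerpt = [
    (1, 62.5, 56.1),
    (2, 88.8, 80.2),
    (3, 74.7, 70.8),
    (26, 76.3, 73.8),
    (27, 82.1, 77.9),
]

# Difference = Post - Pre, rounded to one decimal place to match the data.
differences = [round(post - pre, 1) for _, pre, post in excerpt]

for (subject, pre, post), d in zip(excerpt, differences):
    print(f"Subject {subject}: {post} - {pre} = {d}")
```

Every difference in the excerpt is negative, which matches the weight loss described above.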

- 2. Following the directions above, compute the difference in the women's weights by subtracting the pre-study weights from the post-study weights using software. Call this new column *Difference*.

- 3. What is the mean of the values in the *Difference* column?

$ -6.80 \text{ kg} $

- 4. Interpret the value you calculated in Question 3.

- The mean weight change experienced by the women in the study was $-6.80$ kg. In other words, the mean weight loss was $6.80$ kg.

**Relationship to a One Sample t-test**

After you have subtracted the pre-study weights from the post-study weights, you are left with a column of differences. We will denote the pre-study weights by $ x_1 $ and the post-study weights by $ x_2 $. Then, the differences can be denoted as $ d = x_2 - x_1 $. The difference, $d$, is defined as the change in the volunteer's weight during the study.

After computing the differences, we do not use the data for the individual groups at all. The researchers are not interested in the values of the women's weights at the beginning of the study or at the end of the study. They are mostly interested in the difference in the weights after the participants complete the study.

After we subtract, we can conduct a hypothesis test to determine if the mean of the differences is less than zero. We use the symbol $ \mu_d $ to represent the true mean difference in the weights of the women who follow the diet prescribed in this study. The null hypothesis is that the true mean difference is zero ($\mu_d = 0$). The alternative hypothesis is that there is a decrease in the weights, in other words, that the true mean difference is less than zero ($\mu_d < 0$).

Notice that this is essentially a one-sample t-test where the data are the differences in the women's weights. We have one column of data, the differences. We are testing whether the true mean difference is less than zero. After subtracting, a test for a difference of two means with paired data is just like a test for one mean with $ \sigma $ unknown.

In the hypothesis test, we will refer to the variable representing the differences as $d$. We will use this notation throughout the hypothesis test. For example, the true population mean will be labeled $\mu_d$ and the sample mean will be labeled $\bar d$. The sample standard deviation of the differences is denoted $s_d$.
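Using this notation, the one-sample $t$ statistic applied to the differences can be written out as follows. This is the standard one-sample formula restated in terms of $\bar d$, $s_d$, and $n$ (the number of pairs), with $0$ as the hypothesized mean difference under $H_0$:

$$ t = \frac{\bar d - 0}{s_d / \sqrt{n}}, \qquad df = n - 1 $$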

**Hypothesis Test for Mahon's Weight Loss Data**

**Summarize the relevant background information**

Twenty-seven women participated in a nine-week weight loss study. During the study period, the participants were provided a reduced calorie diet. Their weights were recorded at the beginning of the study and nine weeks later. The difference of the weights is defined as the post-study weight minus the pre-study weight. The researchers expected that the mean difference in the weights would be negative; in other words, that the women would tend to lose weight.

**State the null and alternative hypotheses and the level of significance**

$$ \begin{align} H_0: &~~ \mu_d=0 \\ H_a: &~~ \mu_d < 0 \end{align} $$

We will use the $ \alpha = 0.05 $ level of significance.

**Describe the data collection procedures**

The women's weights were recorded at the beginning of the study. The women were provided a reduced calorie diet for nine weeks. Then, their weights were measured again at the end of the study. A calibrated scale was used to provide an accurate weight.

**Give the relevant summary statistics**

From the Excel output illustrated above, we get the following:

$$ \begin{align} \bar d &= -6.80 \\ s_d &= 3.17 \\ n &= 27 \end{align} $$

The mean and standard deviation are rounded to one decimal place more than the original data.

**Make an appropriate graph (histogram) to illustrate the data**

This histogram was created in Excel with seven bins:

**Verify the requirements have been met**

Like the one-sample t-test, this procedure is robust, meaning that it is not very sensitive to violations of the requirements. If they are violated, it will probably still give reasonably good results.

The requirements for this procedure are the same as the requirements for a one-sample t-test:

- the data represent a simple random sample from the population
- the mean of the differences follows a normal distribution

The subjects were recruited via advertisements for a research study. The participants volunteered to participate. It is not a simple random sample of all middle-aged women, but there is nothing about the selection of the sample that would invalidate the results.

From a practical perspective, it is impossible to get a simple random sample of people in the general population. When research trials are conducted, people must volunteer to participate. This can lead to a selection bias, but it is usually negligible.

The requirement of normality is satisfied for Mahon's data. The differences appear to follow a normal distribution, so $\bar d$ will be approximately normal.

The sample size (n=27) is fairly large. The histogram shows a mound shape. Here is a Q-Q plot of the differences:

With this Q-Q plot, we could conclude that the data follow a normal distribution. Even if we had had a small sample size, we could still conduct this test.

**Give the test statistic and its value**

The test statistic for a test involving paired data when $\sigma$ is unknown is a $t$. For this situation, the value is: $$ t= \frac{-6.8 - 0}{3.17/\sqrt{27}} =-11.145 $$

**State the degrees of freedom**

$$ df = 26 $$
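The arithmetic behind the test statistic and degrees of freedom can be reproduced with a few lines of Python. This is a minimal sketch that starts from the rounded summary statistics reported above, so its result may differ in the last digit from the value of $-11.145$, which was presumably computed from the unrounded data.

```python
import math

# Summary statistics reported above for the Mahon differences (Post - Pre).
d_bar = -6.80   # sample mean of the differences, kg
s_d = 3.17      # sample standard deviation of the differences, kg
n = 27          # number of pairs
mu_0 = 0        # hypothesized mean difference under H0

standard_error = s_d / math.sqrt(n)
t = (d_bar - mu_0) / standard_error
df = n - 1

print(f"t = {t:.3f}, df = {df}")
```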

**Mark the test statistic and $P$-value on a graph of the sampling distribution**

The test statistic, $t$, is labeled on the horizontal axis. The $P$-value is the area to the left of $t$ under the curve. This area is so small, it is not illustrated on this plot. Please note that this image was taken from the normal probability applet. The test statistic for two means with paired data is a $t$. This image was designed for a $z$-curve. This is not the right image for this procedure. For this reason, we labeled the image, "For illustrative purposes only".

It is important to note that only the left tail is shaded, even though we cannot see it in this illustration.

**Find the $P$-value and compare it to the level of significance**

$$ P\text{-value} = 1.06 \times 10^{-11} < 0.05 = \alpha $$
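To see where a $P$-value this small comes from, the tail area under the $t$ curve can be approximated numerically. The sketch below is not the method used by the course spreadsheet; it is one possible approach that integrates the Student's $t$ density with Simpson's rule using only the Python standard library. The helper names `t_pdf`, `simpson`, and `upper_tail` are invented for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def upper_tail(t, df):
    """Approximate P(T > t) for Student's t with df degrees of freedom."""
    if t < 0:
        return 1.0 - upper_tail(-t, df)
    if t < 1:
        return 0.5 - simpson(lambda x: t_pdf(x, df), 0.0, t)
    # Substitute u = 1/x so the infinite tail becomes the finite interval [0, 1/t].
    def g(u):
        return 0.0 if u == 0 else t_pdf(1.0 / u, df) / (u * u)
    return simpson(g, 0.0, 1.0 / t)

# Left-tailed P-value for the Mahon test: P(T < -11.145) with df = 26.
# By symmetry of the t distribution this equals P(T > 11.145).
p_value = upper_tail(11.145, 26)
print(f"P-value = {p_value:.3g}")
```

The same function gives a right-tailed $P$-value directly as `upper_tail(t, df)`, and a two-tailed $P$-value as `2 * upper_tail(abs(t), df)`.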

**State your decision**

Since the $P$-value is less than the level of significance, we reject the null hypothesis.

**Present your conclusion in an English sentence, relating the result to the context of the problem**

There is sufficient evidence to suggest that the reduced calorie diet used in this study results in weight loss for middle-aged women.

### 3.2 Nosocomial Infections

**Summarize the relevant background information**

Matched-pairs designs are not just used in pre- and post-test situations. They are often used in situations where it is not possible to randomly assign subjects to groups (for example, by a coin toss.) Nosocomial (pronounced: NO-suh-KOH-MEE-uhl) infections are infections that occur in hospitals, but are not a result of the original condition. An example of a nosocomial infection is when a heart attack patient develops a staph infection at the site of an IV injection. The infection was not caused by the heart attack, but it was acquired in the hospital. Nosocomial infections are very dangerous and may result in longer recovery times or increased death rates.

Health care providers suspect that nosocomial infections increase the amount of time required to recover from an illness or injury. In controlled experiments, subjects (e.g., patients) are randomly assigned to treatments. However, it is not ethical to give patients a nosocomial infection in order to determine if it increases the duration of their hospital stay! At best, we can collect information on the duration of hospital stays for patients who acquire nosocomial infections and compare them to the duration of the stays for patients who do not.

There are many factors that affect the amount of time that a patient will need to stay in the hospital, including: nature of illness, types of procedures conducted, overall health, gender, age, etc. How can health care practitioners assess the effect of a nosocomial infection in the presence of so many other variables?

One way is to match a patient who develops a nosocomial infection with another one who has similar characteristics (illness, procedures, health, gender, age group, etc.) but does not develop a nosocomial infection. Now, the patients are matched into pairs with similar characteristics, where the principal difference between the members of each pair is whether or not they acquired a nosocomial infection.

By pairing the patients according to specific characteristics, the researchers can now subtract to observe a difference in their recovery times. In this way, it is possible to assess if nosocomial infections increase the mean duration of a hospital stay. Some researchers conducted such a study in which 52 pairs of patients were matched based on clinical characteristics. A patient with a nosocomial infection was matched as closely as possible to a similar case where there was no nosocomial infection. Patients who died were excluded from the study. The lengths of the hospital stays (in days) for these patients are given in the file NosocomialInfections.

The difference, $d$, is defined as the duration of the hospital stay of the individual in the pair with the nosocomial infection minus the duration of the stay for the individual who did not get a nosocomial infection: $$ \text{Difference} = \text{Infected} - \text{NotInfected} $$ After computing the differences, we do not use the data for the individual groups at all. In fact, after we subtract, the hypothesis test is conducted (essentially) like a one-sample test for a single mean with $\sigma$ unknown.

- 5.
**State the null and alternative hypotheses and the level of significance**

$ \begin{align} H_0: &~~ \mu_d = 0 \\ H_a: &~~ \mu_d > 0 \\ \end{align} $

- The level of significance was not specified in the problem. You can choose any value you wish. The most common choices are 0.05, 0.01 and 0.1. We will illustrate this example with $\alpha = 0.05$.

- 6.
**Describe the data collection procedures**

- Data were collected by matching hospital records of individuals who were admitted to the hospital. Patient records were matched based on their overall health and the reason they were admitted to the hospital. In each pair, one patient developed a nosocomial infection and one did not. Since the characteristics of the patients in the first group determined which patients would be paired with them in the second group, the data represent dependent samples.

- 7.
**Give the relevant summary statistics**

$ \begin{align} \bar d &= 11.38 \\ s_d &= 13.83 \\ n &= 52 \end{align} $

- 8.
**Make an appropriate graph to illustrate the data**

- Present a graph showing the differences.

- 9.
**Verify the requirements have been met**

- The data represent a random sample of patients, who have been matched based on their overall health and their current ailment. The sample size is large, so the mean of the differences $ \bar d $ will be approximately normally distributed.

- 10.
**Give the test statistic and its value**

- The test statistic for a test for two means with paired data is a $t$.

$$ t = 5.935 $$
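As a check, substituting the rounded summary statistics from Question 7 into the one-sample formula gives nearly the same value. The small gap from $5.935$ is presumably because the reported statistic was computed from the unrounded data:

$$ t = \frac{\bar d - 0}{s_d/\sqrt{n}} = \frac{11.38 - 0}{13.83/\sqrt{52}} \approx 5.93 $$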

- 11.
**State the degrees of freedom**

$ df = 51 $

- 12.
**Mark the test statistic and $P$-value on a graph of the sampling distribution**

- Your sketch should show the value of $t=5.935$ on the horizontal axis, with only the tiny area to the right of 5.935 shaded.

- 13.
**Find the $P$-value and compare it to the level of significance**

$ P\textrm{-value}=\frac{\textrm{Sig. (2-tailed)}}{2}=\frac{2.592\times 10^{-7}}{2}=1.296 \times 10^{-7} = 0.0000001296 < 0.05 = \alpha $

- 14.
**State your decision**

- Since the $P$-value is less than the level of significance, we reject the null hypothesis.

- 15.
**Present your conclusion in an English sentence, relating the result to the context of the problem**

- There is sufficient evidence to suggest that the mean duration of hospital stays is increased when a patient develops a nosocomial infection.

### 3.3 Additional Worked Examples

Viewing additional examples can help your understanding. Two more examples of hypothesis tests follow.

#### 3.3.1 Effect of Stressful Classical Music on Your Metabolism

**Summarize the relevant background information**

Obesity is a growing problem worldwide. Many scientists are seeking creative solutions to trim down this epidemic. Reduced energy expenditure is a potential cause of obesity.

Resting Energy Expenditure (REE) is defined as the amount of energy a person would use if resting for 24 hours. In essence, this is the amount of energy that a person's body will consume if they do not do any physical activity. REE is measured in kilojoules per day (kJ/d).

REE accounts for approximately 70 to 80% of all energy that a person will expend in a day. If researchers can find simple, enjoyable activities that will increase REE, it may be possible to minimize the spread of obesity around the world.

Ebba Carlsson and other researchers in Sweden investigated whether listening to stressful classical music increases a person's REE. Each subject's REE was measured during silence and again while listening to stressful classical music. Data representing their results are given in the file REE-ClassicalMusic.

Notice that this is not a pre- and post-test, but it is still a test involving paired data. Two REE measurements were made for each subject: (1) in silence ($REE_1$) and (2) while listening to stressful classical music ($REE_2$).

**State the null and alternative hypotheses and the level of significance**

Since we are testing for an increase in the mean REE, we let $d = REE_2 - REE_1$. Our alternative hypothesis will be that $ \mu_d > 0 $. The null and alternative hypotheses are: $$ \begin{align} H_0: &~~ \mu_d = 0 \\ H_a: &~~ \mu_d > 0 \end{align} $$

We will use the $ \alpha = 0.1 $ level of significance.

In order to get the correct $P$-value, we need to indicate the proper alternative hypothesis in Excel. In the cell next to "Type of Test", choose "Greater Than" in the drop-down menu in the file QuantitativeInferentialProcedures.xls.

**Describe the data collection procedures**

The REE was measured by a technique called "indirect calorimetry" using a Deltatrac II Metabolic Monitor. The REE was measured twice for each person: while the person was (1) resting in silence or (2) resting while listening to stressful classical music. These trials were conducted in random order. Some of the subjects had the "silence" treatment first, and others had the "stressful" treatment first.

- 16. We will define the difference in REE by subtracting the REE in silence from the REE while listening to stressful classical music. If listening to stressful classical music actually increases the mean REE, would you expect the value of the difference to be typically positive or negative?

- If the REE is higher while listening to classical music than while resting in silence, we would expect the value of the difference to be positive. In other words the following difference would tend to be positive:

$$ \text{Difference} = \text{Stressful} - \text{Silence} $$

- 17. Compute the difference in REE for each person. What is the value of the difference for the first person listed in the data file?

- 50 kJ/d

- Here is an illustration of an excerpt of the data in Excel:

**Give the relevant summary statistics**

- 18. Report the number of subjects ($n$), the mean difference ($\bar d$), and the standard deviation of the differences ($s_d$).

- The following image illustrates the Excel file used to get the summary statistics.

$$ \begin{align} n&=40\\ \bar d &= 20~\text{kJ/d}\\ s_d &= 160~\text{kJ/d} \end{align} $$

- 19.
**Make an appropriate graph to illustrate the data**

**Verify the requirements have been met**

We can consider the sample representative of the population. The "difference" data appear to follow a normal distribution. This is illustrated in the following Q-Q plot:

The requirements for this test appear to have been satisfied.

- 20.
**Give the test statistic and its value**

- The test statistic for a test for two means with paired data is a $t$.

$$t=0.793$$
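For reference, plugging the rounded summary statistics from Question 18 into the same formula gives approximately the same value. The reported $0.793$ presumably reflects the unrounded mean and standard deviation:

$$ t = \frac{\bar d - 0}{s_d/\sqrt{n}} = \frac{20 - 0}{160/\sqrt{40}} \approx 0.79 $$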

- 21.
**State the degrees of freedom**

$ df = 39 $

- 22.
**Mark the test statistic and $P$-value on a graph of the sampling distribution**

- The test statistic is plotted on the horizontal axis. The $P$-value is shaded in blue. This image was copied from the normal probability applet. Since the test statistic is a $t$, not a $z$, this is not the exact image for this procedure. We label the image, "For illustrative purposes only":

- 23.
**Find the $P$-value and compare it to the level of significance**

$ P\textrm{-value}=0.2163 > 0.1 = \alpha $

- Notice that the $P$-value is half as large for a one-tailed test as it would have been for a two-tailed test. Since we have a one-sided alternative hypothesis, we are only interested in the right tail of the $t$-distribution.

- 24.
**State your decision**

- Since the $P$-value is greater than the level of significance, we fail to reject the null hypothesis.

- 25.
**Present your conclusion in an English sentence, relating the result to the context of the problem**

- There is insufficient evidence to suggest that the mean REE is *increased* by listening to stressful classical music. Lying still and listening to stressful classical music is probably not the best way to increase your metabolism!

Note that we did not say we "accept" the null hypothesis. We do not know that listening to stressful classical music has no effect on a person's REE. Based on the data available to us, we were not able to reject the null hypothesis that this type of music does not increase the mean REE.

#### 3.3.2 Cost of Airline Tickets

**Summarize the relevant background information**

Pressures of supply and demand act directly on the prices for an airline ticket. As the seats available on the plane begin to fill, airlines raise the price. If seats on a flight do not sell well, an airline may discount the tickets or even cancel the flight. Business travelers frequently demand travel booked on short notice. They must pay the current price. Typically, tourists book their flights well in advance, hoping to buy tickets before the price rises. We will consider the cost of a one-way ticket from London's Heathrow Airport to a variety of destinations in Europe.

Allie Henrich, a BYU-Idaho student, compared the lowest published ticket prices of one-way flights from Heathrow to various destinations in Europe. Using Travelocity.com, she recorded the lowest published fares for nonstop midweek flights booked either 14 days in advance or 90 days in advance. The prices (in US dollars) are given in the file DirectFlightCosts. Notice that for some destinations, flights were not available.

The data are paired because the cost was measured twice for each city. The 14-day ticket price is paired with the 90-day price for each city.

We will conduct a hypothesis test to determine if there is a difference in the cost of the nonstop flights when tickets are purchased 14 days in advance compared to 90 days in advance. We will use the 0.01 level of significance.

- 26.
**State the null and alternative hypotheses and the level of significance**

$ \begin{array}{l} H_0:\mu_d = 0 \\ H_a:\mu_d \ne 0 \\ \alpha = 0.01 \end{array} $

- 27.
**Describe the data collection procedures**

- The data were collected using the website Travelocity.com. The lowest advertised ticket prices were recorded for nonstop flights from Heathrow Airport. All prices were recorded in US dollars. Data are provided on the cost of a nonstop ticket purchased with 14 days' notice compared to 90 days' notice.
- We will compute the difference in the costs for each destination. Some destinations did not include both flight options. In this case, the difference is not computed and the data are omitted from the analysis.

- 28.
**Give the relevant summary statistics**

- The differences were computed by subtracting the 90-day price from the 14-day price. For example, for the Adnan Menderes Airport, we have

$$ 202.09 - 234.19 = -32.10 $$

- You may have chosen to subtract in the opposite order. If so, you would have obtained a value of $ 32.10 $ dollars.

$$ \begin{align} n&=87\\ \bar d &= 24.612\\ s_d &= 136.267 \end{align} $$

- 29.
**Make an appropriate graph to illustrate the data**

- The histogram is presented with 16 bins.

- If you defined your difference as the 90-day price minus the 14-day price, then with 16 bins you would have the following histogram:

- 30.
**Verify the requirements have been met**

- The sample size is large, so we can conclude that the sample mean, $ \bar d $, is approximately normally distributed.

- 31.
**Give the test statistic and its value**

- The test statistic for a test for two means with paired data is a $t$.

$$ t=1.685 $$

- If you computed the difference as the 90-day price minus the 14-day price, the value of your test statistic is $ -1.685 $.

- 32.
**State the degrees of freedom**

$ df = 86 $

- 33.
**Mark the test statistic and $P$-value on a graph of the sampling distribution**

- The test statistic is plotted on the horizontal axis. The $P$-value is shaded in blue. This image was copied from the normal probability applet. Since the test statistic is a $t$, not a $z$, this is not the exact image for this procedure. We label the image, "For illustrative purposes only":

- 34.
**Find the $P$-value and compare it to the level of significance**

$ P\textrm{-value}= 0.096 > 0.01 = \alpha $

- The $P$-value will be 0.096, no matter what order you subtracted the values.
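The reason the order of subtraction does not matter for a two-sided test can be illustrated with a short Python sketch. Reversing the order flips the sign of every difference, so $\bar d$ and $t$ change sign while $s_d$ and $|t|$ stay the same, and a two-tailed $P$-value depends only on $|t|$. The fares below are hypothetical values chosen for illustration; they are not taken from the DirectFlightCosts file, and `paired_t` is a made-up helper name.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic for H0: mu_d = 0, where d = first - second for each pair."""
    d = [a - b for a, b in zip(first, second)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

# Hypothetical fares (US dollars) for five destinations; not from DirectFlightCosts.
price_14_day = [202.0, 310.0, 155.0, 420.0, 275.0]
price_90_day = [234.0, 280.0, 150.0, 365.0, 290.0]

t_forward = paired_t(price_14_day, price_90_day)   # 14-day minus 90-day
t_reverse = paired_t(price_90_day, price_14_day)   # 90-day minus 14-day

print(f"t (14-day - 90-day) = {t_forward:.3f}")
print(f"t (90-day - 14-day) = {t_reverse:.3f}")
```

The two printed values have the same magnitude and opposite signs, which mirrors the $1.685$ and $-1.685$ reported above.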

- 35.
**State your decision**

- Since the $P$-value is greater than the level of significance, we fail to reject the null hypothesis.

- 36.
**Present your conclusion in an English sentence, relating the result to the context of the problem**

- There is insufficient evidence to suggest that there is a difference in the mean cost of airline tickets purchased 14 days versus 90 days in advance.

## 4 Confidence Intervals

After the differences between two paired data sets have been calculated, we can create a confidence interval for the true mean of the differences. To do this, we follow the instructions for creating a confidence interval for one mean with $ \sigma $ unknown, but we use the column of differences as the data set.

**To calculate confidence intervals for the true mean of the difference in Excel, do the following**:

- Follow the directions given **above** for creating a new column containing the differences between two variables.
- Open the file QuantitativeInferentialProcedures.xls
- Click on the tab labeled "One-sample t-test"
- Enter the values of the differences you calculated.
- Set the desired confidence level.

The requirements for creating a confidence interval for the difference of means are the same as the requirements for the hypothesis test. We assume:

- A simple random sample was drawn from the population
- The mean of the differences is normally distributed
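The interval has the usual one-sample form, point estimate plus or minus margin of error: $\bar d \pm t^* \cdot s_d/\sqrt{n}$, with $df = n - 1$. As a hedged illustration, the sketch below applies that formula to the summary statistics from Mahon's weight loss study in Section 3.1. The resulting interval is not reported in this lesson, and the critical value `t_star = 2.056` is an assumption read from a standard $t$ table for 95% confidence with 26 degrees of freedom.

```python
import math

# Summary statistics for the Mahon differences (Post - Pre), from Section 3.1.
d_bar, s_d, n = -6.80, 3.17, 27

# Assumption: critical value t* for 95% confidence with df = 26, read from a t table.
t_star = 2.056

margin_of_error = t_star * s_d / math.sqrt(n)
lower, upper = d_bar - margin_of_error, d_bar + margin_of_error

print(f"point estimate = {d_bar:.2f} kg, margin of error = {margin_of_error:.2f} kg")
print(f"95% CI for mu_d: ({lower:.2f}, {upper:.2f}) kg")
```

Both endpoints are negative, which is consistent with the conclusion of the hypothesis test that the diet is associated with weight loss.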

### 4.1 Mountain Pine Beetle Attacks

**Summarize the relevant background information**

Mountain pine beetles are small insects that bore into the bark of trees. The female beetles that first infest the tree emit pheromones to attract other beetles. In response to the pheromones, many beetles bore into the tree and ultimately kill it. The insects can destroy large tree stands within one year.

Lodgepole pines (*Pinus contorta* Dougl. ex Loud.) are particularly susceptible to mountain pine beetle (*Dendroctonus ponderosae* Hopkins) outbreaks. The image to the right shows the destruction that can be caused by these insects. The large brown patches are pines that have been killed by the beetles.

- 37. The mountain pine beetle threatens many forests in the United States. These tiny insects are only 0.5 cm long, about the size of a grain of rice. This photo of a mountain pine beetle is magnified greatly. These little creatures can destroy a large, healthy forest. Can you think of a spiritual parallel?

- There are many great diverse answers that could be presented. Please share your thoughts with someone in your group.

**Describe the data collection procedures**

In a study conducted in the Arapaho National Forest in Colorado, researchers from the USDA Forest Service studied the effect of pine beetle outbreaks on the average number of trees in an area. The researchers counted the number of established trees per hectare before a pine beetle outbreak and seven years after an outbreak. (One hectare is an area of 100 meters by 100 meters.) Data representative of their observations are given in the file PineBeetle.

**Give the relevant summary statistics**

- 38. Find the mean and standard deviation of the number of trees per hectare *before* the pine beetle outbreak. How would you describe the density of the trees in this forest? Express this in terms that make sense to you.

- The mean was 1028.41 trees per hectare and the standard deviation was 57.03 trees per hectare. Note that the values were rounded to two decimal places, since the data were given to one decimal place.
- Answers will vary regarding the description of the density. Here is one possible response.
- There is roughly one tree every $ \frac{100 \times 100}{1028.41} = 9.7 $ square meters. In other words, on average, each tree would have a space of about $ \sqrt{9.7} = 3.1 $ meters long and 3.1 meters wide in which to grow.

- 39. Repeat question 38 for the number of trees per hectare *after* the outbreak.

- The mean was 592.87 trees per hectare and the standard deviation was 45.31 trees per hectare.
- Answers will vary regarding the description of the density. Here is one possible response.
- The trees are about half as dense as they were before the pine beetle infestation. About $ \frac{592.87}{1028.41} = 0.58 = 58\% $ of the trees remained, so $ 100\% - 58\% = 42\% $ of the trees were killed by the pine beetles!
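
The arithmetic in questions 38 and 39 can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are made up for this example, and the means are the rounded values reported above rather than values read from the PineBeetle file.

```python
import math

HECTARE_M2 = 100 * 100  # one hectare is 100 m by 100 m

mean_before = 1028.41  # mean trees per hectare before the outbreak
mean_after = 592.87    # mean trees per hectare seven years after the outbreak

# Average ground area available to each tree before the outbreak
area_per_tree = HECTARE_M2 / mean_before
# Side length of a square plot with that area
side_length = math.sqrt(area_per_tree)
# Share of the original tree density that remained after the outbreak
fraction_remaining = mean_after / mean_before

print(f"Area per tree before: {area_per_tree:.1f} square meters")
print(f"Side of equivalent square: {side_length:.1f} meters")
print(f"Fraction remaining: {fraction_remaining:.0%}")
print(f"Fraction lost: {1 - fraction_remaining:.0%}")
```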

- 40. Create a new column of data in the file PineBeetle by subtracting the "before" counts from the "after" counts:

$$ \text{Difference} = \text{After} - \text{Before} $$

- For these differences, report the mean, the standard deviation, and the sample size.

Summary Statistics: | |
---|---|
Mean: | $ \bar d = -435.535 $ trees per hectare |
Standard Deviation: | $ s_d = 17.082 $ trees per hectare |
Sample Size: | $ n = 170 $ |

**Make an appropriate graph to illustrate the data**

- 41. Create a histogram of the differences in the density of the trees.

- 42. **Verify the requirements have been met.**

- a. It is not explicitly stated, but we assume the plots of land were selected at random.

- b. For the pine beetle data, the histogram indicates that the differences are not normally distributed. This can be confirmed with a Q-Q plot. However, since the sample size is large ($ n = 170 $), the Central Limit Theorem allows us to treat the sample mean of the differences as approximately normally distributed.

- The requirements for creating the confidence interval seem to be satisfied.

- 43. **Find the confidence interval**. Use the 95% level of confidence.

$ (-438.121,~ -432.948) $
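
The interval above can be approximately reproduced from the summary statistics in question 40. The following is a minimal sketch, not the spreadsheet's own calculation: the function name `paired_t_interval` is hypothetical, and the critical value `t_star` is an assumed value looked up for 95% confidence with 169 degrees of freedom. Because the summary statistics are rounded, the last decimal place may differ slightly from the spreadsheet output.

```python
import math

def paired_t_interval(d_bar, s_d, n, t_star):
    """Return (lower, upper) for d_bar +/- t_star * s_d / sqrt(n)."""
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

# Summary statistics of the differences (After - Before), from question 40
d_bar = -435.535
s_d = 17.082
n = 170

# Assumption: critical value for 95% confidence with df = n - 1 = 169,
# taken from a t-table or statistical software (approximately 1.974).
t_star = 1.974

lower, upper = paired_t_interval(d_bar, s_d, n, t_star)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```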

**Present your observations in an English sentence, relating the result to the context of the problem**

Interpret the confidence interval we created.
We are 95% confident that the true mean change in the number of trees per hectare after a pine beetle outbreak is between $-438.121$ and $-432.948$ trees per hectare. Stated differently, we are 95% confident that the true mean *decrease* in the number of trees per hectare after a pine beetle outbreak is between $432.948$ and $438.121$ trees per hectare.

### 4.2 Sleep Inducing Drugs

**Summarize the relevant background information**

In William Sealy Gosset's landmark paper on the $t$-distribution, he cites data on a sleep-inducing drug. In a paper published in 1905, Arthur R. Cushny and A. Roy Peebles reported the effect of Lævorotary Hyoscyamine Hydrobromate (L-Hyoscyamine) on the length of time that people sleep before waking. The primary research question is: does L-Hyoscyamine impact the mean amount of time that people sleep? We will compute a 90% confidence interval for the true mean difference in the times.

**Describe the data collection procedures**

Eleven subjects were included in the study. At the start of the study, the researchers observed the average length of time that each of the participants slept before waking. Later, each subject was given 0.6 mg of L-Hyoscyamine and the duration of uninterrupted sleep was again measured.

The difference in the amount of time each person slept was computed by subtracting the sleep duration with no drug from the sleep duration when taking the drug. The data are summarized in the table below.

Subject | Control (no drug), hours | L-Hyoscyamine, hours | Difference, hours |
---|---|---|---|
1 | 0.6 | 1.3 | 0.7 |
2 | 3 | 1.4 | -1.6 |
3 | 4.7 | 4.5 | -0.2 |
4 | 5.5 | 4.3 | -1.2 |
5 | 6.2 | 6.1 | -0.1 |
6 | 3.2 | 6.6 | 3.4 |
7 | 2.5 | 6.2 | 3.7 |
8 | 2.8 | 3.6 | 0.8 |
9 | 1.1 | 1.1 | 0 |
10 | 2.9 | 4.9 | 2 |
11 | - | 6.3 | - |

Notice that the "control" data for Subject #11 is missing. It is not possible to compute a difference for this person, so their data will be omitted from our analysis. For this analysis, we will use the remaining $ n=10 $ observations.

You may find it easier to copy and paste the data from the following table. The last row has been omitted.

Increase in hours of sleep |
---|
0.7 |
-1.6 |
-0.2 |
-1.2 |
-0.1 |
3.4 |
3.7 |
0.8 |
0 |
2 |

**Give the relevant summary statistics**

- 44. Report the mean, standard deviation, and sample size for the differences.

Summary Statistics: | |
---|---|
Mean: | $ \bar d = 0.75 $ hours |
Standard Deviation: | $ s_d = 1.79 $ hours |
Sample Size: | $ n = 10 $ |
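
These summary statistics can also be checked outside the spreadsheet. The sketch below uses Python's standard `statistics` module on the ten differences listed earlier; the variable names are chosen for this example only.

```python
import statistics

# Increase in hours of sleep for subjects 1-10 (subject 11 omitted)
differences = [0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0, 2]

n = len(differences)
d_bar = statistics.mean(differences)
# stdev() computes the sample standard deviation (divides by n - 1)
s_d = statistics.stdev(differences)

print(f"n = {n}")
print(f"mean of differences = {d_bar:.2f} hours")
print(f"standard deviation of differences = {s_d:.2f} hours")
```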

**Make an appropriate graph to illustrate the data**

- 45. Create a histogram of the differences in the hours of sleep.

- Here is a histogram of the data with 6 bins:

- 46. **Verify the requirements have been met.**

- a. We assume the subjects represent a random sample from the population.

- b. A Q-Q plot of the differences indicates that it is reasonable to conclude that the data are normally distributed, even though the sample size is small.

- The requirements for creating the confidence interval seem to be satisfied.

- 47. **Find the confidence interval**. Use the 90% level of confidence.

$ (-0.287, 1.787) $
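
As a check, the interval can be worked by hand from the summary statistics. This sketch assumes a critical value of $t^* \approx 1.833$ for 90% confidence with $df = n - 1 = 9$ (read from a $t$-table, not given in this lesson) and uses the less-rounded standard deviation $s_d \approx 1.789$:

$$ \bar d \pm t^* \cdot \frac{s_d}{\sqrt{n}} = 0.75 \pm 1.833 \cdot \frac{1.789}{\sqrt{10}} \approx 0.75 \pm 1.037 $$

The point estimate is $0.75$ hours and the margin of error is about $1.037$ hours, which gives approximately $(-0.287,~1.787)$.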

- 48. **Present your observations in an English sentence, relating the result to the context of the problem**

- We are 90% confident that the true mean difference in the amount of time people sleep by taking this drug compared to not taking the drug is between $-0.287$ hours and $1.787$ hours.
- Notice that 0 is in the confidence interval. This suggests that 0 is a plausible value for the mean difference in the times. In other words, the data do not show that the drug affects the amount of time people sleep. L-Hyoscyamine does not appear to be an effective sleep aid, at least at this dosage.

## 5 Summary

- The key characteristic of **dependent samples** (or **matched pairs**) is that knowing which subjects will be in group 1 determines which subjects will be in group 2.

- We use slightly different variables when conducting inference using dependent samples:

  - Group 1 values: $x_1$
  - Group 2 values: $x_2$
  - Differences: $d$
  - Population mean: $\mu_d$
  - Sample mean: $\bar d$
  - Sample standard deviation: $s_d$

- When conducting hypothesis tests using dependent samples, the null hypothesis is always $\mu_d=0$, indicating that there is no change between the first population and the second population. The alternative hypothesis can be left-tailed ($<$), right-tailed ($>$), or two-tailed ($\ne$).
