# Lesson 14: Inference for Several Means (ANOVA)


## 1 Lesson Outcomes

By the end of this lesson, you should be able to:

• Hypothesis testing for several means (ANOVA):
  • State the null and alternative hypotheses.
  • Calculate the test statistic, degrees of freedom, and $P$-value of the hypothesis test.
  • Assess the statistical significance by comparing the $P$-value to the $\alpha$-level.
  • Check the requirements for the hypothesis test.
  • Show the appropriate connections between the numerical and graphical summaries that support the hypothesis test.
  • Draw a correct conclusion for the hypothesis test.

## 2 The Effects of Gratitude

President Gordon B. Hinckley said, "My plea is that we stop seeking out the storms and enjoy more fully the sunlight. I am suggesting that as we go through life, we 'accentuate the positive.' I am asking that we look a little deeper for the good, that we still our voices of insult and sarcasm, that we more generously compliment and endorse virtue and effort" (Standing for Something, 2000, p. 101).

Summarize the relevant background information

Robert Emmons and Michael McCullough investigated the effects of gratitude on people's perception of life as a whole. In a study of $n=192$ undergraduates, the people were randomly assigned to one of three groups.

• Group 1 (Gratitude): The participants in this group were asked to record five things each week for which they were grateful or thankful.
• Group 2 (Hassles): The volunteers in this group recorded five irritants that had occurred to them in the previous week.
• Group 3 (Events): The people in the events group recorded five things that occurred in the past week that had an impact on them.

In addition to the weekly record of five things, the participants recorded their level of satisfaction with life in general. (Higher values are more favorable.) Reports were collected for nine weeks, and the overall level of satisfaction with life as a whole was recorded for each individual. The researchers wanted to determine if there was a difference in the perception of life as a whole among the subjects assigned to the three groups. Stated differently, they wanted to determine if expressing gratitude affects a person's view of life in general.

Here is an excerpt of data representing the results of this study:

Higher values indicate a greater level of satisfaction with life as a whole.

How might we analyze these data? One possible method would be to conduct separate t-tests for all the possible pairs of groups in the study. If we did this, we would need to conduct a separate t-test to compare groups 1 & 2, 1 & 3 and 2 & 3. If the probability of committing a Type I error is $\alpha = .05$ on each of these tests, then the probability that we would commit a Type I error on at least one of the tests is much greater than 0.05. We need a hypothesis test that we can use to compare all the groups at once. The procedure that allows us to do this is called Analysis of Variance (ANOVA).
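To put a rough number on "much greater than 0.05", here is a minimal Python sketch. It counts the pairwise tests and estimates the chance of at least one Type I error under the simplifying assumption that the tests are independent. That assumption is ours, not the lesson's: pairwise t-tests on the same groups share data, so treat the result as an approximation. The function names are illustrative only.

```python
from math import comb


def pairwise_test_count(groups: int) -> int:
    """Number of distinct pairs of groups, i.e. separate two-sample t-tests."""
    return comb(groups, 2)


def familywise_error(alpha: float, tests: int) -> float:
    """Chance of at least one Type I error across `tests` tests.

    ASSUMPTION: the tests are independent, which is only an approximation
    when every test reuses the same groups.
    """
    return 1 - (1 - alpha) ** tests


tests = pairwise_test_count(3)  # groups 1 & 2, 1 & 3, 2 & 3
print(tests)  # 3
print(round(familywise_error(0.05, tests), 4))  # 0.1426
```

Under this approximation, three tests at $\alpha = 0.05$ carry roughly a 14% chance of at least one false rejection, nearly three times the nominal level.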

## 3 Analysis of Variance (ANOVA)

ANOVA is a test for equality of several means. It allows us to compare the means for several groups in one hypothesis test. It might sound intimidating, but ANOVA is simply a way to analyze several means at once. It compares the spread of the group means to the spread of the data within each of the groups.

In an ANOVA test, the null hypothesis is typically expressed in words: $$H_0: \text{All the means are equal.}$$ The alternative hypothesis is given as: $$H_a: \text{One or more of the means differs from the others.}$$

If the means differ from each other in comparison to the variability in each group, then we conclude that the means are not all equal. If the means do not differ by much (when compared to the spread of the data in each group) then we do not reject the hypothesis that all the means are equal.

We will use the level of significance, $\alpha$, and the $P$-value just as we have in the other hypothesis tests.

### 3.1 $F$-distribution

The test statistic in ANOVA follows an $F$-distribution. This is the first time we have encountered this distribution. In previous tests, we have used the test statistics $z$ and $t$. For the ANOVA test, we use the $F$-statistic.
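For reference, a standard textbook form of the one-way ANOVA statistic is sketched below. The lesson itself relies on software and does not state this formula, so the notation is an assumption introduced here: $k$ is the number of groups, $n_i$, $\bar{x}_i$, and $s_i^2$ are the size, mean, and variance of group $i$, $N$ is the total sample size, and $\bar{x}$ is the mean of all observations.

$$F = \frac{\text{spread of the group means}}{\text{spread within the groups}} = \frac{\displaystyle\sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2 \,\big/\, (k-1)}{\displaystyle\sum_{i=1}^{k} (n_i - 1)\,s_i^2 \,\big/\, (N-k)}$$

The two divisors, $k-1$ and $N-k$, are the between-groups and within-groups degrees of freedom.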

Here is a brief summary of the characteristics of the $F$-distribution:

• It is right skewed.
• The values of $F$ are never negative.
• The $P$-value for the ANOVA test is the area in the right tail. We will never divide the area in the tail.

### 3.2 Requirements of ANOVA

There are two requirements of ANOVA that must be checked:

• The data are normally distributed in each group.
We check this by creating a Q-Q plot for each group (separately). If the points in the Q-Q plot do not show distinct curvature, it is reasonable to conclude that the data are normally distributed.
• The population variances are equal.
This requirement is checked by examining the sample variances. The rule we will use is: if the largest variance is less than or equal to four times the smallest variance, then we will conclude that the variances are equal.

If done by hand, the calculations for one simple ANOVA problem can easily require an hour of hard work. We will use software to do these calculations quickly and accurately.

The variance is the square of the standard deviation. The sample variance is denoted by the symbol $s^2$.

### 3.3 How to Conduct an ANOVA Test

Excel Instructions
To conduct a test for several means (ANOVA) in Excel, open the file QuantitativeInferentialProcedures.xls and do the following:
• Click on the tab labeled "ANOVA"
• Paste the data from each group into its own column
  • Paste the data from the first group in the appropriate part of Column A
  • Paste the data for the second group in the designated part of Column B
  • Continue copying and pasting each group into the appropriate columns
• The test statistic, $F$, is given in cell L15
• The degrees of freedom are presented in cells J15 and J16
• The $P$-value is reported in cell M15
• The "Descriptive Statistics" section of the output gives the sample size, mean and standard deviation for each of the groups in your sample.
• In the ANOVA table, you will find the test statistic ($F$), the $P$-value (Sig.), and the degrees of freedom (df) for the $F$ statistic. Note that there are two numbers specifying the degrees of freedom. These are given as the between groups and within groups df, respectively. Do not worry about the total df. This number is the sum of the other two.
To check your requirements, do the following:
• To determine if the data are normally distributed, we will make a Q-Q plot for each group separately.
• Using QuantitativeDescriptiveStatistics.xls, under the Q-Q plot tab you can paste in each of the three groups of data separately, to test for normality.
• To determine if the population variances are equal, we will use a very simple check:

• If the largest variance is more than four times as big as the smallest variance, we will conclude that the population variances are not equal. (Remember, the variance is the square of the standard deviation.)
• If the largest variance is no more than four times as big as the smallest variance, we will conclude that the population variances are close enough that we can assume they are all equal.

## 4 Worked Example: Gratitude

We will conduct a hypothesis test to determine if the mean responses of the individuals in the three groups differ.

State the null and alternative hypotheses and the level of significance

\begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align}

We will use the $\alpha = 0.05$ level of significance.

Describe the data collection procedures

The students were randomly assigned to one of the three treatments. They wrote in a weekly journal, according to their group assignment. At the end of the semester, they completed a questionnaire that asked about their attitude toward life. The responses on the survey were coded into a number, where higher numbers represent a more positive outlook.

Give the relevant summary statistics

Excel Instructions
Follow these instructions to apply the ANOVA procedure using Excel:
Data representative of the values reported by Emmons and McCullough are given in the file Gratitude-Stacked.xls. The data are divided into three columns, which represent Grateful, Hassles, and Events.
The summary statistics can be obtained using QuantitativeInferentialProcedures.xls. Paste the three columns of data into the appropriate areas of columns A, B, and C. The summary statistics will appear in the table to the right labeled "Descriptive Statistics".
This yields the following output:

The summary statistics are presented in the following table:

| Group | N | Mean | Std. Deviation |
|---|---|---|---|
| Grateful | 64 | 5.050 | 0.9443 |
| Hassles | 63 | 4.675 | 0.8320 |
| Events | 65 | 4.660 | 0.8483 |

Please do not blindly cut-and-paste computer output. It can include a lot of information that we do not use. Identify the relevant parts and report only those pieces of information.

Since the largest variance ($s^2$) is not four times larger than the smallest variance, we can assume the variances are equal. The points in the Q-Q plots for each of the three groups (not shown) follow the line closely, suggesting that the data can be assumed to be normally distributed. We conclude that the requirements are satisfied and it is appropriate to use ANOVA.

Make an appropriate graph to illustrate the data

Verify the requirements have been met

Q-Q plots indicate that it is reasonable to conclude that the data are normally distributed in each group. The largest variance (0.8917 from the Grateful group) is not four times the smallest (0.6922 from the Hassles group), so we conclude that the variances are equal for the three groups.
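The variance check above can be sketched in a few lines of Python. This is an illustration using the standard deviations from the summary table. The function names and the `limit` parameter are hypothetical helpers, not part of the course spreadsheets.

```python
def variance_ratio(std_devs):
    """Ratio of the largest sample variance to the smallest."""
    variances = [s ** 2 for s in std_devs]  # variance = standard deviation squared
    return max(variances) / min(variances)


def variances_close_enough(std_devs, limit=4.0):
    """Rule of thumb from this lesson: largest variance is at most 4x the smallest."""
    return variance_ratio(std_devs) <= limit


# Standard deviations from the Gratitude summary table
gratitude_sds = {"Grateful": 0.9443, "Hassles": 0.8320, "Events": 0.8483}

print(round(variance_ratio(gratitude_sds.values()), 2))  # 1.29
print(variances_close_enough(gratitude_sds.values()))    # True
```

A ratio of about 1.29 is far below 4, which agrees with the conclusion that the equal-variance requirement is satisfied.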

Give the test statistic and its value

This can be found in the output. Our test statistic, $F$, is: $$F = 4.075$$

State the degrees of freedom

There are 2 and 189 degrees of freedom.

The order in which these are stated is important. For an F-test, it is not the same to have 2 and 189 degrees of freedom as it is to have 189 and 2 degrees of freedom.

Mark the test statistic and $P$-value on a graph of the sampling distribution

Find the $P$-value and compare it to the level of significance

$$P\textrm{-value}=0.019 < 0.05 = \alpha$$

Since $P$-value$=0.019 < 0.05 = \alpha$, we reject the null hypothesis.
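To see where a $P$-value like this comes from, the sketch below approximates the area in the right tail of the $F$-distribution with plain Python. It is not how the course spreadsheet computes the value. It numerically integrates the standard $F$ density with a midpoint rule and subtracts the area to the left of the test statistic from 1. The function names and the `steps` parameter are illustrative, and the approximation assumes the first degrees of freedom is at least 2.

```python
from math import exp, lgamma, log


def f_density(x, d1, d2):
    """Density of the F-distribution with d1 and d2 degrees of freedom (x > 0)."""
    log_pdf = (
        lgamma((d1 + d2) / 2) - lgamma(d1 / 2) - lgamma(d2 / 2)
        + (d1 / 2) * log(d1 / d2)
        + (d1 / 2 - 1) * log(x)
        - ((d1 + d2) / 2) * log(1 + d1 * x / d2)
    )
    return exp(log_pdf)


def f_right_tail(f_stat, d1, d2, steps=20000):
    """Approximate P(F > f_stat): 1 minus the midpoint-rule area from 0 to f_stat."""
    width = f_stat / steps
    area_left = width * sum(
        f_density((i + 0.5) * width, d1, d2) for i in range(steps)
    )
    return 1 - area_left


# Gratitude example: F = 4.075 with 2 and 189 degrees of freedom
p_value = f_right_tail(4.075, 2, 189)
print(round(p_value, 3))  # 0.019
print(p_value < 0.05)     # True, so we reject the null hypothesis
```

Only the right tail is used, and the degrees of freedom must be passed in the stated order (between groups first, within groups second).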

Present your conclusion in an English sentence, relating the result to the context of the problem

There is sufficient evidence to suggest that at least one of the three groups has a mean level of satisfaction with life that differs from the others. In short, the mean level of satisfaction with life in general is not the same for all three groups.

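If we take a closer look, we see that the Hassles and Events groups had means that were fairly close together. However, the Grateful group appears to have a noticeably higher mean level of satisfaction than the other two groups. Keep in mind that the ANOVA test by itself only tells us that the means are not all equal; it does not identify which groups differ.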

## 5 Worked Example: Soccer Shoes

Summarize the relevant background information

Nike, a company that makes sporting goods including shoes, funded a study to compare five soccer shoe designs. The objective of the research was to determine if there is a difference in the mean accuracy soccer players achieve using different Nike shoe designs.

State the null and alternative hypotheses and the level of significance

\begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align}

We will use the $\alpha = 0.10$ level of significance.

Describe the data collection procedures

As part of the research, they asked trained soccer players to kick a ball at a target. The target was placed 115 cm above the ground and at a distance of 10 m from the players. Using electronic equipment, the researchers recorded the distance from the center of the target to the point where the ball hit. The objective of the research was to assess if footwear could affect the accuracy of a soccer player.

The subjects wore five different soccer shoes and for one treatment they kicked the ball in stocking feet. Due to the proprietary nature of the data, the shoes are only labeled "A," "B," "C," "D," and "E" in the article. Data representing the results of this study are given in the file SoccerShoes.

Use the SoccerShoes data to answer the following questions.

1. Give the relevant summary statistics

| Group | N | Mean | Std. Deviation |
|---|---|---|---|
| A | 20 | 33.10 | 5.230 |
| B | 20 | 32.70 | 5.430 |
| C | 20 | 29.45 | 5.216 |
| D | 20 | 32.00 | 7.291 |
| E | 20 | 32.20 | 5.569 |
| Socks | 20 | 35.25 | 6.382 |

2. Make an appropriate graph to illustrate the data
Here is a side-by-side boxplot to illustrate the data for the six groups all at once. (You may also want to make side-by-side histograms.)

3. Based on what you have seen so far, does it appear that there is a significant difference between the mean accuracy of the kicks using the six types of footwear?
Just by looking at the boxplots, we can see that the groups differ in the sample, but we do not know if there is a statistically significant difference between the mean accuracy of the kicks using the six different types of footwear until we perform a hypothesis test for several means.

4. If we conducted independent samples $t$-tests comparing the six groups, how many tests would be conducted if we compared each of the groups against all the others?
We would need to conduct 15 tests! We need to use ANOVA. We should not do the comparisons using several different t-tests.

5. Verify the requirements have been met
Q-Q plots of the data for each of the six groups (separately) do not indicate a departure from normality.
The largest standard deviation (7.291) is not double the smallest (5.216).
The largest variance (53.159) is not four times the smallest (27.207).

6. Give the test statistic and its value
$F$ statistic = 2.020

7. State the degrees of freedom
There are 5 and 114 degrees of freedom.

8. Mark the test statistic and $P$-value on a graph of the sampling distribution

9. Find the $P$-value and compare it to the level of significance
$P\text{-value} =0.08094 < 0.1 = \alpha$

Since $P$-value$=0.08094 < 0.1 = \alpha$ we reject the null hypothesis.

10. Present your conclusion in an English sentence, relating the result to the context of the problem
There is sufficient evidence to suggest that there is a difference in the mean accuracy of the kicks for the various types of footwear.
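As a cross-check on the software output, the $F$ statistic and its degrees of freedom can be recomputed from the summary statistics in step 1 alone. The sketch below uses the standard one-way ANOVA formulas (between-groups mean square divided by within-groups mean square). The function name is a hypothetical helper. Because the inputs are rounded summary values rather than the raw data, small differences from the spreadsheet output are possible.

```python
def anova_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (n, mean, std_dev) tuples, one per group.
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Spread of the group means around the overall mean
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Spread of the data within each group (variance = std_dev squared)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = k - 1
    df_within = total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean, standard deviation) for shoes A-E and socks, from the table in step 1
soccer = [
    (20, 33.10, 5.230),
    (20, 32.70, 5.430),
    (20, 29.45, 5.216),
    (20, 32.00, 7.291),
    (20, 32.20, 5.569),
    (20, 35.25, 6.382),
]

f_stat, df1, df2 = anova_from_summary(soccer)
print(round(f_stat, 2), df1, df2)  # 2.02 5 114
```

The result agrees with the reported $F = 2.020$ with 5 and 114 degrees of freedom.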

## 6 Summary

Remember...
• ANOVA is used to compare the means for several groups. The hypotheses for the test are always:

\begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align}

• For ANOVA testing we use an $F$-distribution, which is right-skewed. The $P$-value of an ANOVA test is always the area to the right of the $F$-statistic.
• We can conduct ANOVA testing when the following three requirements are satisfied:
1. The data come from a simple random sample.
2. The data are normally distributed within each group.
• This is satisfied when Q-Q Plots for the data in each group roughly follow a straight line.
3. The variance is constant.
• This is satisfied when the largest variance is not more than four times the smallest variance.