# Help:Lesson 20 Print

The following questions are intended to help you judge your preparation for this exam. Carefully work through the problems.

**These questions are repeated on the preparation quiz for this lesson.**

This is not designed to be a comprehensive review. There may be items on the exam that are not covered in this review. Similarly, there may be items in this review that are not tested on this exam. You are strongly encouraged to review the readings, homework exercises, and other activities from Units 1-3 as you prepare for the exam. In particular, you should go over the Review for Exam 1 and the Review for Exam 2. Use the Index to review definitions of important terms.

## 1 Lesson Summaries


**Here are the summaries for each lesson in Unit 3. Reviewing these key points from each lesson will help you in your preparation for the exam.**

**Pie charts** are used when you want to represent the observations as part of a whole, where each slice (sector) of the pie chart represents a proportion or percentage of the whole.

**Bar charts** present the same information as pie charts and are used when our data represent counts. A **Pareto chart** is a bar chart where the height of the bars is presented in descending order.

- $\hat p$ is a point estimator for true proportion $p$. $\displaystyle{\hat p = \frac{x}{n}}$

- The sampling distribution of $\hat p$ has a mean of $p$ and a standard deviation of $\displaystyle{\sqrt{\frac{p\cdot(1-p)}{n}}}$

- If $np \ge 10$ and $n(1-p) \ge 10$, you can conduct **probability calculations** using the Normal Probability Applet. $\displaystyle {z = \frac{\textrm{value} - \textrm{mean}}{\textrm{standard deviation}} = \frac{\hat p - p}{\sqrt{\frac{p \cdot (1-p)}{n}}}}$

- The **estimator** of $p$ is $\displaystyle{ \hat p = \frac {x}{n}}$. It is used for both confidence intervals and hypothesis testing.

- You will use the Excel spreadsheet

- The requirements for a confidence interval are $n \hat p \ge 10$ and $n(1-\hat p) \ge 10$. The requirements for hypothesis tests involving one proportion are $np\ge10$ and $n(1-p)\ge10$.

- We can determine the sample size we need to obtain a desired margin of error $m$ using the formula $\displaystyle{ n=\left(\frac{z^*}{m}\right)^2 p^*(1-p^*)}$ where $p^*$ is a **prior estimate** of $p$. If no prior estimate is available, the formula $\displaystyle{ n=\left(\frac{z^*}{2m}\right)^2}$ is used, which is the first formula with $p^*=0.5$.

- When conducting hypothesis tests using two proportions, the null hypothesis is always $p_1=p_2$, indicating that there is no difference between the two proportions. The alternative hypothesis can be left-tailed ($<$), right-tailed ($>$), or two-tailed ($\ne$).

- For a hypothesis test and confidence interval of two proportions, we use the following symbols:

$$ \begin{array}{lcl} \text{Sample proportion for group 1:} & \hat p_1 = \displaystyle{\frac{x_1}{n_1}} \\ \text{Sample proportion for group 2:} & \hat p_2 = \displaystyle{\frac{x_2}{n_2}} \end{array} $$

- For a hypothesis test only, we use the following symbols:

$$ \begin{array}{lcl} \text{Overall sample proportion:} & \hat p = \displaystyle{\frac{x_1+x_2}{n_1+n_2}} \end{array} $$

- Whenever zero is contained in the confidence interval of the difference of the true proportions we conclude that there is no significant difference between the two proportions.

- You will use the Excel spreadsheet

- The **$\chi^2$ hypothesis test** is a test of independence between two variables. These variables are either associated or they are not. Therefore, the null and alternative hypotheses are the same for every test:

$$ \begin{array}{ll} H_0: & \text{The (first variable) and the (second variable) are independent.} \\ H_a: & \text{The (first variable) and the (second variable) are not independent.} \end{array} $$

- The **degrees of freedom ($df$)** for a $\chi^2$ test of independence are calculated using the formula $df=(\text{number of rows}-1)(\text{number of columns}-1)$.

- In our hypothesis testing for $\chi^2$ we never conclude that two variables are *dependent*. Instead, we say that two variables are *not independent*.
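The sample-size formulas in the summaries above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_size`, the default $z^* = 1.96$ (95% confidence), and the inputs (a 3% margin of error and a prior estimate of 0.40) are assumptions chosen for the example, not values from the course materials.

```python
import math

def sample_size(margin, z_star=1.96, prior=None):
    """Smallest whole n that achieves the requested margin of error
    for a confidence interval for one proportion."""
    if prior is None:
        # No prior estimate: n = (z* / (2m))^2, i.e. the worst case p* = 0.5
        n = (z_star / (2 * margin)) ** 2
    else:
        # Prior estimate p*: n = (z* / m)^2 * p* * (1 - p*)
        n = (z_star / margin) ** 2 * prior * (1 - prior)
    # Always round up: rounding down would give a margin slightly too large
    return math.ceil(n)

# Hypothetical inputs (not taken from any review question)
n_with_prior = sample_size(0.03, prior=0.40)  # 1025
n_no_prior = sample_size(0.03)                # 1068
print(n_with_prior, n_no_prior)
```

Note that the version without a prior estimate always asks for at least as many observations, because $p^*(1-p^*)$ is largest when $p^* = 0.5$.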

## 2 Review Questions

**Questions 1 through 5: Decide which hypothesis test to use.** Here is a list of hypothesis tests we have studied so far this semester. For each question identify the one hypothesis test that is most appropriate to the given situation. You may use a hypothesis test once, more than once, or not at all.

- a. One sample z-test
- b. One sample t-test
- c. Paired-samples t-test
- d. Independent sample t-test
- e. ANOVA
- f. Test of one proportion
- g. Test of two proportions
- h. Chi-Squared test of independence

1. In an article in the Journal of Small Business Management, successful start-up businesses in the United States and Korea were compared. One set of data compared the educational level (high school, undergraduate degree, master’s degree, doctoral degree) of people who managed successful start-up companies in the United States and Korea. You want to determine if education level differs between managers of successful start-up companies in these two countries. Which hypothesis test would be most appropriate for this analysis?

2. A human resources manager reported data from a recent involuntary Reduction in Force (RIF) at her company. You are an attorney and want to determine if age discrimination was a factor (it is illegal to discriminate against employees because of age). The company reported the number of employees in two groups: 40 years old or younger, and over 40 years old. They also reported the number of employees in each group who were terminated. You want to determine if both age groups were treated equally. Which hypothesis test would be most appropriate for this analysis?

3. A survey was conducted by a group of state lotteries. A random sample of 2406 adults completed the survey. A total of 248 were classified as “heavy” players. Of these, 152 were male. You want to determine if the proportion of male “heavy” lottery players is different than the proportion of males in the population, which is 48.5%. Which hypothesis test would be most appropriate for this analysis?

4. A student project compared the effectiveness of two different combination locks. One of the locks turned clockwise first and the other lock turned counterclockwise first. They asked 25 students to participate in the study. Each student was given the combination to each lock and asked to open the locks. The time it took them to open each lock was recorded. They want to determine if one of the locks is easier to open. Which hypothesis test would be most appropriate for this analysis?

5. A mother's weight gain during pregnancy is an important indicator of infant health. A simple random sample of pregnant women in Egypt, Kenya, and Mexico was used to determine if weight gain during pregnancy differed in these three countries. Which hypothesis test would be most appropriate for this analysis?

**Questions 6 through 9: Decide which confidence interval to use.** Here is a list of confidence intervals we have studied so far this semester. For each question identify the one confidence interval that is most appropriate for the given situation. You may use a confidence interval once, more than once, or not at all.

- a. One sample z-confidence interval
- b. One sample t-confidence interval
- c. Paired-samples t-confidence interval
- d. Independent sample t-confidence interval
- e. "+4" confidence interval for one proportion
- f. "+4" confidence interval for two proportions

6. A bank employs two appraisers. When approving borrowers for mortgages, it is imperative that the appraisers value the same types of properties consistently. To make sure this is the case, the bank evaluates six properties that both appraisers have recently valued. Which confidence interval would be most appropriate for this study?

7. In a Wall Street Journal article on satisfaction with career paths, the percentage of psychology majors reporting they were “satisfied” or “very satisfied” with their career path was reported. The same data was also reported for accounting majors. You decide to construct a 95% confidence interval to see if the observed difference is significant. Which confidence interval would be most appropriate for this study?

8. O’Hare International Airport in Chicago has a reputation for having a large proportion of its flights being late. You design a study to see if this reputation is deserved. You find that the average on-time rate for all international airports in the US is 70%. You collect data and determine the on-time rate for O’Hare. You decide to construct a confidence interval to compare O’Hare’s on-time rate to the national average. Which confidence interval would be most appropriate for this study?

9. DoubleStuf Oreo cookies are supposed to have twice the filling of regular Oreo cookies. You and some friends decide you want to know if that is a true assertion by the company who makes them. You take a sample of 55 DoubleStuf Oreo cookies and measure the amount of filling in each one. You need to construct a confidence interval to estimate the true mean filling amount of DoubleStuf Oreos in order to compare it to the filling amount found in regular Oreos. Which confidence interval would be most appropriate for this study?

10. Which one of the following best defines the notion of the significance level of a hypothesis test?

- a. The probability of rejecting $H_o$, whether it's true or not
- b. The probability of observing a sample statistic more extreme than the one actually obtained, assuming the null hypothesis is true
- c. The probability of the type I error
- d. The probability of the type II error

11. Which one of the following best defines the notion of the $P$-value of a hypothesis test?

- a. The probability of rejecting $H_o$, whether it's true or not
- b. The probability of observing a sample statistic more extreme than the one actually obtained, assuming the null hypothesis is true
- c. The probability of the type I error
- d. The probability of the type II error

12. Suppose you create a 95% confidence interval for a mean, and get (10, 20). You've been told to report this by saying something similar to, “We are 95% confident that the true mean is between 10 and 20.” Exactly what does this mean?

- a. 95% of the data are between 10 and 20.
- b. 95% of the sample means are between 10 and 20.
- c. There is a 95% chance that the true mean is between 10 and 20.
- d. 95% of all 95% confidence intervals actually contain the true mean.

**Questions 13 through 15: Use the following information.** You take a simple random sample of 100 adults from a town in the Western United States to determine the proportion of adults in the town who invest in the stock market. Assume the unknown population proportion or percentage of people in town who invest in the stock market is $p=0.30$ (or 30%).

13. What is the mean of the distribution of the sample proportions?

- a. 30
- b. 70
- c. 0.70
- d. 0.30

14. What is the standard deviation of the distribution of the sample proportions?

- a. 0.004
- b. 0.046
- c. 0.458
- d. 4.583

15. What is the probability that your random sample of 100 adults will have a sample proportion less than 0.25?

- a. 0.138
- b. 0.124
- c. 0.876
- d. 0.862
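As a rough sketch of how calculations like those in Questions 13 through 15 work, the snippet below applies the sampling-distribution formulas for $\hat p$ using only Python's standard library in place of the Normal Probability Applet. The values $p = 0.40$, $n = 150$, and the cutoff 0.35 are hypothetical and deliberately differ from the numbers in the questions; the helper `normal_cdf` is an illustrative name.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical population proportion and sample size
p, n = 0.40, 150

# Requirements for using the normal approximation
requirements_met = n * p >= 10 and n * (1 - p) >= 10

mean = p                          # mean of the sampling distribution of p-hat
sd = math.sqrt(p * (1 - p) / n)   # standard deviation of p-hat

# Probability that the sample proportion falls below 0.35
z = (0.35 - mean) / sd
prob_below = normal_cdf(z)

print(requirements_met, round(sd, 3), round(z, 2), round(prob_below, 3))
# True 0.04 -1.25 0.106
```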

**Questions 16 through 20: Use the following information.** Accupril is meant to control hypertension. In clinical trials of Accupril, 2142 subjects were divided into two groups. The 1563 subjects in the experimental group received Accupril. The 579 subjects in the control group received a placebo. Of the 1563 in the experimental group, 61 experienced dizziness as a side effect. Of the 579 subjects in the control group, 15 experienced dizziness as a side effect.

16. Let $p_1$ be the true proportion of people who experience dizziness while taking Accupril. Let $p_2$ be the true proportion of people who experience dizziness but do not take Accupril. Create a 95% confidence interval for $p_1 - p_2$.

- a. (0.006, 0.092)
- b. (-0.06, 0.92)
- c. (-0.004, 0.029)
- d. (-0.04, 0.29)
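The following is a minimal sketch of a "+4" confidence interval for a difference of two proportions. It assumes the usual plus-four adjustment (one extra success and one extra failure added to each group) and $z^* = 1.96$ for 95% confidence; the course spreadsheet may organize the calculation differently. The counts are hypothetical and are not the Accupril data.

```python
import math

def plus_four_ci_two_proportions(x1, n1, x2, n2, z_star=1.96):
    """Plus-four interval for p1 - p2 (assumed form of the adjustment:
    add 1 success and 1 failure to each group)."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    margin = z_star * se
    diff = p1 - p2
    return diff - margin, diff + margin

# Hypothetical counts: 45 of 300 in group 1, 30 of 300 in group 2
lower, upper = plus_four_ci_two_proportions(45, 300, 30, 300)
contains_zero = lower <= 0 <= upper

print(round(lower, 3), round(upper, 3), contains_zero)
# -0.003 0.103 True
```

Because zero lies inside this hypothetical interval, the summary rule applies: there is no significant difference between the two proportions at this confidence level.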

Perform a hypothesis test to see if the proportion of experimental group subjects who experience dizziness is different than the proportion of control group subjects who do. Let $p_1$ be the true proportion of people who experience dizziness while taking Accupril. Let $p_2$ be the true proportion of people who experience dizziness but do not take Accupril. Use a level of significance of $\alpha = 0.05$.

17. Which of the following pairs of hypotheses is the most appropriate for addressing this question?

- a. $H_o:~p_1=p_2$ $H_a:~p_1<p_2$
- b. $H_o:~p_1=p_2$ $H_a:~p_1\ne p_2$
- c. $H_o:~p_1=p_2$ $H_a:~p_1>p_2$
- d. $H_o:~p_1<p_2$ $H_a:~p_1=p_2$
- e. $H_o:~p_1 \ne p_2$ $H_a:~p_1=p_2$
- f. $H_o:~p_1>p_2$ $H_a:~p_1=p_2$

18. The value of your test statistic is:

- a. -1.361
- b. 0.897
- c. 1.923
- d. 1.458

19. The $P$-value of your test is:

- a. 0.045
- b. 0.014
- c. 0.072
- d. 0.145

20. Is there sufficient evidence to conclude that the true proportion of people who experience dizziness while taking Accupril is different than the true proportion of people who experience dizziness while not taking Accupril?

- a. Yes. I rejected $H_o$.
- b. Yes. I failed to reject $H_o$.
- c. Yes. I accepted $H_a$.
- d. No. I rejected $H_o$.
- e. No. I failed to reject $H_o$.
- f. No. I failed to accept $H_a$.
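A two-sided test of two proportions, like the one in Questions 17 through 20, can be sketched as follows. The test statistic uses the overall (pooled) sample proportion from the lesson summary in the standard error; that is the standard textbook form and is assumed here rather than copied from the course spreadsheet. The counts are hypothetical and differ from the Accupril data.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # overall sample proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))   # area in both tails
    return z, p_value

# Hypothetical counts: 45 of 300 in group 1, 30 of 300 in group 2
z, p_value = two_proportion_z_test(45, 300, 30, 300)
reject_h0 = p_value < 0.05

print(round(z, 3), round(p_value, 3), reject_h0)
```

For a one-sided alternative, only one tail would be used: `1 - normal_cdf(z)` for $p_1 > p_2$ or `normal_cdf(z)` for $p_1 < p_2$.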

**Questions 21 through 24: Use the following information and table.**

A survey was conducted of 1279 randomly selected adults aged 18 and older. They were asked “Are you a morning person or a night person?”

The hypotheses for this study are:

$$ \begin{array}{rl} H_o: & \text{Being a morning or evening person is independent of age} \\ H_a: & \text{Being a morning or evening person is not independent of age} \\ \end{array} $$

The results of the survey are given here:

Preference | 18-29 | 30-49 | 50-64 | 65+ |
---|---|---|---|---|
Morning Person | 97 | 177 | 210 | 210 |
Evening Person | 131 | 167 | 200 | 190 |

Conduct a test of independence. Use a level of significance of $\alpha=0.05$.

21. Calculate the test statistic for this hypothesis test. Assume the requirements for the test are satisfied.

- a. 6.580
- b. 0.658
- c. 9.760
- d. 0.097

22. Calculate the $P$-value for this hypothesis test. Assume the requirements for the test are satisfied.

- a. 8.660
- b. 0.009
- c. 0.866
- d. 0.087

23. Should you reject $H_o$ or not? Explain.

- a. Yes. The $P$-value is less than 0.05.
- b. Yes. The $P$-value is greater than 0.05.
- c. Yes. Looking at the data we can see that the age is a factor in determining if you are a morning or a night person.
- d. No. The $P$-value is less than 0.05.
- e. No. The $P$-value is greater than 0.05.
- f. No. Young people are more likely to be a night person.

24. Do you have sufficient evidence to conclude that age makes a difference in whether a person is a morning or night person? Why or why not?

- a. Yes. The table makes this clear.
- b. Yes. I rejected $H_o$.
- c. Yes. I failed to reject $H_o$.
- d. No. The difference in the data in the table is entirely due to chance.
- e. No. I rejected $H_o$.
- f. No. I failed to reject $H_o$.
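To illustrate the mechanics behind a $\chi^2$ test of independence, here is a small standard-library sketch. It computes expected counts as (row total)(column total)/(grand total), sums $(\text{observed}-\text{expected})^2/\text{expected}$, and uses $df=(\text{rows}-1)(\text{columns}-1)$. The upper-tail area is obtained from a series for the incomplete gamma function, which is one possible approach and not necessarily what the course software does. The 2-by-3 table is made up and is not the survey data above.

```python
import math

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom
    for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), using the series
    expansion of the regularized lower incomplete gamma function."""
    a, t = df / 2, x / 2
    if t <= 0:
        return 1.0
    term = 1.0 / math.gamma(a + 1)
    total = term
    for k in range(1, 1000):
        term *= t / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    lower_area = math.exp(-t + a * math.log(t)) * total
    return 1.0 - lower_area

# Hypothetical observed counts (2 rows, 3 columns)
observed = [[30, 40, 50],
            [50, 40, 30]]
stat, df = chi_square_independence(observed)
p_value = chi2_upper_tail(stat, df)

print(round(stat, 3), df, round(p_value, 4))
# 10.0 2 0.0067
```

In this made-up example the $P$-value is below 0.05, so $H_0$ would be rejected and the two variables would be described as not independent.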

**Questions 25 through 31: Use the following information to answer each question.** A recent book noted that only 20% of all investment managers outperform the Dow Jones Industrial Average over a five-year period. A random sample of 200 investment managers who had graduated from one of the top ten business programs in the country was followed over a five-year period. Fifty of these outperformed the Dow Jones Industrial Average. Let $p$ be the true proportion of investment managers who graduated from one of the top ten business programs who outperformed the Dow Jones over a five-year period.

25. Based on the results of the sample, a 95% confidence interval for $p$ is:

- a. (1.95, 3.15)
- b. (0.0195, 0.0315)
- c. (0.195, 0.315)
- d. (0.028, 0.031)
- e. We can assert that $p$ = 0.20 with 100% confidence, because only 20% of investment managers outperform the standard indexes.
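A "+4" confidence interval for one proportion can be sketched as below. The code assumes the usual plus-four adjustment (two extra successes and two extra failures, so four extra observations) and $z^* = 1.96$ for 95% confidence. The counts are hypothetical and are not the investment-manager data.

```python
import math

def plus_four_ci_one_proportion(x, n, z_star=1.96):
    """Plus-four interval for p (assumed form of the adjustment:
    add 2 successes and 2 failures before computing the interval)."""
    p_tilde = (x + 2) / (n + 4)
    se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    margin = z_star * se
    return p_tilde - margin, p_tilde + margin

# Hypothetical sample: 40 successes in 96 observations
lower, upper = plus_four_ci_one_proportion(40, 96)

print(round(lower, 3), round(upper, 3))
# 0.323 0.517
```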

26. Suppose you had been in charge of designing the study. What sample size would be needed to construct a margin of error of 2% with 95% confidence? Use the prior estimate of $p^* = 0.2$ for this estimate.

- a. $n=2401$
- b. $n=1537$
- c. $n=16$
- d. $n=1801$
- e. $n>30$

Suppose you wish to see if there is evidence that graduates of the top ten business programs perform better than other investment managers. Conduct a hypothesis test. Use a level of significance of $\alpha=0.05$.

27. Which of the following pairs of hypotheses is the most appropriate for addressing this question?

- a. $H_o:~p=0.2$ $H_a:~p<0.2$
- b. $H_o:~p=0.2$ $H_a:~p\ne0.2$
- c. $H_o:~p=0.2$ $H_a:~p>0.2$
- d. $H_o:~p<0.2$ $H_a:~p=0.2$
- e. $H_o:~p\ne0.2$ $H_a:~p=0.2$
- f. $H_o:~p>0.2$ $H_a:~p=0.2$

28. How many measurements must you have in order to assure that $\hat p$ is normally distributed?

- a. $n\ge30$
- b. $n\ge5$
- c. $np\ge10$ and $n(1-p)\ge10$
- d. $np\ge5$ and $n(1-p)\ge5$

29. The value of your test statistic is:

- a. 1.768
- b. 0.039
- c. 1.923
- d. 0.077

30. The $P$-value of your test is:

- a. 1.768
- b. 0.039
- c. 1.923
- d. 0.077

31. Is there sufficient evidence to conclude that graduates from the top ten business programs perform better than other investment managers?

- a. Yes. I rejected $H_o$.
- b. Yes. I failed to reject $H_o$.
- c. Yes. I accepted $H_a$.
- d. No. I rejected $H_o$.
- e. No. I failed to reject $H_o$.
- f. No. I failed to accept $H_a$.
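The steps of a right-tailed test of one proportion, as in Questions 27 through 31, can be sketched like this. The standard error uses the hypothesized proportion $p$ from $H_o$, matching the requirement check $np\ge10$ and $n(1-p)\ge10$. The function name and all numbers are hypothetical and differ from those in the questions.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_proportion_z_test_right(x, n, p0):
    """Right-tailed z-test of H0: p = p0 against Ha: p > p0."""
    requirements_met = n * p0 >= 10 and n * (1 - p0) >= 10
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)   # uses p0, not p-hat
    z = (p_hat - p0) / se
    p_value = 1 - normal_cdf(z)         # area to the right of z
    return z, p_value, requirements_met

# Hypothetical data: 66 successes in 120 observations, testing p0 = 0.5
z, p_value, requirements_met = one_proportion_z_test_right(66, 120, 0.5)
reject_h0 = p_value < 0.05

print(round(z, 3), round(p_value, 3), reject_h0)
```

With these made-up numbers the $P$-value is larger than 0.05, so the conclusion would be to fail to reject $H_o$.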