Lesson 15: Review for Exam 2

From BYU-I Statistics Text

You are strongly encouraged to review the readings, homework exercises, and other activities from Units 1 and 2 as you prepare for the exam. In particular, you should go over the Review for Exam 1. Use the Index to review definitions of important terms.

1 Lesson Outcomes

The exam will assess the following outcomes. You should be able to:

  • All of the Outcomes from Lesson 08 (Unit 1)
  • Confidence Intervals:
    • Calculate and interpret a confidence interval for a given parameter at a given confidence level.
    • Explain how the margin of error changes with the sample size and the level of confidence.
    • Identify a point estimate and margin of error for a given confidence interval.
    • Show the appropriate connections between the numerical and graphical summaries that support a confidence interval.
    • Check the requirements for a confidence interval.
    • Determine the correct confidence interval procedure for a given scenario.
  • Calculate a confidence interval for
    • A single mean with σ known.
    • A single mean with σ unknown.
    • The mean of differences with dependent samples.
    • Difference of two means of independent samples.
  • Hypothesis Testing
    • State the null and alternative hypotheses for the chosen test.
    • Calculate the test statistic, degrees of freedom, and $P$-value of the hypothesis test.
    • Assess the statistical significance by comparing the p-value to the α-level.
    • Check the requirements for the hypothesis test.
    • Show the appropriate connections between the numerical and graphical summaries that support a hypothesis test.
    • Draw a correct conclusion for a hypothesis test.
    • Interpret a Type I and II error.
    • Determine the correct hypothesis test for a given scenario.
  • Conduct a Hypothesis Test for
    • A single mean with σ known.
    • A single mean with σ unknown.
    • The mean of differences with dependent samples.
    • The difference of two means with independent samples.
    • Several means (ANOVA).



2 Lesson Summaries


Here are the summaries for each lesson in Unit 2. Reviewing these key points from each lesson will help you prepare for the exam.

Lesson 09 Recap
  • The null hypothesis ($H_0$) is the foundational assumption about a population and represents the status quo. It is a statement of equality ($=$). The alternative hypothesis ($H_a$) is a different assumption about a population and is a statement of inequality ($<$, $>$, or $\ne$). Using a hypothesis test, we determine whether it is more likely that the null hypothesis or the alternative hypothesis is true.
  • The $P$-value is the probability of getting a test statistic at least as extreme as the one you got, assuming $H_0$ is true. A $P$-value is calculated by finding the area under the normal distribution curve that is more extreme (farther away from the mean) than the z-score. The alternative hypothesis tells us whether we look at both tails or only one.
  • The level of significance ($\alpha$) is the standard for determining whether or not the null hypothesis should be rejected. Typical values for $\alpha$ are $0.05$, $0.10$, and $0.01$. If the $P$-value is less than $\alpha$ we reject the null. If the $P$-value is not less than $\alpha$ we fail to reject the null.
  • A Type I error is committed when we reject a null hypothesis that is, in reality, true. A Type II error is committed when we fail to reject a null hypothesis that is, in reality, false. The value of $\alpha$ is the probability of committing a Type I error. (A worked sketch of the full decision process appears after this list.)
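
The following is a minimal sketch in Python (all numbers invented, and SciPy used here purely for illustration rather than the Excel tools covered in the text) showing how a test statistic, a two-tailed $P$-value, and the reject / fail-to-reject decision come together for a single mean with $\sigma$ known.

    # A minimal sketch (invented numbers): one-sample z-test with sigma known.
    from scipy import stats
    import math

    mu_0 = 100       # hypothesized population mean under H0
    sigma = 15       # population standard deviation, assumed known
    x_bar = 105.2    # observed sample mean
    n = 36           # sample size
    alpha = 0.05     # level of significance

    # Test statistic: how many standard errors the sample mean is from mu_0
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))

    # Two-tailed P-value: area in both tails beyond |z|
    p_value = 2 * stats.norm.sf(abs(z))

    print(f"z = {z:.3f}, P-value = {p_value:.4f}")
    print("Reject H0" if p_value < alpha else "Fail to reject H0")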


Lesson 10 Recap
  • The margin of error measures how far the sample mean is likely to be from the true population mean at a chosen confidence level. It is calculated as $\displaystyle{m=z^*\frac{\sigma}{\sqrt{n}}}$ where $z^*$ represents the critical z-score corresponding to that confidence level.
  • A confidence interval is an interval estimator used to give a range of plausible values for a parameter. The width of a confidence interval depends on the chosen confidence level (and its corresponding value of $z^*$) as well as the sample size ($n$). This is the equation for calculating confidence intervals:

$$\displaystyle{\left(\bar x-z^*\frac{\sigma}{\sqrt{n}},~\bar x+z^*\frac{\sigma}{\sqrt{n}}\right)}$$

  • The sample size formula allows us to estimate the number of observations required to obtain a specific margin of error (round the result up to the next whole number; see the worked sketch after this list). $\displaystyle{n=\left(\frac{z^*\sigma}{m}\right)^2}$
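
Here is a minimal Python sketch (invented numbers; SciPy is used only for the critical value) of the margin of error, the resulting confidence interval, and the sample size calculation from this lesson.

    # A minimal sketch (invented numbers): confidence interval for a single mean
    # with sigma known, and the sample size needed for a desired margin of error.
    from scipy import stats
    import math

    x_bar = 72.5          # sample mean
    sigma = 8.0           # population standard deviation, assumed known
    n = 50                # sample size
    conf_level = 0.95     # chosen confidence level

    # Critical value z* for the chosen confidence level (about 1.96 for 95%)
    z_star = stats.norm.ppf(1 - (1 - conf_level) / 2)

    # Margin of error m = z* * sigma / sqrt(n), then the interval (x_bar - m, x_bar + m)
    m = z_star * sigma / math.sqrt(n)
    print(f"margin of error = {m:.3f}")
    print(f"{conf_level:.0%} CI: ({x_bar - m:.3f}, {x_bar + m:.3f})")

    # Sample size needed for a desired margin of error, rounded up
    desired_m = 1.5
    n_needed = math.ceil((z_star * sigma / desired_m) ** 2)
    print(f"n needed for m = {desired_m}: {n_needed}")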


Lesson 11 Recap
  • In practice we rarely know the true standard deviation $\sigma$ and will therefore be unable to calculate a z-score. Student's t-distribution gives us a new test statistic, $t$, that is calculated using the sample standard deviation ($s$) instead.

$$ \displaystyle{ t = \frac {\bar x - \mu} {s / \sqrt{n}} } $$

  • The $t$-distribution is similar to a normal distribution in that it is bell-shaped and symmetrical, but the exact shape of the $t$-distribution depends on the degrees of freedom ($df$).

$$df=n-1$$

  • You will use Excel to carry out hypothesis testing and create confidence intervals involving $t$-distributions; the sketch after this list shows the same calculations in Python for comparison.
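
The following is a minimal Python sketch (invented data) of a one-sample $t$-test and a $t$-based confidence interval when $\sigma$ is unknown.

    # A minimal sketch (invented data): one-sample t-test and t-based confidence
    # interval for a single mean when sigma is unknown.
    import numpy as np
    from scipy import stats

    data = np.array([12.1, 11.8, 12.6, 12.0, 11.5, 12.4, 12.2, 11.9])
    mu_0 = 12.0                   # hypothesized mean under H0

    n = len(data)
    x_bar = data.mean()
    s = data.std(ddof=1)          # sample standard deviation
    df = n - 1                    # degrees of freedom

    # Test statistic t = (x_bar - mu_0) / (s / sqrt(n)) and two-tailed P-value
    t_stat = (x_bar - mu_0) / (s / np.sqrt(n))
    p_value = 2 * stats.t.sf(abs(t_stat), df)

    # 95% confidence interval built with the critical value t*
    t_star = stats.t.ppf(0.975, df)
    m = t_star * s / np.sqrt(n)

    print(f"t = {t_stat:.3f}, df = {df}, P-value = {p_value:.4f}")
    print(f"95% CI: ({x_bar - m:.3f}, {x_bar + m:.3f})")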


Lesson 12 Recap
  • The key characteristic of dependent samples (or matched pairs) is that knowing which subjects will be in group 1 determines which subjects will be in group 2.
  • We use slightly different variables when conducting inference using dependent samples:
    • Group 1 values: $x_1$
    • Group 2 values: $x_2$
    • Differences: $d$
    • Population mean of the differences: $\mu_d$
    • Sample mean of the differences: $\bar d$
    • Sample standard deviation of the differences: $s_d$
  • When conducting hypothesis tests using dependent samples, the null hypothesis is always $\mu_d=0$, indicating that there is no change between the first population and the second population. The alternative hypothesis can be left-tailed ($<$), right-tailed ($>$), or two-tailed ($\ne$). (See the sketch after this list.)
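
Here is a minimal Python sketch (invented before/after values) of a paired-samples test for the mean of the differences, both by the formula and with SciPy's built-in paired $t$-test.

    # A minimal sketch (invented data): hypothesis test for the mean of the
    # differences with dependent (paired) samples.
    import numpy as np
    from scipy import stats

    before = np.array([210, 195, 188, 220, 203, 199])   # group 1 values (x1)
    after  = np.array([202, 190, 185, 214, 200, 195])   # group 2 values (x2)

    d = before - after            # differences
    n = len(d)
    d_bar = d.mean()              # sample mean of the differences
    s_d = d.std(ddof=1)           # sample standard deviation of the differences

    # H0: mu_d = 0 versus a two-tailed alternative
    t_stat = d_bar / (s_d / np.sqrt(n))
    p_value = 2 * stats.t.sf(abs(t_stat), n - 1)
    print(f"d-bar = {d_bar:.2f}, t = {t_stat:.3f}, df = {n - 1}, P-value = {p_value:.4f}")

    # The same test using SciPy's built-in paired t-test
    print(stats.ttest_rel(before, after))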


Lesson 13 Recap
  • In contrast to dependent samples, two samples are independent if knowing which subjects are in group 1 tells you nothing about which subjects will be in group 2. With independent samples, there is no pairing between the groups.
  • When conducting inference using independent samples we use $\bar x_1$, $s_1$, and $n_1$ for the mean, standard deviation, and sample size, respectively, of group 1. We use the symbols $\bar x_2$, $s_2$, and $n_2$ for group 2.
  • When working with independent samples it is important to graphically illustrate each sample separately. Combining the groups to create a single graph is not appropriate.
  • When conducting hypothesis tests using independent samples, the null hypothesis is always $\mu_1=\mu_2$, indicating that there is no difference between the two populations. The alternative hypothesis can be left-tailed ($<$), right-tailed ($>$), or two-tailed ($\ne$).
  • Whenever zero is contained in the confidence interval for the difference of the true means, we conclude that there is no significant difference between the two populations. (Both ideas are illustrated in the sketch after this list.)
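
The following Python sketch (invented data) runs a two-sample $t$-test for independent samples and builds a confidence interval for $\mu_1-\mu_2$; it uses the unpooled (Welch) approximation that statistical software typically applies, which is an assumption here rather than something stated in this lesson.

    # A minimal sketch (invented data): hypothesis test and confidence interval
    # for the difference of two means with independent samples (unpooled,
    # Welch approximation).
    import numpy as np
    from scipy import stats

    group1 = np.array([23.1, 25.4, 22.8, 24.9, 26.0, 23.7])
    group2 = np.array([21.5, 22.0, 20.8, 23.1, 21.9, 22.4])

    # H0: mu1 = mu2 versus a two-tailed alternative (equal_var=False -> Welch's test)
    t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)
    print(f"t = {t_stat:.3f}, P-value = {p_value:.4f}")

    # 95% confidence interval for mu1 - mu2; if zero falls inside the interval,
    # we do not find a significant difference between the two populations.
    n1, n2 = len(group1), len(group2)
    x1, x2 = group1.mean(), group2.mean()
    s1, s2 = group1.std(ddof=1), group2.std(ddof=1)
    se = np.sqrt(s1**2 / n1 + s2**2 / n2)
    df = se**4 / ((s1**2 / n1)**2 / (n1 - 1) + (s2**2 / n2)**2 / (n2 - 1))
    t_star = stats.t.ppf(0.975, df)
    print(f"95% CI for mu1 - mu2: ({x1 - x2 - t_star * se:.3f}, {x1 - x2 + t_star * se:.3f})")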


Lesson 14 Recap
  • ANOVA is used to compare the means for several groups. The hypotheses for the test are always:

$$ \begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align} $$

  • For ANOVA testing we use an $F$-distribution, which is right-skewed. The $P$-value of an ANOVA test is always the area to the right of the $F$-statistic.
  • We can conduct ANOVA testing when the following three requirements are satisfied (the sketch after this list checks the variance requirement numerically):
    1. The data come from a simple random sample.
    2. The data are normally distributed within each group.
      • This is satisfied when Q-Q Plots for the data in each group roughly follow a straight line.
    3. The variance is constant.
      • This is satisfied when the largest variance is not more than four times the smallest variance.
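
Here is a minimal Python sketch (invented data) of a one-way ANOVA: it checks the variance requirement and then computes the $F$-statistic and its right-tail $P$-value.

    # A minimal sketch (invented data): one-way ANOVA comparing several group means.
    import numpy as np
    from scipy import stats

    group_a = np.array([5.1, 4.8, 5.5, 5.0, 4.9])
    group_b = np.array([5.9, 6.1, 5.7, 6.3, 6.0])
    group_c = np.array([5.2, 5.4, 5.1, 5.6, 5.3])

    # Requirement check: largest sample variance no more than four times the smallest
    variances = [g.var(ddof=1) for g in (group_a, group_b, group_c)]
    print(f"variance ratio = {max(variances) / min(variances):.2f}")

    # H0: all the means are equal; Ha: at least one mean differs.
    # The P-value is the area to the right of the F-statistic.
    f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(f"F = {f_stat:.3f}, P-value = {p_value:.4f}")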


3 Navigation

Previous Reading: Lesson 14: Inference for Several Means (ANOVA)
This Reading: Lesson 15: Review for Exam 2
Next Reading: Lesson 16: Describing Categorical Data: Proportions; Sampling Distribution of a Sample Proportion