Hypothesis Testing

When testing a hypothesis, you are actually making an indirect comparison between the populations your samples come from. Since samples are not perfectly representative of populations, the result of a hypothesis test is a statistically supported guess.

Levels of Significance
To make claims about a population based on testing samples, we accept a set amount of possible error (the chance of being wrong). Accepted values of possible error for behavioral research are 5% (liberal) or 1% (conservative).

We commonly say:

A hypothesis is tested at the .05 level of significance (or alpha equals .05) which means we are willing to tolerate 5% error.

Or

A hypothesis is tested at the .01 level of significance (or alpha equals .01) which means we are willing to tolerate 1% error.

If a hypothesis is significant at the .01 level of significance, it is also significant at the .05 level (this does not work the other way around).

Type I and Type II Error
Because we use samples instead of populations, we are never 100% sure of our results. This means there is always a bit of error. In hypothesis testing, there are two kinds of error: Type I and Type II.

The good news is, you cannot have both at the same time. The bad news is there is a chance of either one occurring.

Type I Error
Also known as Alpha Error, this is when you reject a true Null Hypothesis, meaning you find something significant when it's really not significant.

The probability of making a Type I Error is determined in the decision-making process because it is the level of significance (or alpha level).

When you make a correct rejection by rejecting a false Null Hypothesis, p = (1-beta) which is the probability (p) of being correct (also known as Power).

It is only possible to make a Type I Error if the Null Hypothesis is rejected.
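
A small simulation can make this concrete. The sketch below assumes the Null Hypothesis is true, so each Critical Ratio is just a random draw from the standard Normal Curve, and counts how often a two-tailed test at the .05 level rejects anyway. The function name, the number of trials, and the seed are illustrative choices, and the 1.96 cutoff is the two-tailed value given under Two-Tailed Test For Significance.

```python
import random

def simulate_type_i_rate(cutoff=1.96, trials=100_000, seed=0):
    """Estimate how often a true Null Hypothesis is rejected (two-tailed)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        z = rng.gauss(0.0, 1.0)  # Critical Ratio when only chance is at work
        if abs(z) >= cutoff:     # falls in the Critical Region
            rejections += 1
    return rejections / trials

rate = simulate_type_i_rate()
print(f"Estimated Type I Error rate: {rate:.3f}")  # should land close to .05
```

Every rejection counted here is a Type I Error, since the Null Hypothesis was true in every trial. The estimated rate should sit near the alpha level you chose.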

Type II Error
Also known as Beta Error, this is when you fail to reject a false Null Hypothesis. You don't find significance when it should be significant.

The probability of making a Type II Error is the value of beta, which depends mainly on two factors: the actual difference between the groups and the sample size.

When you make a correct decision by failing to reject a true Null Hypothesis, p = (1-alpha) which is the probability (p) of being correct (this is not Power).

It is only possible to make a Type II Error if the Null Hypothesis is not rejected.

Controlling For Error
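If you want to decrease the likelihood of making a Type II Error without raising the likelihood of a Type I Error, increase the sample size. The likelihood of a Type I Error is set by the level of significance you choose, so a larger sample does not change it; what a larger sample buys you is more Power at that level.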

You are more likely to reject the Null Hypothesis at the .05 than the .01 level of significance. This means you are more likely to have a Type I Error at the .05 than the .01 level of significance. Conversely, you are more likely to have a Type II Error at the .01 than the .05 level of significance.

You are more likely to make a Type I Error with a one-tailed test because it's easier to reject the Null Hypothesis.

You are more likely to make a Type II Error with a two-tailed test because it's harder to reject the Null Hypothesis.

The greater the difference between the samples, the less likely you are to make a Type II Error. The magnitude of the difference depends on the sample size and type of data. Ideally, you want data with a wide range and a large sample size.
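
To see how these factors pull on beta, here is a rough sketch that computes the probability of a Type II Error for a z test comparing two group means. It rests on assumptions that are not in the text above: both groups share the same known standard deviation, both have the same size, and for a one-tailed test the true difference lies in the predicted (positive) direction. The function and parameter names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def type_ii_probability(true_difference, sd, n_per_group, alpha=0.05, tails=2):
    """Beta for a z test on two group means (assumes equal sd and equal n)."""
    se_difference = sd * sqrt(2 / n_per_group)
    expected_z = true_difference / se_difference  # where the Critical Ratio centers
    cutoff = STANDARD_NORMAL.inv_cdf(1 - alpha / tails)
    # Power: chance the Critical Ratio lands in the Critical Region.
    power = 1 - STANDARD_NORMAL.cdf(cutoff - expected_z)
    if tails == 2:
        power += STANDARD_NORMAL.cdf(-cutoff - expected_z)  # the opposite tail
    return 1 - power

baseline = type_ii_probability(5, 10, 20)
bigger_sample = type_ii_probability(5, 10, 80)
stricter_alpha = type_ii_probability(5, 10, 20, alpha=0.01)
one_tailed = type_ii_probability(5, 10, 20, tails=1)
bigger_difference = type_ii_probability(10, 10, 20)

print(f"baseline (two-tailed, .05, n=20): {baseline:.2f}")
print(f"larger sample (n=80):             {bigger_sample:.2f}")
print(f"stricter level (.01):             {stricter_alpha:.2f}")
print(f"one-tailed test:                  {one_tailed:.2f}")
print(f"larger true difference:           {bigger_difference:.2f}")
```

Under these assumptions, beta drops with a larger sample, a larger true difference, or a one-tailed test, and rises when you move from the .05 to the .01 level.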

Null and Alternative Hypothesis
Hypothesis testing consists of two hypotheses: The Null Hypothesis and the Alternative (or Research) Hypothesis.

The Null Hypothesis (H0) is the default or general hypothesis. It takes the position that there is no difference or the difference is not in the expected direction.

The Alternative Hypothesis (H1) is the directed hypothesis. It proposes that there is a difference or that the difference is in the expected direction.


 * For example, a researcher wants to know if watching TV increases violence in schools. The Null Hypothesis is that watching TV does not increase violence in schools. The Alternative Hypothesis is that watching TV increases violence in schools.


 * Another example, a researcher thinks boys and girls have different types of toys. The Null Hypothesis is that boys and girls do not have different types of toys. The Alternative Hypothesis is that boys and girls do have different types of toys.

If the Null Hypothesis is rejected, your results are significant.

If the Null Hypothesis is not rejected, your results are not significant.

It is important to remember the Null Hypothesis is never 'supported'; it is only ever 'rejected' or 'not rejected.'

When you determine significance, your research question determines if you use a two-tailed or one-tailed test. Regardless of doing a one-tailed or two-tailed test, the calculations for hypothesis testing are the same.

Two-Tailed Test
For hypotheses of differences, use a two-tailed test.


 * For example, to find out if non-smokers have different lifespans than smokers, we'd use a two-tailed test.

So two-tailed tests are used to find out if two groups are different. They can also be used to compare two sets of data from the same group.


 * For example, we may want to know if a midterm was the same difficulty as the final exam in a class. The test scores of the midterm can be compared to the test scores of the final exam to determine if they are significantly different.

This is just for determining if a difference is present when we don't care about the direction.

One-Tailed Test
For directional hypotheses, use a one-tailed test.


 * For example, to find out if non-smokers have longer lifespans than smokers, we'd use a one-tailed test.


 * Another example, to find out if non-smokers have a lower chance of having lung cancer than smokers, we'd use a one-tailed test.

One-tailed tests are used to find out if something is better/higher or worse/lower than something else. So not only do we want there to be a difference, we want it to have a direction.

Pretend you're testing out a medication. You wouldn't want it to just have different effects on patients; you want it to have better effects on patients!

Determining Significance
When determining significance, no matter which statistic you are using, all of the same rules apply. Each statistic has its own formulas to determine a Critical Ratio but the hypothesis testing and Critical Regions are still the same. To gain a basic understanding of how a hypothesis test works, the z ratio will be the example.

Using the z ratio (Critical Ratio) you can determine if a hypothesis is significant. If you have a computer program, obtaining the z ratio is easy but if you want to do it by hand... enjoy.

Calculating the Critical Ratio
There are 3 steps to calculating the Critical Ratio:


 * (1) Calculate the Standard Error of Means for both groups (or samples).


 * (2) Calculate the Standard Error of Difference Between Means.


 * (3) Calculate the Critical Ratio. This is the number used to determine if the hypothesis is significant or not.
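
The three steps above can be sketched in a few lines of code. The formulas used here are the usual textbook ones for two independent groups and are an assumption on our part: the Standard Error of the Mean is the sample standard deviation divided by the square root of n, the Standard Error of Difference Between Means is the square root of the sum of the two squared standard errors, and the Critical Ratio is the difference between the means divided by that value. The function names and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error_of_mean(sample):
    """Step 1: sample standard deviation divided by the square root of n."""
    return stdev(sample) / sqrt(len(sample))

def standard_error_of_difference(sample_a, sample_b):
    """Step 2: combine the two standard errors (independent groups assumed)."""
    return sqrt(standard_error_of_mean(sample_a) ** 2
                + standard_error_of_mean(sample_b) ** 2)

def critical_ratio(sample_a, sample_b):
    """Step 3: difference between the means in standard-error units."""
    difference = mean(sample_a) - mean(sample_b)
    return difference / standard_error_of_difference(sample_a, sample_b)

# Made-up scores for two groups, purely for illustration.
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [9, 11, 10, 8, 12, 10, 9, 11]

z = critical_ratio(group_a, group_b)
print(f"Critical Ratio: {z:.2f}")
```

A positive Critical Ratio means the first group's mean is higher; a negative one means it is lower. The groups here are kept tiny so the arithmetic is easy to follow by hand.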

Test the Null Hypothesis
Whether you are doing a one-tailed or two-tailed test determines the cutoff for your Critical Ratio. Once you find your cutoff, you can find out if the results are significant!

Remember how the Normal Curve represented the probability of being within so many standard deviations of the mean? Those probabilities are still the same but it's being used in a different way. When testing for significance, the Normal Curve is used to determine if the Critical Ratio is different from chance.

In other words, it shows you whether the difference between the groups is a real difference and not just luck.

The cutoff determines your Critical Region which is the range of standard deviations away from the mean the Critical Ratio has to be in order to be significant. This region is the area of the normal curve where you want your Critical Ratio to fall.

One-Tailed Test For Significance
Determine the level of significance (the amount of error you are willing to tolerate). Since we are using the Normal Curve, we already know the probabilities of being in different regions. This makes it easy to compare that value to our Critical Ratio. Next, we determine the direction of the Alternative Hypothesis. Once we know these, we can find the Critical Region.

If your direction is negative:

For a .05 level of significance (5% chance of error), the Critical Ratio must be at or lower than -1.65. The area under the Normal Curve beyond -1.65 is the Critical Region. This area under the curve is equal to 5% of the total area (or probability).

For a .01 level of significance (1% chance of error), the Critical Ratio must be at or lower than -2.33. The area under the Normal Curve beyond -2.33 is the Critical Region. This area under the curve is equal to 1% of the total area (or probability).

If your direction is positive:

For a .05 level of significance (5% chance of error), the Critical Ratio must be at or higher than +1.65. The area under the Normal Curve beyond +1.65 is the Critical Region. This area under the curve is equal to 5% of the total area (or probability).

For a .01 level of significance (1% chance of error), the Critical Ratio must be at or higher than +2.33. The area under the Normal Curve beyond +2.33 is the Critical Region. This area under the curve is equal to 1% of the total area (or probability).
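
If you would rather look up these cutoffs than memorize them, the sketch below derives them from the Normal Curve using Python's standard library. The function names and the "positive"/"negative" labels are made up for this example. The computed values are about 1.645 and 2.326, which tables commonly round to the 1.65 and 2.33 used above.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def one_tailed_cutoff(alpha, direction):
    """Return the z cutoff that leaves alpha of the area in one tail."""
    z = STANDARD_NORMAL.inv_cdf(1 - alpha)  # alpha of the area lies above z
    return z if direction == "positive" else -z

def in_one_tailed_region(critical_ratio, alpha, direction):
    """True if the Critical Ratio falls in the one-tailed Critical Region."""
    cutoff = one_tailed_cutoff(alpha, direction)
    if direction == "positive":
        return critical_ratio >= cutoff
    return critical_ratio <= cutoff

for alpha in (0.05, 0.01):
    for direction in ("negative", "positive"):
        cutoff = one_tailed_cutoff(alpha, direction)
        print(f"alpha={alpha}, {direction}: cutoff {cutoff:+.3f}")
```

For example, a Critical Ratio of -2.0 with a negative direction falls in the Critical Region at the .05 level but not at the .01 level. The same value with a positive direction is not in the Critical Region at all.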

Two-Tailed Test For Significance
Determine the level of significance (the amount of error you are willing to tolerate). Then you can find your Critical Region. Since this is two-tailed, the amount of error is divided between the two tails. That means for the .05 level of significance, we want .025 (or 2.5%) probability in each tail. For the .01 level of significance, we want .005 (or 0.5%) probability in each tail.

For a .05 level of significance (5% chance of error), the Critical Ratio must be at or beyond ±1.96 (at or below -1.96, or at or above +1.96). The area under the Normal Curve beyond ±1.96 is the Critical Region. This area is equal to 5% of the total area (or probability) with 2.5% in each tail.

For a .01 level of significance (1% chance of error), the Critical Ratio must be at or beyond ±2.58 (at or below -2.58, or at or above +2.58). The area under the Normal Curve beyond ±2.58 is the Critical Region. This area is equal to 1% of the total area (or probability) with 0.5% in each tail.
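
As a quick check on splitting the error between the tails, this sketch computes the two-tailed cutoffs from the Normal Curve and then adds the area in both tails back up. The function names are illustrative. The cutoffs come out near 1.96 and 2.58.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def two_tailed_cutoff(alpha):
    """Return the positive z cutoff that leaves alpha/2 in each tail."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)

def in_two_tailed_region(critical_ratio, alpha):
    """True if the Critical Ratio is at or beyond the cutoff in either tail."""
    return abs(critical_ratio) >= two_tailed_cutoff(alpha)

for alpha in (0.05, 0.01):
    cutoff = two_tailed_cutoff(alpha)
    area_per_tail = 1 - STANDARD_NORMAL.cdf(cutoff)
    print(f"alpha={alpha}: cutoff {cutoff:.2f}, "
          f"{area_per_tail:.4f} per tail, {2 * area_per_tail:.4f} in total")
```

Because the sign of the Critical Ratio is ignored, a value of -2.1 and a value of +2.1 are treated the same way: both are significant at the .05 level and neither is significant at the .01 level.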

Reject or Fail to Reject the Null Hypothesis
If your Critical Ratio falls in the Critical Region, then you reject the Null Hypothesis.

If your Critical Ratio does not fall in the Critical Region, then you fail to reject the Null Hypothesis.
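
Putting the pieces together, the decision can be written as one small function. This is only a sketch: the function name, its parameters, and the lookup table layout are our own, and the cutoffs are the rounded values given in the sections above.

```python
# Cutoffs as given in the text, keyed by (number of tails, level of significance).
CUTOFFS = {
    (1, 0.05): 1.65,
    (1, 0.01): 2.33,
    (2, 0.05): 1.96,
    (2, 0.01): 2.58,
}

def decide(critical_ratio, alpha=0.05, tails=2, direction="positive"):
    """Return 'reject' or 'fail to reject' for the Null Hypothesis."""
    cutoff = CUTOFFS[(tails, alpha)]
    if tails == 2:
        in_region = abs(critical_ratio) >= cutoff  # either tail counts
    elif direction == "positive":
        in_region = critical_ratio >= cutoff       # upper tail only
    else:
        in_region = critical_ratio <= -cutoff      # lower tail only
    return "reject" if in_region else "fail to reject"

# The same Critical Ratio can lead to different decisions.
print(decide(1.80, alpha=0.05, tails=1, direction="positive"))
print(decide(1.80, alpha=0.05, tails=2))
```

With a Critical Ratio of 1.80, the one-tailed test at the .05 level rejects the Null Hypothesis (1.80 is beyond 1.65), while the two-tailed test at the same level fails to reject it (1.80 is short of 1.96).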