Type I and Type II Errors

When we draw an inference about a population parameter from a sample statistic, we can make two types of errors: Type I and Type II.

A Type I error, also known as an error of the first kind, occurs when the null hypothesis is true but is rejected.

A Type II error, also known as an error of the second kind, occurs when the null hypothesis is false but is not rejected.
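The two definitions can be written as a small lookup on two facts: whether the null hypothesis is actually true, and whether the test rejected it. The sketch below is only illustrative; the function name `classify_outcome` and its return strings are assumptions made for this example, not standard terminology from any library.

```python
def classify_outcome(null_is_true: bool, rejected_null: bool) -> str:
    """Label the result of a test given the (normally unknown) truth."""
    if null_is_true and rejected_null:
        # H0 is true, but the test rejected it.
        return "Type I error"
    if not null_is_true and not rejected_null:
        # H0 is false, but the test failed to reject it.
        return "Type II error"
    # Either a true H0 was retained or a false H0 was rejected.
    return "correct decision"


for truth in (True, False):
    for rejected in (True, False):
        print(f"H0 true={truth}, rejected={rejected}: "
              f"{classify_outcome(truth, rejected)}")
```

In practice we never know whether the null hypothesis is true, so this table describes what can go wrong rather than something we can check for a single test.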

When we conduct a significance test, we first define the null hypothesis (H0) and the alternative hypothesis (Ha), both stated with reference to a population. The null hypothesis usually represents the accepted or default assumption about the population parameter, and the alternative hypothesis is the competing claim. The test asks whether the sample provides enough evidence to reject the null hypothesis in favor of the alternative. The conclusion is therefore one of two outcomes: we either reject the null hypothesis or we fail to reject it.

To perform the test, we take a sample from the population and use it to calculate a test statistic. The test statistic is a function of the sample data, and it is what we use to decide whether the null hypothesis should be rejected.
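As one concrete case of a test statistic, the sketch below computes a one-sample z statistic. It assumes the population standard deviation is known, which the text does not state; the sample values, the hypothesized mean `mu0`, and `sigma` are made-up numbers for illustration only.

```python
import math
import statistics


def z_statistic(sample, mu0, sigma):
    """One-sample z statistic: (sample mean - mu0) / (sigma / sqrt(n)).

    Assumes sigma, the population standard deviation, is known.
    """
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical data: H0 says the population mean is 50.
sample = [52, 48, 55, 51, 49, 53, 50, 54]
z = z_statistic(sample, mu0=50, sigma=4)
print(f"z = {z:.4f}")
```

The statistic measures how many standard errors the sample mean lies from the value claimed by the null hypothesis.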

We then calculate the probability of getting a test statistic at least as extreme as the one observed, assuming our null hypothesis is true. If this value, known as the p-value, is below a chosen threshold (the significance level, α), we reject the null hypothesis.

p-value < α => Reject H0

p-value >= α => Fail to Reject H0
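The decision rule can be sketched in a few lines. The example assumes a two-sided z test, so the p-value comes from the standard normal distribution; the helper names `two_sided_p_value` and `decide` are hypothetical and chosen only for this illustration.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a z statistic at least as extreme as |z| under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when the p-value is below alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


for z in (0.5, 1.5, 2.5):
    p = two_sided_p_value(z)
    print(f"z = {z}: p-value = {p:.4f} -> {decide(p)}")
```

Note that a p-value exactly equal to α falls on the "fail to reject" side, matching the `>=` in the rule.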

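If the significance level is 5%, we are saying that when the p-value (the probability of getting a statistic at least as extreme as the observed one from a sample of this size, assuming the null hypothesis is true) is less than the 5% threshold, it is reasonable to reject the null hypothesis. It also means that when the null hypothesis is actually true, we will wrongly reject it, and so commit a Type I error, about 5% of the time.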
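A small simulation can make the link between the significance level and the two error types more tangible. This is a sketch under stated assumptions, not a definitive procedure: it uses a two-sided z test with known standard deviation, normally distributed data, and arbitrary illustrative settings (`n=30`, a true mean of 0.5 for the "H0 is false" case, 5000 trials). The function name `rejection_rate` is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=30,
                   alpha=0.05, trials=5000, seed=0):
    """Fraction of simulated samples for which H0: mean == mu0 is rejected."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / trials


# H0 is true (true mean equals mu0): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (true mean is 0.5): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

Under these assumptions the estimated Type I error rate should land close to the chosen α of 0.05, while the Type II error rate depends on how far the true mean is from the hypothesized one and on the sample size.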
