Type I and Type II Errors

When drawing an inference (from a sample statistic, about a population parameter), there can be two types of errors: Type I and Type II.

A Type I error, also known as an error of the first kind, occurs when the null hypothesis is true but is rejected.

A Type II error, also known as an error of the second kind, occurs when the null hypothesis is false but is not rejected.

When we conduct a significance test, we first define the null hypothesis (H0) and the alternative hypothesis (Ha). Both are statements about a population parameter. The null hypothesis usually represents the commonly accepted assumption about that parameter, and the alternative hypothesis is the competing claim. The test asks whether the sample provides enough evidence against the null hypothesis to support the alternative. The conclusion of the hypothesis test is always one of two outcomes: we either reject the null hypothesis or we fail to reject it.

To perform the test, we take a sample from the population, and using this sample we calculate the test statistic which is used to make a decision about whether the null hypothesis should be rejected or not. This test statistic is a function of the sample data.

We want to calculate the probability of getting a test statistic at least as extreme as the one observed, assuming our null hypothesis is true. If this value, known as the p-value, is below a certain threshold (the significance level, α), then we reject the null hypothesis.

p-value < α => Reject H0

p-value >= α => Fail to Reject H0
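The decision rule above can be sketched as a small Python function. The function name `decide` and its default `alpha` of 0.05 are illustrative choices, not something the rule itself prescribes.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 only when the p-value is strictly below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


print(decide(0.03))  # below the 5% threshold
print(decide(0.05))  # equal to the threshold counts as not below it
```

Note that a p-value exactly equal to α falls on the "fail to reject" side, matching the `>=` in the rule.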

If the significance level is 5%, then what we are saying is that if the p-value (the probability of getting a statistic at least this extreme from a sample of this size, assuming the null hypothesis is true) is less than the threshold of 5%, then it's reasonable to reject the null hypothesis.

However, our conclusion may be wrong: the decision we reach may not match the true state of the population. When that happens, we have made a Type I or a Type II error. This is possible because we are conducting our significance test based on a sample and not on the entire population.

The following table provides a clear view of the Type I and Type II errors.

| Decision | H0 is true | H0 is false |
| --- | --- | --- |
| Reject H0 | Type I Error | Correct |
| Fail to reject H0 | Correct | Type II Error |

There are four scenarios:

Scenario 1: In reality, the null hypothesis is true, but we reject it. This is called the Type I error, or false positive. When the null hypothesis is true, the probability of making a Type I error is equal to our significance level (α).

Scenario 2: In reality, the null hypothesis is false, and through our test, we reject it. This is a correct conclusion of the hypothesis test.

Scenario 3: In reality, the null hypothesis is true, and we fail to reject it. This is also a correct conclusion of the hypothesis test.

Scenario 4: In reality, the null hypothesis is false, and through our test, we fail to reject it. This is called the Type II error, or false negative.
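The link between the significance level and the Type I error rate can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the original discussion: it assumes a two-sided z-test of H0: mean = 0 on normally distributed data with known standard deviation 1, and the function name, sample size, trial count, and seed are all arbitrary choices. Because every sample is generated with the null hypothesis true, each rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Estimate how often a two-sided z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, used for the p-value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mean = 0, sd = 1) holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = (sample mean - 0) / (1 / sqrt(n))
        z = fmean(sample) * n ** 0.5
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # H0 is true here, so this is a Type I error
    return rejections / trials


print(type_i_error_rate())
```

With a continuous test statistic like this one, the estimated rate should land close to the chosen α, up to simulation noise.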

As an example, let's say Company A produces electric switches, and 5% of them are defective. Company B claims that it produces a smaller proportion of defective switches. Let p denote the proportion of Company B's switches that are defective. The null and alternative hypotheses are stated as follows:

**H0: p = 0.05 versus Ha: p < 0.05**

To test the hypothesis, we may draw a sample of 200 switches from Company B and use it to calculate our test statistic.

A Type I error is committed if the test rejects the null hypothesis and concludes that Company B makes fewer than 5% defective switches even though, in reality, 5% of its switches are defective.

A Type II error is committed if the test fails to reject the null hypothesis, leaving the 5% defect rate standing, even though, in reality, Company B makes fewer than 5% defective switches.
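To make the two error probabilities concrete for the switch example, the following sketch uses an exact lower-tail binomial test with n = 200, p = 0.05 under H0, and α = 0.05. The choice of an exact binomial test, the helper names, and in particular the value `assumed_true_p = 0.02` are assumptions made for illustration; the Type II error probability can only be computed once some specific true defect rate is assumed.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def critical_value(n, p0, alpha):
    """Largest defect count c whose lower-tail p-value P(X <= c | p0) is below alpha.

    Returns -1 if even zero defects would not lead to rejection.
    """
    c = -1
    while c + 1 <= n and binom_cdf(c + 1, n, p0) < alpha:
        c += 1
    return c


n, p0, alpha = 200, 0.05, 0.05
assumed_true_p = 0.02  # hypothetical true defect rate for Company B

c = critical_value(n, p0, alpha)  # reject H0 when the sample has <= c defects
type_i = binom_cdf(c, n, p0)  # P(reject H0 | p = 0.05)
type_ii = 1 - binom_cdf(c, n, assumed_true_p)  # P(fail to reject | p = 0.02)

print(f"Reject H0 if defects <= {c}")
print(f"P(Type I error)  = {type_i:.4f}")
print(f"P(Type II error) = {type_ii:.4f}")
```

Because the number of defects is a whole number, the achieved Type I error probability here sits below the nominal 5% rather than exactly at it. Changing `assumed_true_p` shows how the Type II error probability grows as the true defect rate moves closer to 5%.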
