Type 1 and Type 2 Errors
When you are testing hypotheses, you might encounter type 1 and type 2 errors. Identifying them and dealing with them is essential for setting up statistical testing scenarios. They also play a huge role in machine learning.
What is a Type 1 Error in Statistics?
When you reject the null hypothesis although it is true, you are committing a type 1 error.
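To make the definition concrete, here is a minimal simulation sketch in Python. It assumes a two-sided z-test with a known standard deviation and a significance level of 0.05; the function names (z_test_rejects, type1_error_rate) and all parameter values are illustrative choices rather than anything prescribed above. Every sample is drawn from a distribution in which the null hypothesis is true, so each rejection counts as a type 1 error, and the fraction of rejections should land close to the chosen significance level.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test: return True if H0 (mean == mu0) is rejected."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def type1_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a true H0 is rejected (hypothetical setup)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Assumption: data really come from N(0, 1), so H0 (mean == 0) is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1  # rejecting a true H0 is a type 1 error
    return rejections / trials


rate = type1_error_rate()
print(f"estimated type 1 error rate: {rate:.3f}")
```

In this sketch the significance level is the knob that controls the type 1 error rate: lowering alpha makes false rejections rarer.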
Type 1 Error Example
Assume you are checking into a hotel room. The hotel staff tells you that it is a safe area. Your null hypothesis is that the area is safe and that nobody will break into your room. But you are not convinced. As a precaution, you decide not to bring any valuables to your hotel room.
By the end of your stay, nobody has broken into your room. It would have been safe to bring valuables. So you have committed a type 1 error by rejecting the null hypothesis even though it was true.
What is a Type 2 Error in Statistics?
When you fail to reject the null hypothesis even though it is false, you are committing a type 2 error.
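The type 2 error rate (often called beta) can be computed directly in simple cases. The following sketch assumes the same kind of two-sided z-test with known standard deviation; the function name type2_error_rate, the true mean of 0.5, and the sample size of 30 are hypothetical values picked for illustration. It calculates the probability that the test statistic stays inside the non-rejection region even though the null hypothesis is false.

```python
from math import sqrt
from statistics import NormalDist


def type2_error_rate(true_mean, mu0=0.0, sigma=1.0, n=30, alpha=0.05):
    """Probability that a two-sided z-test fails to reject H0: mean == mu0
    when the data actually come from a normal distribution with mean true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, the z statistic is centered at this shift, not at 0.
    shift = (true_mean - mu0) / (sigma / sqrt(n))
    # H0 is retained whenever the z statistic falls between -z_crit and +z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


beta = type2_error_rate(true_mean=0.5)
power = 1 - beta  # probability of correctly rejecting the false H0
print(f"type 2 error rate: {beta:.3f}, power: {power:.3f}")
```

Under these assumptions, a larger sample or a larger gap between the true mean and mu0 pushes the type 2 error rate down.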
Type 2 Error Example
Since your last trip was safe, you decide to believe that the area is safe this time and you bring your valuables. In other words, you fail to reject the null hypothesis. On the second day, you find that all your valuables are gone. You have committed a type 2 error by failing to reject the null hypothesis even though it was false.
Considerations for test design
When designing your statistical tests, you ideally want to reduce your overall error rate. In many cases though, you can’t have it both ways. You have to give priority to reducing one type of error, which often comes at the cost of accepting an increased rate of the other. Which type of error to prioritize depends mainly on the situation.
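The trade-off can be illustrated numerically. This is a rough sketch that assumes a two-sided z-test with a fixed sample size and a fixed true effect; the name beta_for_alpha and the values for effect, sigma, and n are hypothetical. It shows that, with everything else held constant, tightening the significance level (fewer type 1 errors) raises the type 2 error rate.

```python
from math import sqrt
from statistics import NormalDist


def beta_for_alpha(alpha, effect=0.5, sigma=1.0, n=30):
    """Type 2 error rate of a two-sided z-test for a given significance level.
    Assumes the true mean differs from the H0 mean by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / sqrt(n))
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Stricter alpha values from left to right (illustrative choices).
alphas = [0.10, 0.05, 0.01, 0.001]
betas = [beta_for_alpha(a) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:<6} beta = {b:.3f}")
```

In this setup, the only ways to lower both error rates at once are to collect more data (a larger n) or to look for a larger effect.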
If we go back to the example with the hotel room, most people would probably agree that committing a type 2 error is more serious. If you trust that the place is safe, bring your diamond ring, and it ends up being stolen, you have suffered a real loss. If you don’t trust that the place is safe (even though it is) and you leave your valuables at home, you might experience the inconvenience of not being able to wear your diamond ring for a few days. But the ring is still at home and you haven’t suffered a financial loss.
In this case, you would want to reduce the type 2 error rate even at the cost of spending a few more safe trips without your precious diamond ring (increased type 1 error rate).
Now imagine you are a judge in court. You have to decide whether to convict a person accused of a crime. The evidence is not clear. Your null hypothesis (H0) is that the person is innocent.
If you convict that person and he or she turns out to be innocent, you have committed a type 1 error.
If you let that person go and it turns out he or she was guilty, you have committed a type 2 error.
Most reasonable people would probably agree that putting an innocent person in jail is worse than letting a guilty person go. So you want to reduce type 1 errors and make sure you only convict people who are truly guilty, even at the expense of letting a few more guilty people go (type 2 error).