Whenever we discuss prediction models, it’s important to understand the two main sources of prediction error: bias and variance. A proper understanding of these concepts helps us not only to build accurate models but also to avoid the mistakes of over-fitting and under-fitting.

We quickly explain the two concepts using the following illustration.

Bias vs Variance (source: Quora)

Suppose that a man is trying to hit the bull’s eye. His shooting skill can be considered the prediction model, and the results of his shots are the model’s predictions.

What is bias?

  • Bias is the difference between the average prediction and the correct value.
  • If the shots land, on average, far from the bull’s eye, the bias is high; if they center on the bull’s eye, the bias is low.
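
As a rough sketch of this idea in the shooting picture, the snippet below measures bias as the offset of the average shot from the bull’s eye. The shot coordinates and the names `shots`, `bulls_eye` and `bias_distance` are made up for illustration and are not taken from the figure.

```python
from statistics import mean

# Hypothetical shot positions (x, y) in cm; the bull's eye sits at (0, 0).
bulls_eye = (0.0, 0.0)
shots = [(2.1, 1.9), (1.8, 2.2), (2.0, 2.0), (2.3, 1.7), (1.8, 2.2)]

# Average landing position of all shots (the "average prediction").
avg_x = mean(x for x, _ in shots)
avg_y = mean(y for _, y in shots)

# Bias: offset of the average shot from the target, and its length.
bias = (avg_x - bulls_eye[0], avg_y - bulls_eye[1])
bias_distance = (bias[0] ** 2 + bias[1] ** 2) ** 0.5

print("average shot:", (avg_x, avg_y))
print("distance of average shot from bull's eye:", bias_distance)
```

Here the shots are tightly grouped but centered about 2 cm up and to the right of the target, so this shooter has high bias even though each shot is consistent with the others.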

Some causes of high bias:

  • An oversimplified model
  • Key features left out of the model
  • Not enough data
  • The wrong choice of model

What is variance?

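  • Variance shows the spread of the model’s predictions: in the illustration, how widely the shots scatter around their own average position.
  • Equivalently, it is the variability of the model’s prediction for a given data point or value.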
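
A minimal sketch of variance in the same shooting picture, assuming we measure it as the mean squared distance of the shots from their own average position. The two sets of shots and the helper `spread` are hypothetical and only serve the illustration.

```python
from statistics import mean

def spread(shots):
    """Mean squared distance of the shots from their own average position."""
    cx = mean(x for x, _ in shots)
    cy = mean(y for _, y in shots)
    return mean((x - cx) ** 2 + (y - cy) ** 2 for x, y in shots)

# Hypothetical shooters: one tightly grouped but off target,
# one centered on the bull's eye (0, 0) but widely scattered.
tight_but_off = [(2.1, 1.9), (1.8, 2.2), (2.0, 2.0), (2.3, 1.7), (1.8, 2.2)]
centered_but_scattered = [(3.0, 0.0), (-3.0, 0.0), (0.0, 3.0), (0.0, -3.0)]

print("tight group:", spread(tight_but_off))
print("scattered group:", spread(centered_but_scattered))
```

Note that the spread is measured around the average shot, not around the bull’s eye, so the first shooter has low variance despite missing the target, while the second has high variance despite being centered on it.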

Some causes of high variance:

  • Noisy training dataset
  • Sparse dataset
  • An algorithm that fits the training data too closely and fails to generalize to the underlying patterns

Overfitting and Underfitting

Fitting examples (Source: Medium)
  • Under-fitting: often high bias + low variance
  • Over-fitting: often low bias + high variance; performs well on the training dataset but poorly on the testing dataset
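
The contrast can be sketched with two deliberately extreme models on made-up data. Everything here is an assumption chosen for illustration: the sine-shaped target `true_f`, the noise level, the constant "always predict the mean" model standing in for under-fitting, and the 1-nearest-neighbor model standing in for over-fitting.

```python
import math
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

def true_f(x):
    # Hypothetical underlying pattern the models should learn.
    return math.sin(2 * math.pi * x)

def make_data(xs, noise=0.3):
    # Observed value = true pattern + Gaussian noise.
    return [(x, true_f(x) + random.gauss(0, noise)) for x in xs]

train = make_data([i / 20 for i in range(20)])
test = make_data([(i + 0.5) / 20 for i in range(20)])

# Under-fitting stand-in: ignores x and always predicts the training mean.
train_mean = mean(y for _, y in train)

def underfit(x):
    return train_mean

# Over-fitting stand-in: 1-nearest-neighbor, which memorizes every
# training point, noise included.
def overfit(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    # Mean squared error of a model on a list of (x, y) pairs.
    return mean((model(x) - y) ** 2 for x, y in data)

for name, model in [("underfit", underfit), ("overfit", overfit)]:
    print(name, "train MSE:", round(mse(model, train), 3),
          "test MSE:", round(mse(model, test), 3))
```

The memorizing model reproduces the training data exactly, so its training error is zero, yet it still makes errors on the unseen test points. The constant model has a clearly non-zero error even on the data it was trained on, because it cannot follow the pattern at all.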

Tung Nguyen

PhD/Researcher/Programmer at Up Education - YooBee Colleges
I received the B.Eng. Degree in telecommunication from Shanghai University, MSc of the same major from Paris-Sud University and a PhD degree in Computer Science from Auckland University of Technology. My research interests include machine learning, game theory, computational trust, multi-agent systems and software engineering.