Created date
Jun 16, 2022 01:21 PM
Data Science
Machine Learning


In supervised learning, the prediction error has a reducible part (bias and variance) and an irreducible part.
As the flexibility of a model increases, the fitted model typically has lower bias and higher variance.
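The split into reducible and irreducible parts can be written out for squared-error loss. This is the standard decomposition, stated here under assumptions these notes do not spell out: the data follow $y = f(x) + \varepsilon$ with $E[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat f$ is the model fitted on a randomly drawn training set. The expectation is over training sets and the noise at the test point $x_0$.

```latex
\mathbb{E}\big[(y_0 - \hat f(x_0))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat f(x_0) - \mathbb{E}[\hat f(x_0)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible}}
```

The first two terms depend on the choice of model and can be reduced; the last term is the noise in the data and cannot.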
Bias means simplicity; variance means sensitivity.


  • Bias is error due to overly simplistic assumptions in the model you’re using. (Underfitting)
  • When a model is biased, we are forcing our data into constraints that don’t reflect the true relationship between the variables.
    • Can lead to the model underfitting your data, making it hard to reach high predictive accuracy.
    • How far off, on average, the model is from the truth.
    • You have the wrong model but an accurate fit.
    • On the upside, a simpler (more biased) model tends to generalize better and is less sensitive to single data points.
    • A simpler model also takes less time to train, because it is less complex.


  • Variance is error due to too much complexity in the model you’re using. (Overfitting)
    • Compares the model with itself: how much the fit changes from one training set to another.
    • Can lead to the algorithm being highly sensitive to variation in your training data, which can cause your model to overfit the data.
    • You’ll be carrying too much noise from your training data for your model to be very useful on your test data.
    • How much the estimate varies around its average.
    • You have the right model but an inaccurate fit.
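The two definitions above ("how far off on average" and "how much the estimate varies around its average") can be estimated by simulation. The following is a minimal sketch, not taken from these notes: the true function, noise level, test point and the two toy models (a constant mean predictor and a 1-nearest-neighbour predictor) are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def true_f(x):
    # Hypothetical ground truth, chosen only for this illustration.
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    # Very simple model: ignore x0 and predict the average target.
    return statistics.fmean(ys)


def predict_1nn(xs, ys, x0):
    # Very flexible model: copy the target of the closest training point.
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def bias2_and_variance(predict, x0=0.25, n_sets=2000, seed=0):
    # Refit the model on many independent training sets and look at
    # the spread of its predictions at a single test point x0.
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(n_sets)]
    avg = statistics.fmean(preds)
    bias2 = (avg - true_f(x0)) ** 2                 # squared distance of the average from the truth
    variance = statistics.pvariance(preds, mu=avg)  # spread around the average
    return bias2, variance


if __name__ == "__main__":
    for name, model in [("mean", predict_mean), ("1-NN", predict_1nn)]:
        b2, var = bias2_and_variance(model)
        print(f"{name}: bias^2={b2:.4f} variance={var:.4f}")
```

Under these assumptions the mean predictor should show the larger squared bias and the 1-nearest-neighbour predictor the larger variance, which matches the "simplicity" versus "sensitivity" reading of the two terms.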


We tune the algorithm's parameters to achieve:
  • low bias (underlying pattern not too simplified)
  • low variance (not sensitive to specificities of the training data)
  • When we don't fit very hard (i.e., underfitting), the bias is high and the variance is low, because only a few parameters are being fit.
  • But as we increase the model complexity (moving to the right on a plot of error against complexity),
    • the bias goes down, because the model can adapt to more and more subtleties in the data
    • but the variance goes up, because we have more and more parameters to estimate from the same amount of data
    • so squared bias and variance together make up the reducible prediction error, and there is a trade-off: the total error is lowest at an intermediate complexity, where neither term dominates.
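The trade-off along the complexity axis can be sketched with k-nearest-neighbour regression, where a smaller k means a more flexible model. This is an illustrative example under assumed settings (a sine-shaped true function, Gaussian noise, and the sample sizes below are hypothetical and not from these notes).

```python
import math
import random


def true_f(x):
    # Hypothetical ground truth used only for this sketch.
    return math.sin(2 * math.pi * x)


def make_data(rng, n, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, eval_x, eval_y, k):
    errs = [(y - knn_predict(train_x, train_y, x, k)) ** 2
            for x, y in zip(eval_x, eval_y)]
    return sum(errs) / len(errs)


def sweep(ks=(100, 50, 25, 10, 5, 1), n_train=100, n_test=500, seed=0):
    # ks is ordered from the simplest model (k = n_train, a global mean)
    # to the most flexible one (k = 1).
    rng = random.Random(seed)
    train_x, train_y = make_data(rng, n_train)
    test_x, test_y = make_data(rng, n_test)
    return {k: (mse(train_x, train_y, train_x, train_y, k),
                mse(train_x, train_y, test_x, test_y, k)) for k in ks}


if __name__ == "__main__":
    for k, (train_err, test_err) in sweep().items():
        print(f"k={k:3d} train MSE={train_err:.3f} test MSE={test_err:.3f}")
```

With k = 1 the training error is exactly zero because every training point is its own nearest neighbour, while the test error is expected to be lowest at some intermediate k rather than at either extreme.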

High variance or bias

  • High variance causes an algorithm to model the outliers/noise in the training set.
  • High bias suggests that the model makes overly strong (too simple) assumptions about the data.


  • The trade-off applies to all forms of supervised learning: classification, regression, and structured output learning.
    • Examples of high-bias algorithms include linear regression and logistic regression.
    • Examples of high-variance algorithms include decision trees (when grown deep) and KNN (with small k).

Solution - How to fix bias and variance problems

Fixing High Bias

Add more input features so the model can fit the data better.
Add polynomial features to increase the complexity of the model.
Decrease the regularization term to rebalance bias against variance (less regularization means lower bias and higher variance).
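How the regularization term moves a model between bias and variance can be sketched with a one-feature ridge regression without intercept, where the estimate has the closed form sum(x*y) / (sum(x*x) + lam). The true slope, noise level and sample size below are hypothetical values chosen for the illustration.

```python
import random
import statistics

TRUE_SLOPE = 2.0  # hypothetical ground truth for the simulation


def ridge_slope(xs, ys, lam):
    # Closed-form ridge estimate for a one-feature model without intercept.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def bias2_and_variance(lam, n=20, noise_sd=1.0, n_sets=2000, seed=0):
    # Refit on many independent training sets and summarise the estimates.
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        ys = [TRUE_SLOPE * x + rng.gauss(0.0, noise_sd) for x in xs]
        estimates.append(ridge_slope(xs, ys, lam))
    avg = statistics.fmean(estimates)
    return (avg - TRUE_SLOPE) ** 2, statistics.pvariance(estimates, mu=avg)


if __name__ == "__main__":
    for lam in (0.0, 1.0, 10.0, 100.0):
        b2, var = bias2_and_variance(lam)
        print(f"lambda={lam:6.1f} bias^2={b2:.4f} variance={var:.4f}")
```

In this setup a larger lam shrinks the slope toward zero, so the squared bias should grow while the variance shrinks; lowering lam does the opposite, which is why it is a remedy for high bias.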

Fixing High Variance

Reduce the input features: keep only the features with the highest importance to reduce overfitting.
Getting more training data helps in this case, because a high-variance model will not work well on an independent dataset if it was trained on very little data.


For a good model, the validation error and training error converge as the data set grows, and the error is low.
  1. The training error and validation error tend to converge with more data, but more data does not help if both errors have already converged to their best achievable values.
  2. From the learning curve, if the model is underfitting (high bias), we can try increasing the model complexity by adding more features, or decreasing the regularisation parameter.
  3. If the model is overfitting (high variance), we can try simplifying the model by using a smaller set of features, or increasing the regularisation parameter.
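A learning curve of the kind described in the points above can be sketched as follows. This is a minimal example for the "good model" case only, assuming the data really are linear and fitting a straight line by ordinary least squares; the slope, intercept, noise level and training sizes are hypothetical.

```python
import random


def make_data(rng, n, slope=1.5, intercept=0.5, noise_sd=1.0):
    # Hypothetical linear data-generating process with Gaussian noise.
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    # Ordinary least squares for one feature plus an intercept.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(model, xs, ys):
    slope, intercept = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(sizes=(5, 20, 100, 500), n_val=1000, repeats=50, seed=0):
    # For each training size, average the training and validation error
    # over several independently drawn training sets.
    rng = random.Random(seed)
    val_x, val_y = make_data(rng, n_val)
    curve = {}
    for n in sizes:
        train_err = val_err = 0.0
        for _ in range(repeats):
            xs, ys = make_data(rng, n)
            model = fit_line(xs, ys)
            train_err += mse(model, xs, ys) / repeats
            val_err += mse(model, val_x, val_y) / repeats
        curve[n] = (train_err, val_err)
    return curve


if __name__ == "__main__":
    for n, (train_err, val_err) in learning_curve().items():
        print(f"n={n:4d} train MSE={train_err:.3f} validation MSE={val_err:.3f}")
```

In this setting the training error should rise and the validation error should fall toward the noise variance as n grows, so the gap between them closes. A large gap that persists would point to high variance, and two curves that converge at a high error would point to high bias.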

R code

R interpretation




Extra Reference
