Understanding Bias and Variance: The Two Main Sources of Error in Machine Learning
June 4, 2025

Machine learning algorithms make errors. These errors generally arise from two core sources: bias and variance. Understanding these concepts is crucial for improving your model’s performance and deciding when to add more data or adjust your model.
What Are Bias and Variance?
- Bias measures the error your model makes on the training data, reflecting how well it fits the data it has seen. High bias means the model is underfitting—too simple to capture the patterns in the training set.
- Variance measures how much the model’s performance worsens on new, unseen data compared to the training set. High variance indicates overfitting—your model fits the training data too closely but fails to generalize.
Real-World Examples
- If your training error is low (e.g., 1%) but dev error is high (e.g., 11%), your model has low bias and high variance. It overfits the training data and struggles to generalize.
- If training error and dev error are both high and close (e.g., 15% and 16%), your model has high bias and low variance. It underfits and cannot learn the underlying patterns well.
- Both high bias and high variance occur when training error is high and dev error is significantly worse (e.g., 15% vs. 30%).
- Low bias and low variance show up as low training and dev errors (e.g., 0.5% and 1%), indicating strong performance.
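To make these rules of thumb concrete, here is a minimal Python sketch that labels a model's regime from its training and dev errors. The `diagnose` helper and the 2% gap threshold are illustrative assumptions, not fixed rules:

```python
def diagnose(train_error, dev_error, threshold=0.02):
    """Label the bias/variance regime from training and dev errors.

    The 2% threshold is an illustrative assumption; choose one that
    matches the error scale of your own problem.
    """
    high_bias = train_error > threshold                     # poor fit to training data
    high_variance = (dev_error - train_error) > threshold   # poor generalization

    if high_bias and high_variance:
        return "high bias, high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "low bias, low variance"

# The four scenarios above:
print(diagnose(0.01, 0.11))   # high variance (overfitting)
print(diagnose(0.15, 0.16))   # high bias (underfitting)
print(diagnose(0.15, 0.30))   # high bias, high variance
print(diagnose(0.005, 0.01))  # low bias, low variance
```

Note that this simple check treats any training error above the threshold as bias; the next section refines that by accounting for the optimal error rate.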
The Optimal Error Rate
Every problem has an optimal error rate, also called the Bayes error rate. This represents the best achievable error, often due to inherent noise or ambiguity in the data.
- For instance, speech recognition systems may have an optimal error rate around 14% due to noisy audio clips that even humans struggle to interpret.
- Your avoidable bias is the amount by which your training error exceeds the optimal error rate (training error - optimal error rate).
- Variance is the gap between your dev error and your training error (dev error - training error).
Knowing the optimal error rate helps you decide whether to focus on reducing bias or variance.
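Putting those two definitions into code, here is a small sketch using the speech-recognition numbers above (the 14% optimal rate is the estimate from the example; the train and dev errors are hypothetical):

```python
optimal_error = 0.14   # estimated Bayes error, e.g., from human-level performance
train_error = 0.15     # hypothetical
dev_error = 0.30       # hypothetical

avoidable_bias = train_error - optimal_error  # 0.01 -> little room left on bias
variance = dev_error - train_error            # 0.15 -> generalization is the problem

focus = "variance" if variance > avoidable_bias else "bias"
print(f"avoidable bias = {avoidable_bias:.2f}, variance = {variance:.2f}; focus on {focus}")
```

Note how the 14% floor flips the diagnosis: a 15% training error looks like high bias in isolation, but relative to the optimal error rate almost all of it is unavoidable, so variance deserves the attention.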
How to Address Bias and Variance
- Reduce Bias (Underfitting):
  - Increase model complexity (more layers or neurons in neural networks).
  - Add or improve input features based on error analysis.
  - Reduce regularization (though this may increase variance).
  - Modify the model architecture to better fit the problem.
  - Note that adding more training data typically does not reduce bias.
- Reduce Variance (Overfitting):
  - Add more training data.
  - Use regularization techniques like L2, L1, or dropout (see the sketch after this list).
  - Employ early stopping during training.
  - Perform feature selection to remove irrelevant inputs (especially useful with limited data).
  - Decrease model size (use cautiously; regularization is usually preferred).
  - Adjust the model architecture.
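As referenced in the list above, here is a minimal sketch of three variance-reduction levers (L2 penalties, dropout, and early stopping) using Keras. The layer sizes, regularization strength, and dropout rate are illustrative assumptions, not recommended values:

```python
import tensorflow as tf

# A small classifier combining L2 weight penalties and dropout.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(1e-4)),  # L2 penalty on weights
    tf.keras.layers.Dropout(0.5),                            # dropout between layers
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping halts training once dev (validation) loss stops improving
# and restores the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# x_train, y_train, x_dev, y_dev are assumed to be defined elsewhere:
# model.fit(x_train, y_train, validation_data=(x_dev, y_dev),
#           epochs=100, callbacks=[early_stop])
```

Each lever can be tuned independently: raise the L2 coefficient or dropout rate if dev error still lags training error, and lower them if training error itself climbs too high.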
The Bias-Variance Tradeoff
Many changes to your model will improve bias but worsen variance, or vice versa. For example, increasing model size usually reduces bias but can increase variance unless regularization is applied.
In modern deep learning, abundant data and effective regularization weaken this tradeoff, letting you reduce bias (for example, by enlarging the network) without a large variance penalty.
The Role of Training Set Performance
Your model must perform well on the training data before it can generalize. Conduct error analysis on training examples to identify specific problems causing bias, such as noisy data or insufficient features.
Comparing your model’s performance with human-level accuracy can help estimate the optimal error rate and guide your improvement efforts.
Summary
- Analyze training and dev errors to estimate bias and variance.
- Use these estimates to decide whether to add data, increase model size, or apply regularization.
- Know the optimal error rate to set realistic expectations.
- Perform targeted error analysis to guide feature and architecture improvements.
- Use regularization and early stopping to balance bias and variance effectively.
Understanding and managing bias and variance will help you build models that not only fit your training data but also generalize well to new data, leading to better real-world performance.
Amr Abdelkarem