Regularization in Deep Learning
There are three common methods to avoid overfitting in deep learning:
- Regularization
- Dropout
- Early Stopping
In this blog I’m going to discuss only regularization; the other methods will be covered in another blog.
What is Regularization?
Regularization is a way to prevent overfitting by penalizing model complexity during training.
The two most common types of regularization used in training deep learning models are L1 and L2.
- The general cost function with regularization for training is defined as:
- Cost function = Loss + Regularization term
- Because of this regularization term, the numerical values of the weights decrease: large weights are penalized, and a neural network with smaller weights corresponds to a simpler model.
- This in turn helps to reduce overfitting (see the sketch after this list).
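As a rough illustration of this structure, here is a minimal Python sketch; the loss value, weights, λ, and penalty function are all placeholder assumptions, not values from any real model:

```python
import numpy as np

# A minimal sketch of the pattern above; the loss value, weights,
# penalty function, and lam are all placeholder assumptions.
def regularized_cost(loss, weights, lam, penalty):
    """Cost function = Loss + λ · Regularization term."""
    return loss + lam * sum(penalty(w) for w in weights)

weights = [np.array([0.5, -1.2, 3.0]), np.array([2.0])]
l2_penalty = lambda w: np.sum(w ** 2)  # one possible penalty: sum of squared weights

print(regularized_cost(loss=0.8, weights=weights, lam=0.01, penalty=l2_penalty))
# -> 0.8 + 0.01 * (0.25 + 1.44 + 9.0 + 4.0) = 0.9469
```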
Regularization: L1 & L2
L1 regularizer: Cost function = Loss + λ ∑|w|
- It penalizes the absolute values of the weights.
- It can drive some weights exactly to zero, which makes it useful for model compression (a sketch follows this list).
- λ is a regularization hyperparameter that controls the relative weight of the penalty.
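Below is a minimal PyTorch sketch of the L1 penalty; the toy model, random data, and λ value are assumptions made for illustration only:

```python
import torch
import torch.nn as nn

# A minimal sketch of the L1 penalty λ ∑|w|; the toy model, random data,
# and the λ value are placeholder assumptions.
model = nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)
lam = 0.01  # λ, the regularization hyperparameter

l1_term = sum(w.abs().sum() for w in model.parameters())  # ∑|w|
cost = nn.functional.mse_loss(model(x), y) + lam * l1_term
cost.backward()
# The L1 gradient has constant magnitude λ·sign(w), so repeated updates
# can drive some weights exactly to zero, producing a sparse model.
```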
L2 regularizer: Cost function = Loss + λ ∑w²
- It penalizes the squared values (the squared L2 norm) of the weights.
- It is also termed weight decay, since it pushes the weights toward zero; however, it does not always make them exactly zero (see the sketch below).
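As a hedged sketch, this is how the L2 penalty is commonly applied as weight decay in PyTorch, through the optimizer's weight_decay argument; the model, data, learning rate, and decay value are placeholders:

```python
import torch
import torch.nn as nn

# A minimal sketch of weight decay; in PyTorch the L2 penalty is usually
# applied via the optimizer's weight_decay argument. The toy model, data,
# learning rate, and decay value are placeholder assumptions.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()  # each step shrinks weights toward zero, but rarely exactly to zero
```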