Regularization in Deep Learning

Anjani Suman
Mar 1, 2021

There are three common methods to avoid overfitting in deep learning:

  1. Regularization
  2. Dropout
  3. Early Stopping

In this blog I’m going to discuss only regularization; I’ll cover the others in another blog.

What is Regularization?

Regularization is a way to prevent overfitting.

The two most common types of regularization used in training deep learning models are L1 and L2.

  • The general cost function with regularization for training is defined as:
  • Cost function = Loss + Regularization term
  • The regularization term pushes the numerical values of the weights down, on the assumption that a neural network with smaller weights corresponds to a simpler model.
  • This helps to reduce overfitting (see the sketch after this list).
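To make the pattern concrete, here is a minimal NumPy sketch of that cost function. The weight values, loss, and λ below are illustrative placeholders, not values from a real model:

```python
import numpy as np

# Illustrative values: in practice, `loss` comes from the network's
# forward pass and `weights` are the model's trainable parameters.
weights = np.array([0.5, -1.2, 0.03, 2.1])
loss = 0.85   # data loss (e.g., cross-entropy)
lam = 0.01    # regularization strength λ (a hyperparameter)

l1_term = lam * np.sum(np.abs(weights))  # L1 penalty: λ Σ|w|
l2_term = lam * np.sum(weights ** 2)     # L2 penalty: λ Σw²

cost_l1 = loss + l1_term  # Cost function = Loss + Regularization term
cost_l2 = loss + l2_term
print(cost_l1, cost_l2)
```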

Regularization: L1 & L2

L1 regularizer: Cost function = Loss + λ ∑|w|

  • It penalizes the absolute values of the weights.
  • It can drive some weights exactly to zero, which makes it useful for model compression.
  • λ is a regularization hyperparameter that controls the relative weight of the penalty against the loss (see the Keras sketch below).
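As a sketch of how this is wired up in practice, Keras lets you attach an L1 penalty to a layer’s weights via `kernel_regularizer`. The layer sizes and λ = 0.01 here are arbitrary choices for illustration:

```python
import tensorflow as tf

# λ = 0.01 is an arbitrary choice; tune it as a hyperparameter.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l1(0.01)),  # adds λ Σ|w| to the loss
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```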

L2 regularizer: Cost function = Loss + λ ∑w²

  • It penalizes the squared (second) norm of the weights.
  • It is also termed weight decay, because it pushes the weights toward zero; unlike L1, it does not usually make them exactly zero (a PyTorch sketch follows).
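Because L2 corresponds to weight decay, many frameworks expose it directly on the optimizer. A minimal PyTorch sketch, with a placeholder model and an arbitrary decay value:

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 10)  # placeholder model for illustration

# With SGD, weight_decay=λ adds λ·w to each parameter's gradient,
# which corresponds to an L2 penalty of (λ/2)·Σw² on the cost.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)
```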
