Regularization in Deep Learning

Anjani Suman
Mar 1, 2021

There are three common methods to avoid overfitting in deep learning:

  1. Regularization
  2. Dropout
  3. Early Stopping

In this blog I’m going to discuss only regularization; I’ll cover the others in another blog.

What is Regularization?

Regularization is a way to prevent overfitting.

The two most common types of regularization used in training deep learning models are L1 and L2.

  • The general cost function with regularization for training is defined as:
  • Cost function = Loss + Regularization term
  • The regularization term pushes the numerical values of the weights down, on the assumption that a neural network with smaller weights corresponds to a simpler model.
  • This helps to reduce overfitting (see the sketch after this list).
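To make the pattern concrete, here is a minimal NumPy sketch of that cost function. The weight values, loss, and λ below are illustrative placeholders, not values from a real model:

```python
import numpy as np

# Illustrative values: in practice, `loss` comes from the network's
# forward pass and `weights` are the model's trainable parameters.
weights = np.array([0.5, -1.2, 0.03, 2.1])
loss = 0.85   # data loss (e.g., cross-entropy)
lam = 0.01    # regularization strength λ (a hyperparameter)

l1_term = lam * np.sum(np.abs(weights))  # L1 penalty: λ Σ|w|
l2_term = lam * np.sum(weights ** 2)     # L2 penalty: λ Σw²

cost_l1 = loss + l1_term  # Cost function = Loss + Regularization term
cost_l2 = loss + l2_term
print(cost_l1, cost_l2)
```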

Regularization: L1 & L2

L1 regularizer: Cost function = Loss + λ ∑|w|

  • It penalizes the absolute values of the weights.
  • It can drive some weights exactly to zero, which makes it useful for model compression.
  • λ is a regularization hyperparameter that controls the relative weight of the penalty against the loss (see the Keras sketch below).
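As a sketch of how this is wired up in practice, Keras lets you attach an L1 penalty to a layer’s weights via `kernel_regularizer`. The layer sizes and λ = 0.01 here are arbitrary choices for illustration:

```python
import tensorflow as tf

# λ = 0.01 is an arbitrary choice; tune it as a hyperparameter.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l1(0.01)),  # adds λ Σ|w| to the loss
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```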

L2 regularizer: Cost function = Loss + λ ∑w²

  • It penalizes the squared (second) norm of the weights.
  • It is also termed weight decay, because it pushes the weights toward zero; unlike L1, it does not usually make them exactly zero (a PyTorch sketch follows).
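Because L2 corresponds to weight decay, many frameworks expose it directly on the optimizer. A minimal PyTorch sketch, with a placeholder model and an arbitrary decay value:

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 10)  # placeholder model for illustration

# With SGD, weight_decay=λ adds λ·w to each parameter's gradient,
# which corresponds to an L2 penalty of (λ/2)·Σw² on the cost.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)
```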
