
What is the Vanishing Gradient Problem in Deep Learning?

Vanishing gradient is a common problem in training deep neural networks, especially in very deep architectures. It makes it difficult for the model to learn from data during training.

What is Vanishing Gradient?

In deep learning, training happens through a method called backpropagation, where the model adjusts its weights using gradients (a kind of slope) of the loss function with respect to each weight. These gradients tell the model how much to change each weight to improve performance.

However, in deep neural networks (ones with many layers), the gradients can become very small as they are propagated backward through the layers. This is called vanishing gradient. As a result:

- Early layers (closer to the input) receive almost no updates.
- The network stops learning or learns very slowly.

A small numerical sketch after this excerpt illustrates the effect.

When Does Vanishing Gradient Happen?

- Very Deep Networks: the more layers, the more chance gradients will shrink as th...
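The shrinking effect is easy to see numerically. Below is a minimal NumPy sketch (the layer count, layer width, and weight scale are illustrative assumptions, not values from the post) that pushes a gradient backward through a stack of sigmoid layers; because the sigmoid derivative is at most 0.25, the gradient that reaches the early layers ends up orders of magnitude smaller than the one at the output.

```python
import numpy as np

# Minimal sketch (illustrative values, not from the post): a chain of
# sigmoid layers and the gradient magnitude that reaches each earlier
# layer during backpropagation.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers = 10   # assumed depth, just for illustration
width = 16      # assumed layer width
weights = [rng.normal(scale=0.5, size=(width, width)) for _ in range(n_layers)]

# Forward pass: store each layer's activation for use in the backward pass.
activations = [rng.normal(size=(1, width))]
for W in weights:
    activations.append(sigmoid(activations[-1] @ W))

# Backward pass: start with a unit gradient at the output and push it
# back through each layer's sigmoid derivative (at most 0.25) and weights.
grad = np.ones((1, width))
for layer in reversed(range(n_layers)):
    a = activations[layer + 1]  # output of this layer
    grad = (grad * a * (1.0 - a)) @ weights[layer].T
    print(f"gradient reaching layer {layer:2d}: mean |g| = {np.abs(grad).mean():.2e}")
```

Running this typically prints gradient magnitudes that drop by several orders of magnitude between the last and first layers, which is exactly why the early layers barely update.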