Posts

Showing posts with the label deep learning challenges

What is the Vanishing Gradient Problem in Deep Learning?

Vanishing gradient is a common problem in training deep neural networks, especially in very deep architectures. It makes it difficult for the model to learn from the data during training.

What is Vanishing Gradient?

In deep learning, training happens through a method called backpropagation, where the model adjusts its weights using gradients (a kind of slope) of the loss function with respect to each weight. These gradients tell the model how much to change each weight to improve performance. However, in deep neural networks with many layers, the gradients can get very small as they are propagated backward through the layers. This is called the vanishing gradient problem. As a result:

Early layers (closer to the input) receive almost no updates.
The network stops learning or learns very slowly.

When Does Vanishing Gradient Happen?

Very Deep Networks: The more layers, the more chance gradients will shrink as th...
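To make the shrinking concrete, here is a minimal PyTorch sketch (not from the original post; the depth, layer width, and dummy loss are illustrative assumptions) that stacks many sigmoid layers and prints the gradient norm of each layer's weights after one backward pass:

import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative choices: a stack of small sigmoid layers. Sigmoid's
# derivative is at most 0.25, so the repeated multiplications of
# backpropagation shrink gradients layer by layer.
depth = 20
layers = []
for _ in range(depth):
    layers += [nn.Linear(16, 16), nn.Sigmoid()]
model = nn.Sequential(*layers)

x = torch.randn(8, 16)           # dummy batch of inputs
loss = model(x).pow(2).mean()    # dummy loss, just to obtain gradients
loss.backward()

# Gradient norm per Linear layer, from closest-to-input to closest-to-output
for i, layer in enumerate(model):
    if isinstance(layer, nn.Linear):
        print(f"layer {i:2d}: weight grad norm = {layer.weight.grad.norm().item():.2e}")

Running this, the layers closest to the input typically report gradient norms several orders of magnitude smaller than the layers near the output, which is exactly the vanishing gradient effect described above.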

Overfitting vs Underfitting in Deep Learning: Key Differences

When training a deep learning model, you want it to learn patterns from the training data so it can make accurate predictions on new, unseen data. However, sometimes models learn too little or too much. This leads to underfitting or overfitting. Let's break them down in simple terms, backed by examples, visuals, and some light math.

1. What Is the Goal of Training a Model?

Imagine you're trying to teach a model to predict house prices based on features like size, location, and number of rooms. Your goal is to find a function f(x) that maps your input features x (like size and rooms) to a prediction ŷ (the house price), such that the prediction is close to the actual price y:

ŷ = f(x; θ)

The quality of the fit is measured by the mean squared error over the n training examples:

MSE = (1/n) ∑ᵢ (yᵢ − ŷᵢ)²

2. Underfitting

Underfitting happens when your model is too simple to capture the patterns in the data. It doesn't learn enough from the training data and performs poorl...
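As a quick illustration of the MSE formula and of the two failure modes, here is a minimal NumPy sketch (the quadratic data-generating function, noise level, and polynomial degrees are assumptions made for this example, not from the post) that fits polynomials of increasing degree and compares training and test error:

import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical "true" relationship the model should recover
    return 0.5 * x**2 - x + 2

# Noisy samples for training and for held-out testing
x_train = rng.uniform(-3, 3, 30)
y_train = true_fn(x_train) + rng.normal(0, 1.0, 30)
x_test = rng.uniform(-3, 3, 30)
y_test = true_fn(x_test) + rng.normal(0, 1.0, 30)

def mse(y, y_hat):
    # MSE = (1/n) * sum((y_i - y_hat_i)^2), as in the formula above
    return np.mean((y - y_hat) ** 2)

for degree in (1, 2, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = mse(y_train, np.polyval(coeffs, x_train))
    test_err = mse(y_test, np.polyval(coeffs, x_test))
    print(f"degree {degree:2d}: train MSE = {train_err:.2f}, test MSE = {test_err:.2f}")

With a degree-1 line the model underfits (high MSE on both sets); degree 2 matches the underlying curve; degree 10 typically drives the training MSE down while the test MSE climbs, which is overfitting.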