Vanishing Gradient Problem

Issue in deep networks where gradients become extremely small, preventing effective weight updates in early layers.