In Depth

During training, weights are updated through backpropagation: the gradient of the loss with respect to each weight flows backward through the network, and each weight is nudged in the direction that reduces the error. At inference, weights are frozen. Techniques such as weight sharing, pruning, and quantization shrink the memory and compute footprint of storing and running a large model's weights.
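
To make the training step concrete, here is a minimal sketch of one backpropagation update for a tiny one-neuron linear model with a squared-error loss. All names (the toy input, the learning rate `lr`, the hand-derived gradients) are illustrative assumptions, not anything specified in the text above:

```python
# One backpropagation update on a single-neuron linear model.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)          # one input example with 4 features
y_true = 1.0                    # its target value
w = rng.normal(size=4)          # trainable weights
b = 0.0                         # trainable bias
lr = 0.1                        # learning rate (step size)

# Forward pass: prediction and loss.
y_pred = w @ x + b
loss = 0.5 * (y_pred - y_true) ** 2

# Backward pass: gradients of the loss w.r.t. each parameter (chain rule).
dloss_dpred = y_pred - y_true   # d(loss)/d(y_pred)
grad_w = dloss_dpred * x        # d(loss)/d(w)
grad_b = dloss_dpred            # d(loss)/d(b)

# Update: nudge each weight opposite its gradient to reduce the error.
w -= lr * grad_w
b -= lr * grad_b
```

At inference, this update step is simply never run; `w` and `b` stay fixed while the forward pass is reused on new inputs.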
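
Quantization can likewise be illustrated with a short sketch. This assumes a simple symmetric post-training scheme (one scale factor per tensor, 8-bit integers); real systems use more elaborate variants, but the footprint saving is the same idea:

```python
# Symmetric int8 quantization of a weight tensor: ~4x smaller than float32.
import numpy as np

weights = np.random.default_rng(0).normal(size=1000).astype(np.float32)

scale = np.abs(weights).max() / 127.0                       # one scale for the tensor
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

dequant = q.astype(np.float32) * scale                      # approximate reconstruction
max_err = np.abs(weights - dequant).max()
print(f"int8 bytes: {q.nbytes}, float32 bytes: {weights.nbytes}, "
      f"max reconstruction error: {max_err:.5f}")
```

The stored model keeps only the int8 values and the scale; the small reconstruction error is the price paid for the reduced memory and compute cost.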