In Depth
Curriculum learning, inspired by how humans learn, organizes training data in a structured progression rather than presenting it in random order. The model first learns from simple, clear examples and gradually encounters more complex or ambiguous ones. This approach can lead to faster convergence and better final performance than randomly shuffled training.
The concept was formalized by Yoshua Bengio and colleagues in 2009, drawing parallels to educational curricula. 'Difficulty' can be defined in various ways: by data complexity, noise level, sequence length, or by using a separate model to score how hard each example is. Self-paced learning extends this by letting the model itself determine which examples it is ready for, typically by selecting examples whose current loss falls below a threshold that rises over training.
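A minimal sketch of these two ideas in plain Python. All names here are illustrative assumptions, not a standard API: difficulty is scored by sequence length, the eligible pool of examples grows on a linear "competence" schedule from easiest to hardest, and the self-paced variant is shown as a simple hard selection by loss threshold.

```python
import random

def difficulty(example):
    # Hypothetical difficulty score: here, just sequence length.
    return len(example)

def curriculum_batches(data, num_steps, batch_size, seed=0):
    """Yield batches whose sampling pool grows from easy to hard.

    At step t, only the easiest fraction of the data (a linear
    'competence' schedule from 20% to 100%) is eligible for sampling.
    """
    rng = random.Random(seed)
    ranked = sorted(data, key=difficulty)  # easiest examples first
    for step in range(num_steps):
        competence = min(1.0, 0.2 + 0.8 * step / max(1, num_steps - 1))
        pool = ranked[: max(batch_size, int(competence * len(ranked)))]
        yield [rng.choice(pool) for _ in range(batch_size)]

def self_paced_select(losses, lam):
    # Self-paced variant (hypothetical hard selection): keep only the
    # examples the model currently handles well, i.e. loss below lam;
    # lam is raised over training to admit harder examples.
    return [i for i, loss in enumerate(losses) if loss < lam]

# Toy usage: short token sequences, batches drawn easy-to-hard.
sentences = ["a b", "a b c d e", "a", "a b c", "a b c d e f g h"]
tokenized = [s.split() for s in sentences]
for batch in curriculum_batches(tokenized, num_steps=3, batch_size=2):
    print([len(ex) for ex in batch])
```

Early batches draw only from the shortest sequences; by the final step the full dataset is eligible. In a real training loop the difficulty score, pacing schedule, and threshold would be tuned to the task, and the pool would feed a framework's data loader rather than `random.choice`.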
Curriculum learning is particularly effective in scenarios with noisy data, class imbalance, or very complex tasks. In natural language processing, it might mean training on shorter sentences before longer ones; in reinforcement learning, it might mean starting with simpler environments before harder ones. The approach has seen renewed interest in the training of large language models, where data ordering and quality filtering during training can significantly affect final model capabilities.