In Depth
Cross-validation is a resampling technique used to assess how well a machine learning model will generalize to new, unseen data. The most common approach, k-fold cross-validation, divides the dataset into k equal parts (folds). The model is trained on k-1 folds and tested on the remaining fold, repeating this process k times so each fold serves as the test set exactly once.
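The fold rotation described above can be sketched in plain Python. This is a minimal illustration, not a production implementation; the function name `kfold_indices` is hypothetical, and in practice libraries such as scikit-learn provide ready-made splitters (e.g. `KFold`), often with shuffling of the data first.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each sample lands in exactly one test fold, so every fold serves
    as the test set exactly once; fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    # Distribute any remainder across the first n_samples % k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]          # held-out fold
        train_idx = indices[:start] + indices[start + size:]  # the other k-1 folds
        yield train_idx, test_idx
        start += size
```

A model would be trained on `train_idx` and scored on `test_idx` in each iteration, with the k scores averaged at the end.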
The primary benefit of cross-validation is that it provides a more reliable estimate of model performance than a single train-test split. By averaging results across multiple folds, it reduces the impact of lucky or unlucky data splits. It also helps detect overfitting: a model that performs well on training data but poorly across cross-validation folds is likely memorizing rather than learning generalizable patterns.
Common variants include stratified k-fold (preserving class proportions in each fold), leave-one-out (where k equals the number of samples), and time-series cross-validation (respecting temporal ordering so the model is never trained on data that comes after its test fold). Cross-validation is essential during model selection and hyperparameter tuning, since comparing configurations on averaged fold scores is far more reliable than comparing them on a single train-test split.
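The time-series variant mentioned above can be sketched as an expanding-window splitter in plain Python. The function name `time_series_splits` and the equal-width fold scheme are illustrative assumptions; scikit-learn's `TimeSeriesSplit` implements a similar idea.

```python
def time_series_splits(n_samples, n_splits):
    """Yield expanding-window (train_idx, test_idx) pairs.

    Each test fold comes strictly after its training window, so the
    model is never evaluated on data older than what it was fit on.
    """
    # Carve the series into n_splits + 1 equal blocks; the first block
    # is the initial training window, each later block is a test fold.
    fold = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_idx = list(range(0, i * fold))                 # everything so far
        test_idx = list(range(i * fold, (i + 1) * fold))     # the next block
        yield train_idx, test_idx
```

Unlike plain k-fold, earlier observations appear in several training windows but a test fold is never followed by training data, which is what "respecting temporal ordering" requires.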