In Depth
Chain-of-thought (CoT) reasoning elicits step-by-step thinking from language models, similar to how a student might show their work on a math problem. By including a phrase like "Let's think step by step" or providing examples with intermediate reasoning, the prompt leads the model to break a complex problem into manageable steps, dramatically improving accuracy on tasks requiring multi-step logic.
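As a minimal sketch of the zero-shot variant, the trigger phrase is simply appended to the question so the model emits its reasoning before the final answer. The function name and prompt layout here are illustrative, not from any particular library, and the actual model call is left out:

```python
# Zero-shot chain-of-thought: append a reasoning trigger to the task.
# The resulting string would be sent to any chat/completion API; that
# call is omitted here.

COT_TRIGGER = "Let's think step by step."

def build_zero_shot_cot_prompt(question: str) -> str:
    """Format a question so the model reasons aloud before answering."""
    return f"Q: {question}\nA: {COT_TRIGGER}"

prompt = build_zero_shot_cot_prompt(
    "A shop sells pens at 3 for $2. How much do 12 pens cost?"
)
print(prompt)
```

The key design point is that the trigger sits at the start of the answer slot, so the model's most likely continuation is intermediate reasoning rather than a bare answer.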
CoT prompting was demonstrated by Google researchers in 2022, who showed that prompting large models to reason through steps improved performance on arithmetic, commonsense, and symbolic reasoning tasks by large margins. The technique works because the model can use its intermediate outputs as a form of working memory, allocating more computation to harder problems. Both zero-shot CoT (just adding "Let's think step by step") and few-shot CoT (providing worked examples) are effective.
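The few-shot variant can be sketched similarly: each exemplar in the prompt includes its intermediate reasoning, so the model imitates the step-by-step format for the new question. The exemplar content and helper name below are hypothetical placeholders:

```python
# Few-shot chain-of-thought: exemplars carry worked reasoning, not just
# question/answer pairs, so the model continues in the same style.

EXEMPLARS = [
    {
        "question": "Roger has 5 balls and buys 2 cans of 3 balls each. "
                    "How many balls does he have?",
        "reasoning": "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
                     "5 + 6 = 11.",
        "answer": "11",
    },
]

def build_few_shot_cot_prompt(exemplars: list[dict], question: str) -> str:
    """Concatenate worked exemplars, then leave the answer slot open."""
    parts = [
        f"Q: {ex['question']}\nA: {ex['reasoning']} The answer is {ex['answer']}."
        for ex in exemplars
    ]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

print(build_few_shot_cot_prompt(EXEMPLARS, "How much do 12 pens at 3 for $2 cost?"))
```

Ending the prompt at "A:" leaves the model to produce both the reasoning and the final answer for the new question.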
Chain-of-thought reasoning has become a standard prompting technique and has influenced model training itself. Modern reasoning models are explicitly trained on reasoning traces, making CoT behavior intrinsic rather than prompt-dependent. Extensions such as self-consistency (sampling multiple reasoning paths and taking a majority vote on the answer) and tree-of-thought prompting (exploring branching reasoning paths) further improve reliability on complex reasoning tasks.