In Depth

Codex is a descendant of GPT-3 that was fine-tuned on billions of lines of publicly available source code. Released in 2021, it can understand natural language instructions and translate them into working code in over a dozen programming languages, with particular strength in Python, JavaScript, and TypeScript.

Codex powers GitHub Copilot, one of the most widely adopted AI coding assistants. It can generate entire functions from docstrings, suggest code completions, translate between programming languages, and explain existing code. Its ability to bridge natural language and code made AI-assisted programming mainstream.
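As a minimal sketch of how docstring-to-function generation worked at the API level: a function signature and docstring were framed as a completion prompt, and the model continued writing the body. The helper `build_codex_prompt` is a hypothetical illustration, and the model identifier `code-davinci-002` refers to the since-deprecated Codex API model; the commented request is not runnable as-is.

```python
# Sketch: framing a docstring as a Codex-style completion prompt.
# build_codex_prompt is an illustrative helper, not part of any library.

def build_codex_prompt(signature: str, docstring: str) -> str:
    """Combine a function signature and docstring into a prompt,
    leaving off at the point where the model would write the body."""
    return f'{signature}\n    """{docstring}"""\n'

prompt = build_codex_prompt(
    signature="def fibonacci(n: int) -> int:",
    docstring="Return the n-th Fibonacci number iteratively.",
)
print(prompt)

# A request to the legacy completions endpoint would then have looked
# roughly like this (requires an API key; "code-davinci-002" was Codex's
# API identifier and has since been deprecated):
#
#   openai.Completion.create(model="code-davinci-002", prompt=prompt,
#                            max_tokens=128, temperature=0, stop=["\ndef "])
```

Stopping the prompt right after the docstring is what lets a completion model act as a code generator: the most likely continuation of a signature plus docstring is the function body itself.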

While Codex as a standalone model has been superseded by more capable successors such as GPT-4, its legacy lives on in the practice of training code-specialized models. It demonstrated that fine-tuning a language model on domain-specific data could yield a powerful specialized tool, influencing the many code generation models that followed.