In Depth

Computer vision made its modern leap in 2012, when AlexNet won the ImageNet Large Scale Visual Recognition Challenge by a wide margin and demonstrated that deep convolutional neural networks (CNNs) trained end to end outperform hand-crafted feature pipelines. Vision transformers (ViT), introduced in 2020, later applied the transformer architecture to sequences of image patches, matching or exceeding CNN performance when pretrained on sufficiently large datasets. Today computer vision powers autonomous vehicles, medical imaging diagnostics, facial recognition, and manufacturing quality control.
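
To make the idea of "image patches" concrete, here is a minimal sketch of the first step a vision transformer performs: cutting an image into non-overlapping square patches and flattening each one into a vector. The function name `image_to_patches`, the nested-list image representation, and the toy 4x4 image are assumptions chosen for illustration, not part of any library. A real ViT would follow this step with a learned linear projection of each patch and added position embeddings, which are omitted here.

```python
def image_to_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Returns the patches in row-major order; each patch is a flat list of
    length patch_size * patch_size * C.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    # Append this pixel's channel values to the flat patch vector.
                    patch.extend(image[row][col])
            patches.append(patch)
    return patches


# Hypothetical toy input: a 4x4 image with 3 channels, where every channel
# of a pixel holds that pixel's row-major index.
image = [[[r * 4 + c] * 3 for c in range(4)] for r in range(4)]
patches = image_to_patches(image, patch_size=2)
print(len(patches), len(patches[0]))  # prints "4 12"
```

Each of the four resulting vectors plays the role that a word token plays in a text transformer. The sketch uses plain Python lists to stay dependency-free; in practice the same reshaping is usually done with array operations in a numerical library.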