In Depth

The attention score between two tokens is the dot product of one token's query vector with the other's key vector, scaled by the square root of the key dimension. A softmax over all positions turns these scores into a probability distribution, which is then used to take a weighted sum of the value vectors. Multi-head attention runs several such attention operations in parallel, each with its own learned query, key, and value projections, so different heads can capture different types of relationships. Because every token attends to every other token, full attention costs O(n²) in the sequence length n, which drives ongoing research into efficient attention approximations.
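
The sketch below is a minimal NumPy illustration of these steps, not a production implementation: the projection matrices are random stand-ins for learned weights, and the function and variable names (scaled_dot_product_attention, W_q, W_k, W_v, W_o) are chosen here for clarity. Note that the scores matrix has shape (n, n), which is exactly the quadratic cost mentioned above.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n, n) pairwise query-key scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over key positions
    return weights @ V                                 # weighted sum of value vectors

# Toy example: 4 tokens, 8-dimensional model, 2 heads with 4-dim projections.
rng = np.random.default_rng(0)
n, d_model, n_heads = 4, 8, 2
d_head = d_model // n_heads
x = rng.normal(size=(n, d_model))

# One projection triple (W_q, W_k, W_v) per head; random stand-ins for trained weights.
heads = []
for _ in range(n_heads):
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
    heads.append(scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v))

# Concatenate the head outputs and mix them with a final output projection.
W_o = rng.normal(size=(d_model, d_model))
output = np.concatenate(heads, axis=-1) @ W_o
print(output.shape)  # (4, 8)
```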