Throughput

Definition

The number of tokens an AI model can generate per second, which determines how fast it produces complete responses.

In Depth

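Higher throughput means faster responses: for a response of a given length, generation time is roughly the number of output tokens divided by the tokens generated per second. Cloud-hosted models on powerful hardware can generate 50-100+ tokens per second, while local models on consumer hardware may generate 10-30 tokens per second. Because it sets how long users wait and how long hardware is occupied per request, throughput is a key factor in user experience and cost per request.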
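To make the relationship between throughput and response time concrete, here is a minimal Python sketch. The function names and the example figures are illustrative assumptions, not part of any particular model's API, and the estimate ignores any delay before the first token appears.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate how long a response takes to generate.

    Assumes a constant generation rate and ignores startup latency.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def measured_throughput(num_tokens: int, elapsed_seconds: float) -> float:
    """Compute throughput (tokens per second) from an observed run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


# Hypothetical 500-token response at two illustrative rates.
local_time = generation_time_seconds(500, 20)    # 25.0 seconds
cloud_time = generation_time_seconds(500, 100)   # 5.0 seconds
```

Under these assumed numbers, the same 500-token response takes 25 seconds at 20 tokens per second and 5 seconds at 100 tokens per second, which is why throughput is so visible to users.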
