In Depth

Modern LLMs use sub-word tokenization schemes such as Byte-Pair Encoding (BPE) or SentencePiece, which trade off vocabulary size against sequence length. The number of tokens in a prompt and response drives computational cost, and the total must fit within the model's context window limit. A rough rule of thumb is that 1,000 tokens ≈ 750 English words, though this varies by language and tokenizer.
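To make the word-to-token ratio concrete, the sketch below counts tokens with the tiktoken library (an assumption here; the same idea applies to any BPE tokenizer) using its `cl100k_base` encoding. It is a minimal illustration, not a prescription for any particular model.

```python
# Minimal sketch: counting BPE tokens with tiktoken (pip install tiktoken).
# The encoding name "cl100k_base" is an assumed choice for illustration.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization splits text into sub-word units before the model sees it."
tokens = enc.encode(text)

# Compare token count to word count to see the ~1,000 tokens ≈ 750 words rule of thumb.
print(f"words:  {len(text.split())}")
print(f"tokens: {len(tokens)}")

# Decoding the token IDs recovers the original string.
assert enc.decode(tokens) == text
```

Running a check like this on your own prompts is the most reliable way to estimate cost and context usage, since the actual ratio depends on the tokenizer and the language of the text.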