AI Glossary — 50+ AI Terms Defined

AI Agent applications

An AI system that perceives its environment, makes decisions, and takes actions — often using tools like web search, code execution, or external APIs — to autonomously accomplish multi-step goals. Agents go beyond single-turn question answering to execute workflows.

AI Alignment ethics

The research field focused on ensuring that AI systems pursue goals and exhibit behaviors that are safe and beneficial to humans, even as they become more capable. Alignment asks: how do we build AI that reliably does what we actually want?

AI Audit safety

A systematic evaluation of an AI system's performance, fairness, safety, and compliance with standards, conducted to identify risks and verify responsible operation.

AI Bill of Rights safety

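A nonbinding framework, formally titled the Blueprint for an AI Bill of Rights, published by the White House Office of Science and Technology Policy in October 2022. It outlines five principles to protect Americans from potential harms of automated systems.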

AI Compute infrastructure

The total computational resources required for training and running AI models, encompassing hardware (GPUs, TPUs), cloud services, and the energy needed to power them.

AI Governance safety

The frameworks, policies, and practices that organizations use to manage AI systems responsibly, ensuring they are ethical, compliant, safe, and aligned with business objectives.

AI Orchestration concepts

The coordination and management of multiple AI models, tools, and data sources within a unified workflow to accomplish complex tasks efficiently.

AI Readiness business

An organization's level of preparedness to successfully adopt and benefit from AI, including data quality, technical infrastructure, talent, and cultural factors.

AI Risk Management safety

The systematic process of identifying, assessing, mitigating, and monitoring risks associated with AI systems throughout their lifecycle.

AI Watermarking safety

Techniques that embed hidden, detectable signals in AI-generated content to identify it as machine-created, supporting content authenticity and combating misinformation.

AI-as-a-Service business

A business model where AI capabilities are delivered through cloud APIs on a pay-per-use basis, allowing companies to add AI features without building models from scratch.

Activation Function concepts

A mathematical function applied to the outputs of a neural network's neurons at each layer that introduces non-linearity, enabling the network to learn complex patterns beyond simple linear relationships. Common examples include ReLU, sigmoid, and tanh.
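
As a rough illustration, two widely used activation functions can be written in a few lines. This is a minimal sketch in plain Python; the input values are made up for the example.

```python
import math

def relu(x: float) -> float:
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical pre-activation values produced by one layer.
pre_activations = [-2.0, 0.0, 3.0]
print([relu(z) for z in pre_activations])  # [0.0, 0.0, 3.0]
print([round(sigmoid(z), 3) for z in pre_activations])  # [0.119, 0.5, 0.953]
```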

Active Learning techniques

A machine learning approach where the model strategically selects which unlabeled examples would be most valuable to label, minimizing the total labeling effort needed.

Adversarial Attack safety

Deliberately crafted inputs designed to fool AI models into making incorrect predictions, often through imperceptible modifications that humans wouldn't notice.

Agentic AI applications

AI systems designed to operate with significant autonomy, planning and executing multi-step tasks over extended time horizons with minimal human intervention. Agentic AI represents a shift from AI as a tool to AI as an autonomous collaborator or worker.

Agentic Workflow concepts

An AI-powered process where language models autonomously plan, execute, and iterate on multi-step tasks, using tools and making decisions with minimal human intervention.

Algorithmic Fairness safety

The study and practice of ensuring AI systems make equitable decisions across different demographic groups, avoiding discrimination based on race, gender, age, or other protected characteristics.

Arctic models

Snowflake's open enterprise-grade large language model designed for complex SQL generation, coding tasks, and instruction following.

Artificial General Intelligence foundations

A hypothetical AI system capable of performing any intellectual task that a human can perform, across all domains, without task-specific training. No AGI system exists today; current AI models excel within defined domains but lack general-purpose cognitive flexibility.

Artificial Superintelligence foundations

A theoretical AI system that surpasses the cognitive ability of all humans combined across every domain, including scientific creativity, social reasoning, and strategic planning. ASI does not exist and may be decades or more away, if achievable at all.

Attention Head concepts

An individual attention computation unit within a transformer's multi-head attention mechanism, each learning to focus on different types of relationships in the input data.

Attention Mechanism foundations

A neural network component that allows a model to dynamically focus on different parts of its input when producing each output token, assigning learned relevance scores across the full context. Self-attention, used in transformers, computes these scores between all positions in the input simultaneously.
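
The relevance-score idea can be sketched as scaled dot-product attention in plain Python. This is a simplified illustration: it assumes the queries, keys, and values are the raw input vectors, whereas a real transformer first passes the input through learned projection matrices.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Relevance score of this query against every key position.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three toy "token" vectors attending to each other.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = self_attention(x, x, x)
```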

AutoML business

Automated machine learning — tools and techniques that automate the selection, configuration, and training of ML models, reducing the need for manual expert tuning. AutoML democratizes model development by enabling non-specialists to build effective models.

Autoencoder concepts

A neural network that learns to compress data into a compact representation and then reconstruct it, useful for dimensionality reduction, denoising, and feature learning.

Autonomous System applications

An AI-powered system capable of performing tasks and making decisions in the real world with minimal or no human intervention, such as self-driving vehicles or robotic systems.

Autoregressive Model models

An AI model that generates output one token at a time, with each new token conditioned on all previous tokens.
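
The generation loop can be sketched as follows. The function `toy_next_token` is a hypothetical stand-in for a trained model; it simply replays a fixed script, but it receives the whole prefix the way a real model would.

```python
def toy_next_token(prefix):
    """Hypothetical stand-in for a model: sees the full prefix, returns one token."""
    script = ["the", "cat", "sat", "<end>"]
    return script[len(prefix)] if len(prefix) < len(script) else "<end>"

def generate(max_tokens=10):
    tokens = []
    for _ in range(max_tokens):
        # Each step is conditioned on everything generated so far.
        next_token = toy_next_token(tokens)
        if next_token == "<end>":
            break
        # The new token becomes part of the input for the next step.
        tokens.append(next_token)
    return tokens

print(generate())  # ['the', 'cat', 'sat']
```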

BERT models

A language model developed by Google that understands text by reading it in both directions simultaneously, making it especially effective at understanding context and meaning.

Backpropagation techniques

The core algorithm used to train neural networks. It propagates the prediction error backward through the layers to calculate how much each parameter contributed to it, so that an optimizer can adjust the parameters accordingly.

Batch Normalization techniques

A technique that normalizes the inputs to each layer of a neural network during training, making models faster to train and more stable.
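
A minimal sketch of the normalization step for a single feature across one mini-batch. The `gamma` and `beta` parameters are learned during training in practice; here they are fixed defaults for illustration.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift it."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
```

After normalization the values have a mean of approximately zero and a variance of approximately one, regardless of the scale of the original inputs.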

Benchmark foundations

A standardized test or dataset used to measure and compare AI model performance on defined tasks. Benchmarks provide a common language for tracking progress and identifying strengths and weaknesses across models.

Bias (AI) ethics

In neural networks, a learnable offset added to a neuron's weighted input sum, allowing the model to fit data that doesn't pass through the origin. Separately, "bias" also refers to systematic errors or unfair skews in model outputs caused by imbalances in training data.

CLIP models

A model by OpenAI that learns to connect images and text by training on image-caption pairs from the internet.

CUDA infrastructure

NVIDIA's parallel computing platform and programming model that enables developers to use NVIDIA GPUs for general-purpose AI and scientific computing.

Chain-of-Thought techniques

A prompting technique where the model is instructed or shown examples of reasoning step-by-step before giving a final answer, improving performance on complex reasoning, math, and multi-step tasks. Chain-of-thought prompting essentially asks the model to "show its work."

Chain-of-Thought Reasoning techniques

A prompting technique that improves AI reasoning by encouraging the model to show its work through intermediate steps before arriving at a final answer.

Chinchilla Scaling concepts

A scaling analysis from DeepMind showing that, for a fixed compute budget, model size and the number of training tokens should be scaled up in roughly equal proportion, rather than favoring larger models trained on comparatively little data.

Claude models

A family of AI assistants built by Anthropic, designed with a focus on being helpful, harmless, and honest using Constitutional AI training methods.

Codex models

An OpenAI model specialized in understanding and generating source code across dozens of programming languages, powering tools like GitHub Copilot.

Collaborative Filtering techniques

A recommendation technique that predicts a user's preferences by finding patterns among many users' behaviors, based on the idea that people with similar tastes will like similar things.
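
A sketch of user-based collaborative filtering under simplifying assumptions: the ratings are invented, and similarity is the cosine over items both users have rated. Production systems typically use larger matrices and more robust similarity measures.

```python
import math

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "ana":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 5, "B": 5, "D": 4},
    "cruz": {"A": 1, "C": 5, "D": 2},
}

def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += sim
    return num / den if den else None

# Ana has not rated item D; the estimate leans toward Ben, whose tastes match hers.
estimate = predict("ana", "D")
```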

Command R models

Cohere's enterprise-focused language model optimized for retrieval-augmented generation and business applications with strong multilingual support.

Computer Vision foundations

The AI field concerned with teaching machines to interpret and understand visual information from images and video. Core tasks include image classification, object detection, semantic segmentation, and visual question answering.

Computer-Aided Diagnosis applications

AI systems that assist healthcare professionals in diagnosing diseases by analyzing medical images, lab results, and patient data to identify patterns and anomalies.

Constitutional AI ethics

An alignment technique where AI is trained to follow a set of principles (a constitution) that define helpful, harmless, and honest behavior.

Containerization infrastructure

A technology that packages applications and their dependencies into isolated, portable units called containers, ensuring consistent deployment across different computing environments.

Content Authentication safety

Technologies and standards that verify the origin and integrity of digital content, helping users determine whether media is authentic, AI-generated, or manipulated.

Context Window infrastructure

The maximum amount of text an AI model can process at one time, measured in tokens. It typically covers both the input (prompt and conversation history) and the generated output.

Contrastive Learning techniques

A self-supervised technique that teaches models to recognize similarities and differences by learning which data points are related and which are not.

Conversational AI applications

AI systems designed to engage in natural, human-like dialogue, including chatbots, virtual assistants, and voice-based interfaces.

Convolutional Neural Network concepts

A type of neural network designed to process grid-structured data like images by using filters that detect spatial patterns such as edges, textures, and shapes.

Cross-Validation techniques

A statistical method for evaluating model performance by training and testing on different subsets of data to ensure results are reliable and not due to chance.
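
One common form is k-fold cross-validation, sketched below. This version splits indices in order without shuffling, which is an assumption made to keep the example deterministic.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), len(test))
```

Each sample lands in the test set exactly once, so every data point contributes to the final performance estimate.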

Curriculum Learning techniques

A training strategy that presents examples to a model in a meaningful order, typically starting with easier examples and progressively increasing difficulty.

DALL-E models

OpenAI's image generation model that creates original images and art from natural language text descriptions.

DPO (Direct Preference Optimization) techniques

A training technique that aligns AI models to human preferences without needing a separate reward model, simplifying the RLHF process.

Data Augmentation techniques

A technique that artificially expands training datasets by creating modified versions of existing data, improving model performance without collecting new data.

Data Drift concepts

A gradual change in the statistical properties of input data over time that can degrade AI model performance because the real-world data no longer matches what the model was trained on.

Data Lakehouse infrastructure

A modern data architecture that combines the flexibility of data lakes with the management features of data warehouses, supporting both analytics and AI workloads.

Data Pipeline infrastructure

An automated system that collects, processes, transforms, and delivers data from various sources to storage or analytics systems, forming the foundation of AI data infrastructure.

Datasheet safety

A standardized document describing a dataset's composition, collection methodology, intended uses, and potential biases, promoting transparency in AI training data.

Deepfake ethics

Synthetic media — typically video or audio — in which a person's likeness or voice is convincingly replaced or fabricated using AI, often without their consent. Deepfakes pose serious risks for misinformation, fraud, and non-consensual intimate imagery.

Diffusion Model models

A generative AI model that learns to create data — most commonly images — by gradually reversing a noise-addition process. During training, the model learns to denoise progressively noisier versions of real images; at inference it starts from pure noise and iteratively refines a coherent output.

Digital Twin applications

A virtual replica of a physical object, process, or system that uses real-time data and AI to simulate, predict, and optimize real-world performance.

Document AI applications

AI systems that understand, extract, and process information from unstructured documents like contracts, invoices, reports, and forms.

Dropout techniques

A regularization technique that randomly deactivates a percentage of neurons during training to prevent overfitting and improve model generalization.
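
A minimal sketch of the commonly used "inverted" dropout variant, which rescales the surviving activations so their expected value is unchanged. It assumes `rate` is below 1.0; the `rng` parameter is added here only to make the example reproducible.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each value with probability `rate`, rescale survivors."""
    if not training or rate == 0.0:
        # At inference time dropout is disabled and values pass through unchanged.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

sample = dropout([1.0] * 1000, rate=0.5, rng=random.Random(0))
```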

EU AI Act safety

The European Union's comprehensive regulation for artificial intelligence, establishing a risk-based framework that imposes requirements on AI systems based on their potential for harm.

Edge AI applications

Running AI model inference directly on local devices — smartphones, cameras, sensors, vehicles — rather than sending data to cloud servers. Edge AI reduces latency, preserves privacy, and enables AI in offline or bandwidth-constrained environments.

Embedding foundations

A dense numerical vector representation of a word, sentence, image, or other data object that captures its semantic meaning in a high-dimensional space. Items with similar meanings have vectors that are close together by distance metrics like cosine similarity.
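
Cosine similarity between embeddings can be computed as below. The three-dimensional vectors are invented for illustration; real embeddings usually have hundreds or thousands of dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors standing in for model-produced embeddings.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```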

Embedding Model concepts

A specialized AI model that converts text, images, or other data into numerical vectors that capture semantic meaning, enabling similarity comparison and retrieval tasks.

Emergent Abilities concepts

Capabilities that appear unexpectedly in AI models once they reach a certain scale, performing tasks they were never explicitly trained to do.

Encoder-Decoder models

An AI architecture with two parts: an encoder that processes the input and a decoder that generates the output.

Explainable AI (XAI) safety

Methods and techniques that make AI system decisions understandable to humans, enabling users to know why a model produced a particular output.

FLOPS infrastructure

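Floating-point operations per second, a measure of computational performance used to quantify the processing power of AI hardware. The related term FLOPs (lowercase "s", floating-point operations) counts total operations and is used to express the computational cost of training a model.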

Falcon models

An open-source large language model developed by the Technology Innovation Institute in Abu Dhabi, known for its strong performance and permissive licensing.

Faster R-CNN models

A two-stage object detection model that first proposes candidate regions in an image and then classifies each region, offering high accuracy for detection tasks.

Feature Engineering techniques

The process of creating, selecting, and transforming input variables to improve a machine learning model's ability to learn patterns and make accurate predictions.

Feature Store infrastructure

A centralized platform for storing, managing, and serving machine learning features, ensuring consistency between training and production environments.

Federated Learning techniques

A distributed machine learning approach where model training happens locally on user devices, and only model weight updates — not raw data — are shared with a central server. Federated learning enables AI improvement while preserving user data privacy.

Few-Shot Learning techniques

The ability of a model to perform a new task correctly after seeing only a small number of examples in the prompt, without updating its weights. Few-shot learning is a key emergent capability of large pre-trained models.

Fine-Tuning techniques

The process of continuing training a pre-trained model on a smaller, task-specific dataset to adapt it to a particular domain or behavior. Fine-tuning requires far less compute than training from scratch because the model already has broad world knowledge.

Flash Attention techniques

A hardware-aware algorithm that computes attention much faster and with less memory by optimizing how data moves between GPU memory levels.

Flow Model concepts

A type of generative model that learns to transform simple distributions into complex data distributions through a series of invertible transformations.

Function Calling applications

A specific implementation of tool use where the AI model generates structured JSON to invoke a developer-defined function, enabling reliable integration between LLMs and application code. Function calling makes LLM outputs machine-parseable and action-ready.
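
On the application side, handling a function call might look like the sketch below. The `get_weather` function, the `TOOLS` registry, and the exact JSON shape are all assumptions for illustration; each provider defines its own schema.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical developer-defined function; a real one would call a weather API."""
    return f"Sunny in {city}"

# Registry of functions the application is willing to expose to the model.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse the model's JSON and invoke the named function with its arguments."""
    call = json.loads(model_output)
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"Unknown function: {call['name']}")
    return func(**call["arguments"])

# Example of what a model might emit (the shape is assumed, not a provider's spec).
model_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'
print(dispatch(model_output))  # Sunny in Lisbon
```

Looking the name up in an explicit registry, rather than evaluating the model's output directly, keeps the model limited to functions the developer has chosen to expose.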

GPT models

Generative Pre-trained Transformer — a family of large language models developed by OpenAI that generate coherent text by predicting subsequent tokens. GPT models popularized the pre-train-then-fine-tune paradigm that now dominates NLP research.

GPT-4 models

OpenAI's large multimodal model capable of processing both text and images, widely considered one of the most capable commercial AI systems available.

GPU infrastructure

Graphics Processing Units, originally designed for rendering graphics, now the primary hardware for training and running AI models due to their ability to perform many calculations simultaneously.

GRU concepts

A simplified variant of LSTM that uses fewer gates to capture sequential dependencies, offering similar performance with reduced computational cost.

Gemini models

Google DeepMind's multimodal AI model family designed to understand and generate text, images, code, audio, and video natively.

Generative Adversarial Network models

A deep learning framework consisting of two networks — a generator and a discriminator — that compete against each other, pushing the generator to produce increasingly realistic synthetic data. Introduced by Ian Goodfellow in 2014, GANs were the dominant generative image model before diffusion models.

Gradient Descent techniques

An optimization algorithm that iteratively adjusts model parameters in the direction that most reduces prediction errors, forming the basis of how AI models learn.
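
The update rule can be shown on a one-variable function. This sketch minimizes f(x) = (x - 3)^2, a toy objective chosen for illustration; real models apply the same step to millions or billions of parameters at once.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the objective."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))  # 3.0
```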

Granite models

IBM's family of enterprise AI models built for business use cases with strong focus on trust, transparency, and regulatory compliance.

GraphRAG techniques

An advanced retrieval approach that uses knowledge graphs to enhance retrieval-augmented generation, enabling AI to reason about relationships and connections between pieces of information.

Grok models

xAI's conversational AI model with real-time access to X (Twitter) data, designed to provide current information and engage with a direct communication style.

Grounding concepts

The process of connecting AI model outputs to verified, factual sources of information to reduce hallucination and ensure responses are based on real data.

Guardrails ethics

Technical mechanisms — filters, classifiers, constitutional rules, or system prompts — that constrain AI model outputs to be safe, appropriate, and within defined policy limits. Guardrails are a practical layer of safety on top of model-level training.

HBM (High Bandwidth Memory) infrastructure

A type of high-performance memory stacked vertically on or near the processor, providing the massive memory bandwidth that AI accelerators need for training and inference.

Hallucination ethics

When an AI model generates text that is factually incorrect, fabricated, or unsupported by its input, while presenting it with apparent confidence. Hallucination is a fundamental reliability challenge for large language models.

Hybrid Search techniques

A search approach that combines traditional keyword matching with AI-powered semantic search to leverage the strengths of both methods for more accurate results.

Hyperparameter Tuning techniques

The process of finding the optimal configuration settings for a machine learning model, such as learning rate and model size, that cannot be learned from data alone.

Image Classification applications

A computer vision task where an AI model assigns a category label to an entire image, such as identifying whether a photo contains a cat, dog, or car.

In-Context Learning techniques

The ability of large language models to learn new tasks from examples provided directly in the prompt, without any parameter updates or additional training.

Inference foundations

The process of running a trained AI model on new input data to produce a prediction or output, as opposed to training. Inference is what happens every time you send a message to a chatbot or submit an image for analysis.

InfiniBand infrastructure

A high-speed networking technology used to connect GPUs and servers in AI training clusters, providing the low-latency, high-bandwidth communication essential for distributed training.

Instruction Tuning techniques

A fine-tuning process that trains language models to follow natural language instructions, making them more useful and controllable as assistants.

Jailbreak safety

Techniques that attempt to bypass an AI model's safety restrictions and guidelines, tricking it into producing content it was designed to refuse.

KV Cache techniques

An inference optimization for transformer models that stores previously computed key and value tensors so they don't need to be recalculated for each new token generated, trading additional memory for faster generation.
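
The saving can be illustrated by counting how often keys and values are computed with and without a cache. The `project` function is a hypothetical placeholder for the real key/value projections, and strings stand in for tensors.

```python
def project(token):
    """Hypothetical stand-in for computing one token's key and value."""
    return (f"k({token})", f"v({token})")

def decode(tokens, use_cache):
    """Return (key_value_pairs, number_of_projection_calls)."""
    calls = 0
    cache = []
    for step, token in enumerate(tokens, start=1):
        if use_cache:
            # Only the newest token is projected; earlier results are reused.
            cache.append(project(token))
            calls += 1
        else:
            # Everything seen so far is recomputed from scratch at each step.
            cache = [project(t) for t in tokens[:step]]
            calls += step
    return cache, calls

tokens = ["a", "b", "c", "d"]
print(decode(tokens, use_cache=False)[1])  # 10
print(decode(tokens, use_cache=True)[1])   # 4
```

Both paths end with the same keys and values; the cached version reaches them with work that grows linearly rather than quadratically in sequence length.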

Knowledge Distillation techniques

A technique where a smaller 'student' model is trained to mimic the behavior of a larger 'teacher' model, creating a compact model that retains much of the original's capability.

Knowledge Graph foundations

A structured representation of entities and the relationships between them, stored as a graph of nodes (entities) and edges (relationships). Knowledge graphs encode factual world knowledge in a machine-queryable form that complements the statistical knowledge inside LLMs.
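
A knowledge graph can be sketched as a list of (subject, relation, object) triples with a simple pattern-matching query. The `query` helper and the relation names are illustrative; real systems typically use a graph database and a query language such as SPARQL or Cypher.

```python
# Each fact is a (subject, relation, object) triple, i.e. one edge in the graph.
triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "Physics"),
    ("Warsaw", "capital_of", "Poland"),
]

def query(subject=None, relation=None, obj=None):
    """Return every triple matching the fields that are not None."""
    return [
        (s, r, o)
        for s, r, o in triples
        if subject in (None, s) and relation in (None, r) and obj in (None, o)
    ]

# Two-hop question: the person's birthplace is the capital of which country?
birthplace = query(subject="Marie Curie", relation="born_in")[0][2]
country = query(subject=birthplace, relation="capital_of")[0][2]
print(birthplace, country)  # Warsaw Poland
```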

Kubernetes infrastructure

An open-source platform for automating the deployment, scaling, and management of containerized applications, widely used for orchestrating AI workloads at scale.

LSTM concepts

A specialized recurrent neural network architecture that uses gate mechanisms to effectively learn and remember long-range patterns in sequential data.

LangChain infrastructure

An open-source framework for building applications powered by language models, providing tools for chaining AI operations, managing memory, and integrating external data sources.

Large Language Model models

A type of AI model trained on vast amounts of text data to understand and generate human language. LLMs predict the next token in a sequence, a simple objective that gives rise to surprisingly broad reasoning and language capabilities.

Latency infrastructure

The time delay between sending a request to an AI model and receiving a response. For streaming models it is often reported as time to first token, alongside the total time to a complete response.

Llama models

Meta's family of open-weight large language models that can be freely downloaded, modified, and deployed by developers and businesses.

LlamaIndex infrastructure

A data framework for building LLM applications that specializes in connecting language models to diverse data sources through efficient indexing and retrieval.

LoRA (Low-Rank Adaptation) techniques

A parameter-efficient fine-tuning method that trains only a small number of additional parameters instead of the entire model.
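
The idea can be sketched as adding a low-rank product B·A to a frozen weight matrix W. This illustration uses plain Python lists and omits the scaling factor that LoRA implementations usually apply; the layer sizes are examples only.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (m x n) and b (n x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters for full fine-tuning versus a rank-r adapter."""
    return d_out * d_in, rank * (d_out + d_in)

def adapted_weight(w, b, a):
    """Effective weight W + B @ A; W stays frozen, only B and A are trained."""
    delta = matmul(b, a)
    return [[w[i][j] + delta[i][j] for j in range(len(w[0]))] for i in range(len(w))]

full, lora = lora_param_counts(4096, 4096, rank=8)
print(full, lora)  # 16777216 65536
```

For this example layer, the adapter trains 256 times fewer parameters than updating the full matrix.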

MLOps business

The discipline of applying DevOps principles — continuous integration, deployment, monitoring, and automation — to the machine learning lifecycle. MLOps bridges the gap between experimental data science and reliable production AI systems.

MMLU foundations

Massive Multitask Language Understanding — a benchmark covering 57 academic subjects from elementary mathematics to professional law and medicine, used to measure the breadth of world knowledge in language models. MMLU has become one of the most widely cited evaluation suites for LLM capability.

Machine Translation applications

AI systems that automatically translate text or speech from one natural language to another, enabling cross-language communication at scale.

Meta-Learning techniques

An approach where AI systems learn how to learn, enabling them to quickly adapt to new tasks with very few examples by leveraging experience from previous tasks.

Mistral models

A French AI company and its family of efficient open-weight language models known for strong performance relative to their size.

Mixture of Agents concepts

An AI architecture where multiple language models collaborate by iteratively refining each other's outputs, producing higher-quality responses than any single model alone.

Mixture of Experts (MoE) models

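An architecture where multiple specialized sub-networks (experts) handle different types of inputs, with a router deciding which experts to activate for each input, typically on a per-token basis, so that only a fraction of the model's parameters are used at a time.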
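
A sketch of top-k routing, one common way a router selects experts. The experts here are trivial functions and the router logits are hard-coded; in a real model both are learned networks and the logits are computed from the input itself.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}

def moe_layer(x, experts, router_logits, k=2):
    """Combine the outputs of only the selected experts."""
    gates = top_k_route(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())

# Four toy experts; only two of them run for this input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
output = moe_layer(3.0, experts, router_logits=[0.1, 2.0, 2.0, -1.0])
print(output)
```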

Model Card safety

A standardized document that describes a machine learning model's intended use, performance characteristics, limitations, and ethical considerations.

Model Context Protocol applications

An open standard developed by Anthropic that defines a universal interface for connecting AI models to external tools, data sources, and services. MCP replaces ad-hoc integrations with a consistent protocol any model or application can implement.

Model Distillation techniques

The process of training a smaller, faster AI model to replicate the behavior of a larger, more capable model.

Model Evaluation concepts

The systematic process of measuring an AI model's performance, reliability, fairness, and safety using benchmarks, test datasets, and human evaluation to determine fitness for deployment.

Model Serving infrastructure

The infrastructure and systems responsible for deploying trained AI models to handle real-time prediction requests from applications and users in production.

Multi-Agent System concepts

An AI architecture where multiple specialized AI agents collaborate, communicate, and coordinate to solve complex tasks that no single agent could handle alone.

Multimodal AI models

AI systems that can process and generate multiple types of data — such as text, images, audio, and video — within a single model. Multimodal models can answer questions about images, generate images from text, or transcribe and summarize audio.

Named Entity Recognition applications

An NLP technique that automatically identifies and classifies named entities in text, such as people, organizations, locations, dates, and monetary values.

Natural Language Processing foundations

The field of AI focused on enabling computers to understand, interpret, and generate human language. NLP encompasses tasks such as translation, summarization, sentiment analysis, question answering, and named-entity recognition.

Neural Architecture Search techniques

An automated approach to designing neural network architectures, using algorithms to discover optimal model structures instead of relying on human intuition.

Neural Network foundations

A computational system loosely inspired by the human brain, composed of layers of interconnected nodes (neurons) that learn to recognize patterns from data. Adjusting the connection weights during training allows the network to improve its predictions over time.

OCR (Optical Character Recognition) applications

AI that reads text from images, scanned documents, and photographs and converts it to editable digital text.

ONNX infrastructure

An open format for representing machine learning models that enables interoperability between different AI frameworks like PyTorch, TensorFlow, and specialized inference engines.

Object Detection applications

A computer vision task that identifies and locates multiple objects within an image, drawing bounding boxes around each detected object and classifying what it is.

Ontology foundations

A formal specification of concepts, categories, and the relationships between them within a domain, providing a shared vocabulary that machines can reason over. Ontologies are the structured knowledge layer beneath knowledge graphs.

PaLM models

Google's Pathways Language Model, a large-scale model that demonstrated breakthrough capabilities in reasoning, code generation, and multilingual tasks.

Parameter foundations

A learnable numerical value inside a neural network — typically a weight or bias — that is adjusted during training to minimize the model's prediction error. The count of parameters is the most common shorthand for model size.

Perplexity foundations

A statistical measure of how well a language model predicts a sample of text — lower perplexity means the model assigns higher probability to the actual next tokens and is therefore a better fit. Perplexity is a standard intrinsic evaluation metric for language models.
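
Perplexity is the exponential of the average negative log-probability the model assigns to the tokens that actually occurred. A minimal sketch, assuming the per-token probabilities are already available:

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-probability of the observed tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities a model assigned to each correct next token.
confident = perplexity([0.9, 0.8, 0.95])
unsure = perplexity([0.2, 0.1, 0.3])
print(confident < unsure)  # True
```

A model that assigns probability 0.25 to every correct token has a perplexity of 4, as if it were choosing uniformly among four options at each step.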

Perplexity (Metric) foundations

A measurement of how well a language model predicts text — lower perplexity means the model is more confident and accurate in its predictions.

Phi models

Microsoft's family of small language models that achieve surprisingly strong performance despite having far fewer parameters than larger competitors.

Pose Estimation applications

A computer vision technique that detects the position and orientation of a person's body parts in images or video, mapping out the skeleton of human poses.

Predictive Analytics applications

The use of data, statistical algorithms, and machine learning to identify the likelihood of future outcomes based on historical patterns.

Preference Optimization techniques

Training techniques that align AI model outputs with human preferences by learning from comparisons of better and worse responses rather than from absolute labels.

Prescriptive Analytics applications

An advanced form of analytics that goes beyond predicting outcomes to recommending specific actions and strategies to achieve desired results.

Prompt Engineering techniques

The practice of crafting input text to elicit desired outputs from a language model, without changing the model weights. Effective prompts can dramatically improve accuracy, tone, format, and reasoning quality.

Prompt Injection safety

A security vulnerability where an attacker crafts inputs that cause an AI system to ignore its original instructions and follow the attacker's instructions instead.

Prompt Template concepts

A structured, reusable prompt format with placeholder variables that standardizes how applications interact with language models, ensuring consistent and reliable outputs.
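
A simple template can be built with Python's standard `string.Template`. The placeholder names `doc_type`, `max_words`, and `text` are made up for this example.

```python
from string import Template

# Placeholders are marked with $ and filled in at call time.
SUMMARY_PROMPT = Template(
    "Summarize the following $doc_type in $max_words words or fewer:\n\n$text"
)

prompt = SUMMARY_PROMPT.substitute(
    doc_type="email",
    max_words=50,
    text="Hi team, the launch has moved to Friday.",
)
print(prompt)
```

`substitute` raises `KeyError` if a placeholder is left unfilled, which surfaces mistakes before an incomplete prompt reaches the model.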

Pruning techniques

A model compression technique that removes unnecessary parameters or connections from a neural network to make it smaller and faster without significantly hurting performance.

Quantization infrastructure

A technique that reduces the numerical precision of a model's weights, and sometimes its activations (e.g., from 32-bit floating point to 8-bit or 4-bit integers), to make it smaller and faster, usually with only a modest loss in quality.
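
A sketch of symmetric 8-bit quantization, one simple scheme among several. It uses 8 bits rather than 4 for readability, a single scale for the whole list, and invented weight values.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127] plus one scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in q_weights]

weights = [0.12, -0.5, 0.33, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)
print(restored)
```

The restored values differ from the originals by at most half a quantization step, which is the precision traded away for the smaller storage format.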

RLHF vs DPO techniques

Two competing approaches for aligning AI models with human preferences: RLHF uses a separate reward model and reinforcement learning, while DPO directly optimizes on preference data without a reward model.

Reasoning Model concepts

An AI model specifically designed or trained to perform complex logical reasoning, mathematical problem-solving, and multi-step analytical thinking.

Recommendation System applications

An AI system that predicts and suggests items, content, or actions that a user is likely to find relevant or interesting based on their behavior and preferences.

Recurrent Neural Network concepts

A neural network architecture designed to process sequential data by maintaining a hidden state that carries information from previous steps in the sequence.

Red Teaming ethics

A structured adversarial testing process where people or automated systems attempt to find failure modes, safety vulnerabilities, or policy violations in an AI model before it is deployed. Red teaming is a core practice in responsible AI development.

Reinforcement Learning techniques

A machine learning paradigm where an agent learns by taking actions in an environment and receiving reward or penalty signals based on the outcomes. Over many iterations the agent develops a policy that maximizes cumulative reward.
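
A very small instance of this loop is the multi-armed bandit, sketched below with an epsilon-greedy agent. The environment (fixed rewards per arm), the function name, and the parameter defaults are all assumptions made for illustration; full reinforcement learning problems also involve states and delayed rewards.

```python
import random


def run_bandit(
    rewards: list[float], steps: int = 100, epsilon: float = 0.1, seed: int = 0
) -> tuple[list[float], float]:
    """Epsilon-greedy agent on a bandit whose arms pay fixed rewards."""
    rng = random.Random(seed)
    n_arms = len(rewards)
    values = [0.0] * n_arms  # running estimate of each arm's reward
    counts = [0] * n_arms
    total = 0.0
    for t in range(steps):
        if t < n_arms:
            arm = t  # try every arm once before acting greedily
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = rewards[arm]
        counts[arm] += 1
        # Incremental mean update of the value estimate.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return values, total


values, total = run_bandit([0.0, 1.0])
print(values, total)
```

The agent's policy here is simply "pick the arm with the highest estimated value most of the time", and the estimates are what improve with experience.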

Reinforcement Learning from Human Feedback techniques

A training method where human raters compare model outputs and their preferences are used to train a reward model, which then guides further reinforcement learning. RLHF aligns model behavior with human values and instructions more effectively than supervised learning alone.

Reranking techniques

A technique that uses a specialized AI model to re-score and reorder search results for better relevance, typically applied after initial retrieval to improve result quality.

Residual Network concepts

A neural network architecture that uses shortcut connections to skip over layers, enabling the training of much deeper networks without degradation.
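
The shortcut connection amounts to adding a block's input back to its output, `y = x + F(x)`. The sketch below shows this with plain Python lists; `residual_block` and the toy layers are hypothetical stand-ins for real network layers.

```python
from typing import Callable


def residual_block(
    x: list[float], layer: Callable[[list[float]], list[float]]
) -> list[float]:
    """Return x + layer(x), element by element."""
    fx = layer(x)
    if len(fx) != len(x):
        raise ValueError("layer must preserve the input size")
    return [xi + fi for xi, fi in zip(x, fx)]


def zero_layer(v: list[float]) -> list[float]:
    """A layer that has learned nothing yet and outputs zeros."""
    return [0.0 for _ in v]


def double_layer(v: list[float]) -> list[float]:
    """A toy layer that doubles its input."""
    return [2.0 * a for a in v]


print(residual_block([1.0, -2.0], zero_layer))
print(residual_block([1.0, 2.0], double_layer))
```

With the zero layer, the block passes its input through unchanged, which is why stacking many such blocks does not have to make the network worse.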

Responsible AI safety

An approach to developing and deploying AI that prioritizes ethical principles, fairness, transparency, privacy, safety, and positive societal impact throughout the AI lifecycle.

Retrieval techniques

The process of fetching relevant documents, passages, or data from an external store in response to a query, used to ground AI model responses in specific, up-to-date information. Retrieval is the first stage of RAG pipelines and semantic search systems.

Retrieval-Augmented Generation techniques

A technique that combines a language model with a live retrieval system, fetching relevant documents from an external knowledge base before generating a response. RAG grounds LLM outputs in up-to-date, verifiable facts rather than relying solely on trained parameters.
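
The retrieve-then-generate flow can be sketched without any model at all. In the example below, retrieval is a naive word-overlap ranking (a stand-in for a real embedding or keyword index), and the "generation" step stops at building the prompt that would be sent to a language model. All names and the sample documents are hypothetical.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many words they share with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    "The office closes at 6 pm on weekdays.",
    "Parking is free for visitors.",
    "The cafeteria serves lunch from noon.",
]
prompt = build_rag_prompt("When does the office close?", docs, k=1)
print(prompt)
```

Because the answer is drawn from the retrieved passage, updating the document store changes the system's answers without retraining the model.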

Robustness safety

The ability of an AI model to maintain reliable performance when faced with unexpected inputs, adversarial attacks, data shifts, or operating conditions outside its training distribution.

Scaling Laws concepts

Empirically discovered relationships between model size, training data, compute budget, and AI model performance that help predict how capabilities improve with scale.

Self-Supervised Learning techniques

A training approach where models learn from unlabeled data by creating their own training signals, such as predicting hidden parts of the input.

Semantic Search techniques

A search approach that understands the meaning and intent behind queries rather than just matching keywords, using AI embeddings to find conceptually relevant results.
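
A minimal sketch of the ranking step follows. The three-dimensional vectors are made up by hand for illustration; in a real system they would come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings for three help-center articles.
index = {
    "How to reset my password": [0.9, 0.1, 0.0],
    "Recover a locked account": [0.6, 0.6, 0.1],
    "Store opening hours": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the query "forgot my login".
query_vec = [0.85, 0.2, 0.05]

ranked = sorted(index, key=lambda title: cosine(query_vec, index[title]), reverse=True)
print(ranked)
```

The query shares no keywords with the top result, yet it ranks first because the vectors point in a similar direction.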

Semantic Segmentation applications

A computer vision task that classifies every pixel in an image into a category, creating a detailed map of what each part of the image represents.

Semi-Supervised Learning techniques

A training approach that combines a small amount of labeled data with a large amount of unlabeled data to improve learning efficiency and model performance.

Sentiment Analysis applications

An AI technique that automatically determines the emotional tone or opinion expressed in text, classifying it as positive, negative, or neutral.

Sora models

OpenAI's AI model that generates realistic and imaginative video clips from text descriptions, representing a major advance in video generation.

Sparse Attention techniques

A modified attention mechanism that processes only a subset of input positions instead of all pairs, dramatically reducing the computational cost of handling long sequences.
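
One widely used sparse pattern, chosen here as an example among several, is a sliding window in which each position attends only to its near neighbors. The sketch below builds such a mask and counts how many position pairs remain; the function name and sizes are hypothetical.

```python
def sliding_window_mask(n: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j."""
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]


n = 8
mask = sliding_window_mask(n, window=1)
sparse_pairs = sum(sum(row) for row in mask)
full_pairs = n * n
print(sparse_pairs, full_pairs)
```

For a fixed window, the number of attended pairs grows linearly with sequence length instead of quadratically, which is where the savings on long sequences come from.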

Sparse Model concepts

A neural network where most parameters are zero or inactive for any given input, reducing computation while maintaining model capacity through selective activation.

Speculative Decoding techniques

An inference acceleration technique that uses a smaller, faster model to draft multiple tokens that a larger model then verifies in parallel, speeding up text generation.
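
The sketch below shows the draft-then-verify loop in its simplest greedy form, with both "models" replaced by toy functions that return the next token for a given context. These stand-ins, the function names, and the sample sentences are assumptions; production systems verify all drafted tokens in one batched forward pass and usually apply a probabilistic acceptance rule instead of exact matching.

```python
from typing import Callable

NextToken = Callable[[list[str]], str]


def speculative_step(
    prefix: list[str], draft_next: NextToken, target_next: NextToken, k: int = 4
) -> list[str]:
    """Draft k tokens, then keep the longest prefix the target model agrees with."""
    # 1. The small model drafts k tokens one after another.
    proposal: list[str] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # 2. The large model checks each drafted position (simulated sequentially here).
    accepted: list[str] = []
    ctx = list(prefix)
    for tok in proposal:
        expected = target_next(ctx)
        if tok != expected:
            accepted.append(expected)  # replace the first wrong token and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


TARGET = ["the", "cat", "sat", "on", "the", "mat"]
DRAFT = ["the", "cat", "sat", "in", "the", "mat"]


def target_next(ctx: list[str]) -> str:
    return TARGET[len(ctx)] if len(ctx) < len(TARGET) else "<eos>"


def draft_next(ctx: list[str]) -> str:
    return DRAFT[len(ctx)] if len(ctx) < len(DRAFT) else "<eos>"


print(speculative_step([], draft_next, target_next, k=4))
```

The output always matches what the large model would have produced on its own; the speedup comes from accepting several tokens per verification step whenever the draft is right.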

Speech Recognition applications

AI technology that converts spoken language into written text, enabling voice-controlled interfaces, transcription services, and voice-based search.

Speech to Text (STT) applications

AI that converts spoken audio into written text, also called automatic speech recognition (ASR).

Stable Diffusion models

An open-source image generation model that creates images from text descriptions using a diffusion process.

Synthetic Data techniques

Artificially generated data that mimics the statistical properties of real-world data, used to train or evaluate AI models when real data is scarce, sensitive, or imbalanced. Synthetic data is increasingly used to bootstrap model training and augment edge cases.

Synthetic Data Generation techniques

The process of using AI models to create artificial training data that mimics the statistical properties of real data, used when real data is scarce, expensive, or privacy-restricted.

System Prompt techniques

Hidden instructions given to an AI model that define its behavior, personality, and rules before the user starts interacting.

TOPS infrastructure

Tera Operations Per Second, a performance metric commonly used for AI inference chips, measuring how many trillion integer or mixed-precision operations the hardware can perform per second.

TPU infrastructure

Tensor Processing Unit, Google's custom-designed AI accelerator chip, optimized specifically for machine learning workloads and used to train many of Google's largest models.

Temperature techniques

A setting that controls how random or creative an AI model's responses are. Many APIs accept values from 0 (nearly deterministic) to 2 (highly random), although the exact range varies by provider.
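
Mechanically, temperature divides the model's raw scores (logits) before they are turned into probabilities. The sketch below assumes a plain softmax over three made-up logits; the function name is hypothetical, and a temperature of exactly 0 is rejected here because implementations usually handle it separately as "always pick the top token".

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print(cold, hot)
```

Low temperature concentrates probability on the top-scoring token, while high temperature flattens the distribution so less likely tokens are sampled more often.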

TensorRT infrastructure

NVIDIA's high-performance inference optimizer and runtime that dramatically accelerates AI model execution on NVIDIA GPUs through graph optimization and precision calibration.

Text Classification applications

An NLP task that automatically assigns predefined categories or labels to text documents, enabling organization and routing of content at scale.

Text to Speech (TTS) applications

AI that converts written text into spoken audio with natural-sounding human voices.

Throughput infrastructure

The number of tokens an AI model can generate per second, determining how fast it produces complete responses.

Token foundations

The basic unit of text that a language model processes — typically a word, sub-word, or character, depending on the tokenizer. Models read and generate sequences of tokens rather than raw characters or words.

Tokenizer infrastructure

The component that breaks text into tokens (subword units) that the AI model processes.
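
As a rough illustration, the sketch below splits a word by greedy longest-match against a tiny vocabulary. Both the vocabulary and the matching rule are assumptions made for the example; production tokenizers learn their vocabularies from data with algorithms such as byte-pair encoding.

```python
def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Split text into the longest vocabulary pieces, scanning left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


vocab = {"token", "iz", "er", "s", "un", "believ", "able"}
print(tokenize("tokenizers", vocab))
print(tokenize("unbelievable", vocab))
```

Rare or unseen words are still representable because they break down into smaller known pieces, at the cost of using more tokens.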

Tokenomics (AI Pricing) business

The pricing model used by AI API providers that charges based on the number of tokens (text units) processed, determining the cost of using language model services.
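
The arithmetic is straightforward, as the sketch below shows. It assumes separate per-million-token prices for input and output, and the prices used are invented for the example, not taken from any provider's price list.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one API call given per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: 1.00 per million input tokens, 4.00 per million output tokens.
cost = request_cost(
    2_000, 500, input_price_per_million=1.00, output_price_per_million=4.00
)
print(f"{cost:.4f}")
```

Multiplying the per-request cost by expected request volume gives a first estimate of a monthly bill, which is why prompt length and response length both matter for budgeting.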

Tool Use applications

The ability of an AI model to call external functions, APIs, or services during inference, extending its capabilities beyond what is encoded in its weights alone. Tool use allows models to retrieve current information, run code, query databases, and interact with the world.

Training foundations

The process of adjusting a model's parameters by exposing it to data and minimizing a loss function using gradient-based optimization. Training is the computationally intensive phase that produces a model capable of making useful predictions.

Transfer Learning techniques

A machine learning approach where a model trained on one task or dataset is adapted for a different but related task, leveraging knowledge already encoded in its weights. Transfer learning dramatically reduces the data and compute needed for specialized applications.

Transformer foundations

A neural network architecture that uses self-attention mechanisms to process sequences of data in parallel, enabling highly efficient training at scale. It is the foundation of virtually all modern large language models.

Tree of Thought techniques

A problem-solving framework where an AI model explores multiple reasoning paths simultaneously, evaluating and pruning branches to find the best solution.

Triton Inference Server infrastructure

NVIDIA's open-source inference serving platform that supports multiple model frameworks and provides features like dynamic batching, model versioning, and ensemble execution.

U-Net concepts

A neural network architecture originally designed for medical image segmentation, featuring a symmetric encoder-decoder structure with skip connections between corresponding layers.

Variational Autoencoder concepts

A generative model that learns a probabilistic latent space from data, enabling it to generate new, realistic data samples by sampling from this learned distribution.

Vector Database applications

A database optimized for storing and querying high-dimensional embedding vectors, enabling fast approximate nearest-neighbor search at scale. Vector databases are the storage backbone of most RAG and semantic search systems.

Vision Transformer concepts

An adaptation of the transformer architecture for computer vision that processes images as sequences of patches, achieving state-of-the-art results on image tasks.

Voice Cloning applications

AI technology that replicates a specific person's voice from a small sample of their speech, enabling text-to-speech in that voice.

Weight foundations

A specific type of model parameter representing the strength of the connection between two neurons in a neural network. Weights determine how strongly one neuron's output influences the next layer's computation.

Whisper models

OpenAI's open-source automatic speech recognition model that can transcribe and translate audio across dozens of languages with high accuracy.

World Model concepts

An AI system that builds an internal representation of how the physical or virtual world works, enabling it to predict outcomes, plan actions, and reason about cause and effect.

YOLO models

A family of real-time object detection models, short for You Only Look Once, that identify and locate multiple objects in an image in a single forward pass and are known for their speed.

Zero-Shot Learning techniques

A model's ability to perform a task it has never seen examples of, relying entirely on instructions and pre-trained knowledge. Zero-shot capability indicates that a model has learned robust enough representations to generalize to novel problem types.

vLLM infrastructure

A high-throughput, memory-efficient inference engine specifically designed for serving large language models, featuring PagedAttention for optimal GPU memory management.