AI Glossary: Key Terms Demystified

Navigate the world of AI with confidence. Use our interactive glossary to search and understand essential terminology. Can't find a term? Let us know!

AI Glossary: Quick FAQ

How do I find a term fast?

Use the search bar at the top of the glossary and scan related terms to jump between connected concepts.

Are these definitions technical or beginner‑friendly?

Both. We keep definitions concise and precise, with external references for deeper dives.

Can I link directly to a term?

Yes. Every term has a stable anchor like #glossary-llm that you can share or bookmark.

What if a term is missing?

Tell us; this glossary grows with the AI ecosystem. We add new entries and synonyms as the field evolves.

Browse AI Terms

Clear definitions, zero fluff. Search, filter by category or difficulty, and link concepts fast, so teams share the same language and ship with confidence.

3D Reconstruction

Fundamentals
Advanced
Techniques that recover a 3D shape or scene from 2D images or video, often using multi‑view geometry, NeRFs, or depth estimation.

A/B Testing

Evaluation
Intermediate
Controlled experiments comparing variants to measure impact on quality, conversion, or engagement.

Activation Functions

Fundamentals
Intermediate
Mathematical functions that determine whether a neuron should be activated (fire) based on its input. Common examples: ReLU, Sigmoid, Tanh, GELU.

AdaDelta

Training
Advanced
Adaptive learning rate optimizer that refines AdaGrad by limiting aggressive decay via running averages.

AdaGrad

Training
Advanced
Optimizer that adapts learning rates per parameter based on historical squared gradients, aiding sparse features.

Adam Optimizer

Training
Advanced
Adaptive moment estimation optimizer combining momentum and RMSProp‑like variance scaling; widely used in deep learning.

AdamW

Training
Advanced
A widely used optimizer for training transformers. AdamW decouples weight decay from gradient-based updates, often improving generalization.

Adapters

Training
Advanced
Lightweight trainable layers inserted into a frozen model to adapt it to new tasks without updating all parameters.

Related terms:

Fine‑tuning, LoRA, PEFT

Adjusted R‑Squared

Evaluation
Advanced
Regression metric adjusting R² for the number of predictors; penalizes overfitting and aids model comparison.
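
For reference, with n observations and p predictors, a common formulation is:

\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}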

Adversarial Attack

Safety
Advanced
Intentionally crafted inputs designed to fool AI models into making mistakes or producing unintended outputs. Examples include adding imperceptible noise to images to cause misclassification, or prompt injections in LLMs. Important for testing model robustness.

Related terms:

Jailbreaking, Prompt Injection, AI Safety

Adversarial Attacks

Safety
Advanced
Intentional manipulations of AI model inputs to cause incorrect outputs. These attacks exploit model vulnerabilities and are critical for AI security research.

Adversarial Evaluation

Evaluation
Advanced
Evaluating models with stress tests and adversarial prompts/inputs to find failure modes (safety, jailbreaks, hallucinations, tool misuse) before production.

Adversarial Examples

Safety
Advanced
Carefully crafted inputs designed to fool AI models into making wrong predictions. A single pixel change can cause a model to misclassify images.

Affordance Learning

Training
Advanced
Learning actionable possibilities of objects or environments (what actions are feasible), often for robotics.

Agent (AI Agent)

LLM
Intermediate
An autonomous system that can perceive its environment, process information, and take actions to achieve specific goals. In the context of LLMs, an agent can use tools (like web search or APIs) to gather information and perform tasks.

Agent‑Based Modeling

Fundamentals
Advanced
Simulation approach where individual agents with simple rules interact to produce emergent system behavior.

Agentic AI

LLM
Advanced
AI systems that can autonomously plan, make decisions, and take actions to achieve goals, often using multiple tools and iterating based on feedback. Unlike simple chatbots, agentic AI can break down complex tasks, use external tools (APIs, databases), and adapt its strategy. Examples include autonomous research assistants and coding agents.

AI Bill of Rights (US Blueprint)

Business
Intermediate
A White House blueprint outlining principles to protect the public in automated systems: safe/effective systems, protection against discrimination, data privacy, notice/explanation, and human alternatives.

AI Legislation

Business
Intermediate
Laws and regulations governing AI development and use (e.g., EU AI Act, data protection laws, sector rules).

AI Watermarking

Safety
Intermediate
Alias for watermarking signals embedded in AI‑generated outputs to indicate AI origin.

Related terms:

Watermarking (AI Outputs), Safety (AI Safety), Governance

AI‑as‑a‑Service

Business
Beginner
Cloud delivery model where AI capabilities (APIs, hosted models) are provided on demand without managing infrastructure.

Alan Turing

Fundamentals
Beginner
Pioneer of computer science and AI; proposed the Turing Test and foundational ideas in computation.

Algorithm

Fundamentals
Beginner
A set of rules or instructions given to an AI or computer system to help it learn, solve problems, or make decisions. It is a foundational concept in computer science. Learn more on Wikipedia.

ALiBi (Attention with Linear Biases)

Fundamentals
Advanced
A positional bias method enabling better extrapolation to longer contexts by adding linear biases to attention scores.

Related terms:

Positional Encoding, RoPE, Long‑context Models

Alignment (AI Alignment)

Safety
Advanced
Designing AI systems so their behavior is consistent with human values, safety constraints, and intended goals. In practice, this includes policy design, reinforcement learning from human feedback (RLHF), and rigorous evaluations.

Related terms:

Safety, Guardrails (AI), Evaluation

Allowlist

Safety
Intermediate
A list of explicitly permitted items (e.g., allowed tools, domains, actions). All other items are denied by default. Often used to reduce tool misuse and data exfiltration.

Andrew Ng

Fundamentals
Beginner
AI researcher and educator known for online ML courses and industrial AI adoption frameworks.

ANN

Fundamentals
Advanced
Alias for Approximate Nearest Neighbor search.

Anomaly Detection (Security)

Safety
Intermediate
Identifying outliers that may indicate intrusions, fraud, or system compromise using statistical or ML methods.

API (Application Programming Interface)

API
Beginner
A way for different software programs to communicate with each other. Many AI tools offer APIs so developers can integrate AI capabilities into their own applications.

API Key

API
Beginner
A unique authentication token that identifies and authorizes your application to use an AI service's API. API keys should be kept secret and never exposed in client-side code. They're used to track usage, enforce rate limits, and bill for API calls.

APM (Application Performance Monitoring)

Performance
Intermediate
Monitoring of application performance and dependencies using traces, metrics, and logs to diagnose latency and errors.

Approximate Nearest Neighbor (ANN)

Fundamentals
Advanced
A family of algorithms and indexes (e.g., HNSW, IVF) used to rapidly find vectors that are most similar to a query vector in high‑dimensional spaces. Core to fast vector search in embedding‑based applications.

Artificial General Intelligence (AGI)

Fundamentals
Intermediate
A hypothetical type of AI that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks at a human-like level. This is distinct from current AI, which is typically specialized (Narrow AI).

Artificial Intelligence (AI)

Fundamentals
Beginner
The broad field of computer science focused on creating machines and software that can perform tasks typically requiring human intelligence, such as learning, problem-solving, speech recognition, and decision-making.

Attention Mask

LLM
Advanced
A mask applied during attention to block or down-weight certain positions (e.g., padding tokens, future tokens in causal masking).

Attention Mechanism

LLM
Advanced
A key component in Transformer models (like those used in LLMs) that allows the model to weigh the importance of different parts of the input sequence when processing information, crucial for understanding context and long-range dependencies.

Audit Logging

Safety
Intermediate
Tamper-resistant logs capturing who did what and when (requests, tool calls, data access). Critical for compliance, incident response, and forensic analysis.

Automatic Speech Recognition (ASR)

Fundamentals
Intermediate
Technology that enables computers to convert spoken language into written text. It's the core of voice assistants and dictation software.

AWQ

Performance
Advanced
Alias for Activation‑aware Weight Quantization.

AWQ (Activation-aware Weight Quantization)

Performance
Advanced
A quantization technique that preserves model quality by considering activation outliers when quantizing weights.

Backpressure

Performance
Advanced
Flow‑control strategy that slows producers when consumers or downstream systems are saturated to prevent overload.

Related terms:

Rate Limiting, Streaming (Token Streaming), Throughput (TPS)

Backpropagation

Fundamentals
Advanced
Algorithm for computing gradients in neural networks by propagating errors backward from the output layer to the input layer, enabling weight updates.

Batch Normalization

Training
Advanced
Technique that normalizes layer inputs during training, reducing internal covariate shift and improving training stability and speed.

Batch Processing

API
Intermediate
Processing multiple requests together in a single API call or job, typically with lower priority but reduced cost. Useful for non-time-sensitive tasks like bulk data analysis, content moderation, or embedding generation. Can be 50% cheaper than real-time processing.

Related terms:

API, Throughput, Token Pricing

Benchmark (AI Benchmark)

Evaluation
Beginner
A standardized test or dataset used to evaluate model quality, robustness, and performance. Examples include MMLU, HELM, and custom task‑specific evals.

Related terms:

Evaluation, Evals, Latency

Benchmark Drift

Evaluation
Intermediate
Shifts in measured performance due to dataset changes, model updates, or prompt/pipeline modifications.

BERT

LLM
Intermediate
Bidirectional Encoder Representations from Transformers - a pre-trained language model that understands context from both directions (left-to-right and right-to-left), excellent for comprehension tasks.

BF16 (BFloat16)

Performance
Intermediate
16-bit floating point format with a wider exponent range than FP16, often improving training stability while retaining performance benefits.

Bi‑encoder

Fundamentals
Advanced
A dual‑tower architecture that encodes query and document separately to enable fast vector similarity search.

Related terms:

Cross‑encoder, Reranking, Embeddings

Bias (in AI)

Safety
Intermediate
Systematic errors in AI output that result from prejudices in the training data or algorithmic design. AI bias can lead to unfair or discriminatory outcomes against certain groups.

BLEU Score

Evaluation
Advanced
Bilingual Evaluation Understudy - a metric for evaluating machine translation quality by comparing generated text to reference translations. Measures n-gram overlap. Scores range from 0 to 1 (or 0-100). Higher is better, but BLEU has limitations for creative or diverse outputs.

Related terms:

Evaluation, ROUGE Score, Machine Translation

Blue‑Green Deployment

Deployment
Intermediate
Two parallel environments (blue/green); traffic switches to green after verification, enabling quick rollback.

BM25

Fundamentals
Advanced
A classical lexical ranking function used in information retrieval that scores documents based on term frequency and inverse document frequency. Often combined with embeddings for hybrid search.
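
For reference, one common variant scores a document D against query terms q_i as:

\mathrm{score}(D, Q) = \sum_{i} \mathrm{IDF}(q_i)\cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}

where f(q_i, D) is the term frequency, |D| the document length, avgdl the average document length, and k_1, b are tuning parameters.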

Byte Pair Encoding (BPE)

Fundamentals
Advanced
A subword tokenization algorithm that builds a vocabulary by iteratively merging common character pairs. BPE tokenizers balance vocabulary size with the ability to represent rare words.
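
As a rough sketch (a toy illustration, not a production tokenizer), one BPE merge step in Python counts adjacent symbol pairs over a corpus and merges the most frequent pair into a new symbol:

from collections import Counter

# Toy corpus: each word starts as a tuple of characters.
corpus = [("l", "o", "w"), ("l", "o", "w", "e", "r"), ("l", "o", "w", "e", "s", "t")]

# Count adjacent symbol pairs across all words.
pairs = Counter()
for word in corpus:
    for a, b in zip(word, word[1:]):
        pairs[(a, b)] += 1

best = max(pairs, key=pairs.get)  # most frequent pair, e.g. ('l', 'o')

def merge(word, pair):
    # Replace every occurrence of `pair` with a single merged symbol.
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return tuple(out)

corpus = [merge(w, best) for w in corpus]  # repeated until the vocabulary budget is reached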

Calibration (Model Confidence)

Evaluation
Advanced
How well a model’s confidence aligns with actual correctness. A well-calibrated model assigns higher probability to correct answers more often.

Canary Release

Deployment
Intermediate
Gradual rollout to a subset of users to validate stability and performance before full deployment.

Related terms:

Feature Flags, A/B Testing, Safety (AI Safety)

Causal Masking

LLM
Advanced
An attention constraint used in decoder-only transformers so each token can only attend to earlier tokens (not future tokens). This enables autoregressive generation.

Chain of Thought (CoT)

LLM
Intermediate
A prompting technique that encourages models to show intermediate reasoning steps. Can improve accuracy on complex tasks but must be used carefully to avoid leaking sensitive reasoning.

Chain-of-Thought Prompting

LLM
Intermediate
A prompting technique that encourages models to show their reasoning step-by-step before giving the final answer, improving accuracy on complex reasoning tasks.

Chaos Engineering

Deployment
Advanced
Practice of injecting failures in production‑like environments to validate resilience, alerts, and recovery procedures.

Chatbot

Tools
Beginner
An AI program designed to simulate human conversation through text or voice. Used for customer service, information retrieval, and companionship.

Checkpoint

Training
Beginner
A saved snapshot of a model’s weights (and sometimes optimizer state) during or after training, enabling resume, fine-tuning, and deployment.

Chunking

LLM
Intermediate
Splitting documents into manageable segments (chunks) for indexing and retrieval in RAG systems.
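
A minimal sketch of fixed-size chunking with overlap in Python (real pipelines often split on tokens, sentences, or document structure instead of raw characters):

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    # Consecutive chunks share `overlap` characters to preserve continuity across boundaries.
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]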

Circuit Breaker

Performance
Advanced
Pattern that trips to fail fast when downstreams are unhealthy, preventing resource exhaustion and cascading failures.

Citations

Fundamentals
Beginner
References to sources used in research or content creation, often formatted according to a specific style guide.

Classification (in ML)

Fundamentals
Intermediate
A supervised machine learning task where the model learns to assign a predefined category or label to a given input. For example, classifying an email as 'spam' or 'not spam', or identifying an image as containing a 'cat' or a 'dog'.

Classifier-Free Guidance (CFG)

Fundamentals
Advanced
A conditioning technique for diffusion models that increases prompt adherence by combining conditional and unconditional predictions during sampling. Higher guidance increases alignment but may reduce diversity.

CLIP

LLM
Advanced
Contrastive Language-Image Pretraining - a multimodal model that understands both images and text, enabling powerful vision-language applications.

Clustering

Fundamentals
Intermediate
An unsupervised machine learning task where the model groups a set of unlabeled data points in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other clusters. It's used to discover hidden patterns or structures in data.

ColBERT

Fundamentals
Advanced
A late‑interaction retrieval model that allows fine‑grained token‑level matching while remaining efficient.

Cold Start

Performance
Advanced
Initial request penalty while provisioning resources (functions, models, containers). Mitigate via warm pools, prewarming, and smaller artifacts.

Computer Vision

Fundamentals
Intermediate
A field of AI that enables computers to 'see' and interpret visual information from the world, such as images and videos, allowing them to identify objects, faces, and scenes.

Concept Drift

Evaluation
Advanced
Changes in the relationship between inputs and target outputs over time (the underlying concept changes). Even if input distributions look similar, correctness can degrade.

Confusion Matrix

Evaluation
Intermediate
Table showing the performance of a classification model by displaying true positives, true negatives, false positives, and false negatives.
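
A minimal binary-classification sketch in Python, using made-up labels for illustration:

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

print(f"TP={tp} FP={fp} / FN={fn} TN={tn}")  # here: TP=3 FP=1 / FN=1 TN=3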

Constitutional AI

Safety
Advanced
An AI safety approach developed by Anthropic where models are trained to follow a set of principles (a 'constitution') through self-critique and revision. The model learns to identify and correct harmful outputs based on these principles, reducing the need for extensive human feedback.

Related terms:

RLHF, AI Safety, Alignment, Ethical AI

Content Moderation

Safety
Intermediate
Filtering unsafe or policy‑violating content using classifiers, rules, or human review.

Context Caching

Performance
Advanced
A technique that reuses previously computed attention/key‑value states for repeated prefixes, reducing latency and cost in long or iterative prompts.

Related terms:

Latency, Throughput, Context Window

Context Compression

LLM
Advanced
Techniques that reduce prompt length (summarization, distillation, selection) to fit context limits and lower cost.

Related terms:

Summarization, RAG, Long‑context Models

Context Overlap

LLM
Intermediate
The amount of text shared between consecutive chunks to preserve continuity across boundaries during retrieval.

Context Window

LLM
Intermediate
The amount of recent text or information an AI model (especially an LLM) can 'remember' and consider when generating a response. Larger context windows allow for more coherent and relevant long conversations or document analysis.

Continuous Batching

Performance
Advanced
Serving technique where new requests can be added to an ongoing batch between decoding steps, improving throughput and GPU utilization.

ControlNet

Tools
Advanced
A technique/module that adds additional conditioning inputs (edges, depth, pose, segmentation) to control image generation in diffusion models.

Coreference Resolution

LLM
Advanced
NLP task that identifies when different words refer to the same entity (e.g., 'John' and 'he' refer to the same person).

Cosine Similarity

Fundamentals
Intermediate
Measure of similarity between two vectors by calculating the cosine of the angle between them, commonly used for text embeddings.
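
For two vectors a and b:

\mathrm{cosine\_similarity}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}

Values near 1 indicate vectors pointing in a similar direction (similar meaning for text embeddings); values near 0 indicate little similarity.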

Cost per Token

Business
Intermediate
Granular cost metric for inference; optimize via prompt compression, caching, batching, and model choice.

Related terms:

Token (in LLMs), Tokenization, Batching (LLM Serving)

Cross-encoder

Fundamentals
Advanced
A model that scores a query–document pair jointly by encoding them together, often used for high-precision reranking.

Cross‑attention

Fundamentals
Advanced
Attention mechanism where one sequence attends to another, used in encoders‑decoders and multimodal models.

Cross‑Entropy Loss

Training
Intermediate
The standard loss function for language modeling and classification that measures how well predicted probabilities match the true labels. Minimizing cross‑entropy is equivalent to maximizing log-likelihood.
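
For a single example with one-hot true label y and predicted probabilities p over C classes:

\mathcal{L} = -\sum_{c=1}^{C} y_c \log p_c

which reduces to the negative log probability assigned to the correct class.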

CUDA

Performance
Intermediate
NVIDIA’s GPU computing platform and programming model. Many ML libraries and kernels (attention, GEMM) are optimized for CUDA GPUs.

DALL-E

Tools
Beginner
OpenAI's text-to-image generation model that creates realistic images from textual descriptions using diffusion models.

Data Augmentation

Training
Intermediate
A technique used to increase the diversity and size of a training dataset by creating modified copies of existing data or generating new synthetic data. For images, this might include rotating, cropping, or changing the brightness. This helps improve model performance and reduce overfitting.

Related terms:

Training, Data Set, Overfitting, Deep Learning

Data Deletion

Safety
Intermediate
Deleting stored user or customer data (including logs and derived artifacts) on request or after retention periods. Requires identifying downstream copies and backups.

Data Drift

Evaluation
Advanced
Changes in the input data distribution over time (e.g., new topics, user behavior shifts). Data drift can degrade retrieval quality and model performance.

Data Exfiltration (LLM Security)

Safety
Advanced
An attack goal where a model is manipulated into leaking sensitive information (secrets, system prompts, private documents) via tool use or outputs.

Data Minimization

Safety
Intermediate
Privacy principle to collect, process, and retain only data necessary for a specific purpose.

Data Parallelism

Training
Intermediate
Training approach where model replicas process different data shards in parallel and synchronize gradients.

Data Poisoning

Safety
Advanced
Maliciously corrupting training data to compromise model behavior. Attackers inject carefully designed examples that cause the model to learn incorrect patterns or backdoors. A serious concern for models trained on public or crowdsourced data.

Related terms:

Adversarial Attack, Training, AI Safety

Data Processing Agreement (DPA)

Business
Intermediate
A vendor contract that defines personal data processing terms; often used interchangeably with a Data Processing Addendum (DPA). Check scope, sub‑processors, transfers, and security controls under GDPR.

Data Residency

Business
Intermediate
The requirement to store and process data within specific geographic regions for compliance, contracts, or latency.

Data Retention

Safety
Intermediate
Policies defining how long data (prompts, logs, embeddings, outputs) is stored. Shorter retention reduces privacy risk but may limit debugging and compliance needs.

Data Set

Fundamentals
Beginner
A collection of data (e.g., images, text, numbers) used to train or evaluate an AI model. The quality, size, and diversity of the data set are critical for model performance.

Data Versioning

Deployment
Intermediate
Tracking versions of datasets and data snapshots used for training/evaluation so results are reproducible and changes are auditable.

DDIM (Denoising Diffusion Implicit Models)

Fundamentals
Advanced
A sampling method for diffusion models that can reduce the number of denoising steps and enable faster generation while maintaining quality.

DDPM (Denoising Diffusion Probabilistic Models)

Fundamentals
Advanced
A foundational diffusion formulation that learns to reverse a forward noising process step-by-step. DDPM defines the training objective and sampling process used by many diffusion models.

Related terms:

Diffusion Model, Latent Diffusion, Sampling (AI Sampling)

Deep Learning (DL)

Fundamentals
Intermediate
A subfield of Machine Learning that uses artificial neural networks with multiple layers ('deep' architectures) to analyze complex patterns in large datasets. It's particularly effective for tasks like image recognition and natural language processing.

Denylist

Safety
Intermediate
A list of explicitly forbidden items (e.g., blocked domains, commands, tool actions). Used alongside allowlists for layered safety controls.

Dependency Parsing

LLM
Advanced
NLP technique that analyzes grammatical structure of sentences by identifying relationships between words (subject-verb, modifier-head, etc.).

Deterministic Mode

API
Intermediate
A mode of operation where the output is entirely determined by the input, without any randomness or variability.

Deterministic vs Non‑deterministic Outputs

API
Intermediate
Deterministic outputs are reproducible given the same prompt and parameters (e.g., temperature=0). Non‑deterministic outputs vary due to sampling. Teams choose based on creativity vs. repeatability.

Diarization (Speaker Diarization)

Fundamentals
Advanced
Partitioning audio into speaker‑homogeneous segments (who spoke when).

Differential Privacy

Safety
Advanced
A mathematical framework for privacy-preserving data analysis that adds calibrated noise to prevent individual data points from being identified.

Diffusion Model

Fundamentals
Advanced
A type of generative AI model that creates data (often images) by learning to reverse a gradual noising process. It starts with random noise and refines it step-by-step into a coherent output, often guided by a text prompt (e.g., Stable Diffusion, DALL·E 3).

Direct Preference Optimization (DPO)

Training
Advanced
A fine-tuning method that learns directly from preference pairs (chosen vs rejected) without an explicit reward model or reinforcement learning loop. DPO is often simpler than RLHF while achieving strong alignment results for instruction-following behavior.

Disaster Recovery (DR)

Deployment
Advanced
Practices and tooling to restore service after major failures (region loss, data corruption). Involves backups, replication, and failover.

DLP (Data Loss Prevention)

Safety
Advanced
Policies and systems that detect and prevent sensitive data from leaving allowed boundaries (e.g., blocking secrets/PII from being sent to external tools or vendors).

DP

Safety
Advanced
Alias for Differential Privacy.

Related terms:

Differential Privacy (DP), DP‑SGD

DPA (Data Processing Addendum)

Business
Intermediate
A contractual addendum defining controller–processor obligations, sub‑processors, and cross‑border transfer terms.

DPR (Dense Passage Retrieval)

Fundamentals
Advanced
A bi‑encoder approach that learns dense embeddings for questions and passages to improve open‑domain QA retrieval.

Early Stopping

Training
Intermediate
Technique to prevent overfitting by stopping training when validation performance stops improving.
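
A minimal sketch of the usual patience logic in Python; train_one_epoch and validate are hypothetical placeholders for your own training and validation routines:

max_epochs, patience = 100, 3
best_loss, bad_epochs = float("inf"), 0
for epoch in range(max_epochs):
    train_one_epoch(model)        # placeholder: one pass over the training data
    val_loss = validate(model)    # placeholder: loss on a held-out validation set
    if val_loss < best_loss:
        best_loss, bad_epochs = val_loss, 0   # improvement: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:            # no improvement for `patience` epochs
            break                             # stop training early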

ECE (Expected Calibration Error)

Evaluation
Advanced
A metric that measures miscalibration by comparing predicted confidence to empirical accuracy across probability bins.
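
With predictions grouped into M confidence bins B_m over n samples, a common formulation is:

\mathrm{ECE} = \sum_{m=1}^{M} \frac{|B_m|}{n}\,\bigl|\mathrm{acc}(B_m) - \mathrm{conf}(B_m)\bigr|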

Related terms:

Calibration (Model Confidence), Reliability, Evaluation

Edge AI

Deployment
Intermediate
Running AI models directly on local devices (smartphones, IoT devices, edge servers) rather than in the cloud. Offers benefits like lower latency, better privacy, offline capability, and reduced bandwidth costs. Requires optimized models through quantization and distillation.

Related terms:

Model Distillation, Quantization, Inference, Latency

Embedding Vector

Fundamentals
Intermediate
A numerical representation of text, images, or audio that captures semantic meaning so similar items have nearby vectors. Used for semantic search, recommendations, clustering, and Retrieval‑Augmented Generation (RAG).

Embeddings

Fundamentals
Intermediate
Numerical representations (vectors) of words, sentences, or other data types in a multi-dimensional space. AI models use embeddings to understand semantic relationships and similarities between data points, enabling tasks like semantic search or text classification.

Encoder-Decoder Architecture

LLM
Advanced
Neural network architecture with two main components: encoder processes input into a compressed representation, decoder generates output from that representation. Used in translation and summarization.

Related terms:

Transformer Architecture, Sequence-to-Sequence, Autoencoder

Ensemble Learning

Training
Intermediate
Method that combines multiple machine learning models to improve overall performance and robustness.

Related terms:

Machine Learning (ML), Bagging, Boosting

Entity Linking

LLM
Advanced
Process of connecting named entities in text to unique identifiers in a knowledge base (e.g., linking 'Paris' to the city vs. the person).

Episodic Memory (Agents)

LLM
Intermediate
Memory of past interactions/events (what happened in previous sessions). Used to personalize behavior and avoid repeating mistakes.

Error Budget

Performance
Advanced
Allowable unreliability derived from SLOs. Consumed by incidents; gates release velocity and risk.

Ethical AI

Safety
Intermediate
A branch of ethics focused on the moral implications of AI. It addresses issues like fairness, accountability, transparency, privacy, and the societal impact of AI technologies to ensure responsible development and deployment.

EU AI Act

Business
Intermediate
A European Union regulation establishing a risk-based framework for AI systems, with obligations for high-risk uses (e.g., documentation, risk management, monitoring) and rules for certain prohibited practices.

Related terms:

AI Legislation, Safety (AI Safety), Governance

Evals (Model Evaluation)

Evaluation
Intermediate
Task‑specific tests that measure quality, robustness, and safety of model outputs on real workloads. Strong evals guide model, prompt, and guardrail choices.

Evaluation Harness

Evaluation
Advanced
Automated tests and datasets to assess quality, safety, latency, and cost across model versions.

Evasion Attacks

Safety
Advanced
Adversarial attacks that manipulate input data during inference to avoid detection or change model predictions.

Experiment Tracking

Deployment
Intermediate
Logging runs, hyperparameters, datasets, metrics, and artifacts to reproduce results and compare experiments across model versions.

Explainable AI (XAI)

Safety
Advanced
A set of methods and techniques in AI aimed at making the decisions and predictions made by AI models, especially complex ones like deep neural networks, understandable and interpretable to humans. This helps build trust and allows for debugging.

Exponential Backoff

Performance
Intermediate
Retry delays grow exponentially between attempts, often with jitter, to reduce load during failures.
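
A minimal Python sketch with full jitter; the operation being retried and the errors worth retrying are placeholders:

import random
import time

def call_with_backoff(op, max_retries=5, base=0.5, cap=30.0):
    """Retry `op` with exponentially growing, jittered delays."""
    for attempt in range(max_retries):
        try:
            return op()
        except Exception:  # in practice, catch only retryable errors (timeouts, 429s, 5xx)
            if attempt == max_retries - 1:
                raise
            delay = min(cap, base * (2 ** attempt))   # 0.5s, 1s, 2s, ... capped at 30s
            time.sleep(random.uniform(0, delay))      # "full jitter" spreads retries out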

F1-Score

Evaluation
Intermediate
Harmonic mean of precision and recall, providing a single metric that balances both measures.
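
With precision P and recall R:

F_1 = \frac{2\,P\,R}{P + R}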

FAISS

Fundamentals
Advanced
Facebook AI Similarity Search — a library for efficient vector similarity search and clustering at scale.

Faithfulness (RAG)

Evaluation
Advanced
An evaluation dimension measuring whether generated statements are supported by the provided context. Faithfulness focuses on avoiding unsupported claims.

Fallback

Deployment
Intermediate
A reliability pattern where the system degrades gracefully by switching to a simpler or more stable behavior (e.g., a cheaper model, cached answer, or human handoff) when errors occur.

Feature Flags

Deployment
Beginner
Runtime toggles to enable, disable, or target features safely without redeploying.

Related terms:

Canary Release, A/B Testing, Safety (AI Safety)

Feature Store

Deployment
Advanced
A system for managing, serving, and versioning ML features consistently between training and inference, improving reliability and reducing training-serving skew.

Related terms:

MLOps (Machine Learning Operations), Data Pipeline, Training-serving skew

Federated Learning

Training
Advanced
A machine learning approach where models are trained across multiple decentralized devices or servers holding local data samples, without exchanging the raw data. Enables privacy-preserving AI by keeping sensitive data on-device while still benefiting from collaborative learning.

Related terms:

Privacy-Preserving AI, Edge AI, Training

Few-shot Learning/Prompting

LLM
Intermediate
An approach where an AI model is given a few examples (shots) of a task to learn from before it attempts the task on new input. This helps guide the model for specific outputs or styles, improving performance with limited examples.

Few‑shot Learning / Prompting

LLM
Intermediate
Providing the model with a handful of labeled examples inside the prompt so it can generalize the pattern. Useful when instructions alone are not sufficient.

Related terms:

Zero‑shot Learning/Prompting, In‑context Learning, Prompt Engineering

Fine-tuning

Training
Advanced
The process of taking a pre-trained AI model (like a general LLM) and further training it on a smaller, domain-specific dataset. This adapts the model's knowledge and behavior for a particular task, style, or industry, making it more specialized.

FlashAttention

Performance
Advanced
A family of GPU-optimized attention implementations that compute attention faster and with lower memory by using tiling and IO-aware kernels. It enables longer contexts and higher throughput for transformer inference and training.

Foundation Model

Fundamentals
Advanced
A large-scale AI model (often an LLM or vision model) trained on vast amounts of broad, unlabeled data. These models are designed to be adaptable (e.g., through fine-tuning or prompting) to a wide range of downstream tasks and applications.

FP16 (Half Precision)

Performance
Intermediate
16-bit floating point format used to speed up training and inference with lower memory usage. Often paired with loss scaling to maintain numerical stability.

Function Calling (Tool Use)

LLM
Intermediate
The ability of an LLM to recognize when to use external tools or APIs and generate properly formatted function calls. This enables AI agents to perform actions like web searches, database queries, calculations, or API integrations. Essential for building practical AI applications.

Related terms:

Agent (AI Agent), Agentic AI, Structured Output, API

Fusion‑in‑Decoder (FiD)

LLM
Advanced
A retrieval architecture that encodes passages independently then fuses them within the decoder for generation.

Related terms:

RAG, Retrieval, Decoder

GDPR (General Data Protection Regulation)

Business
Intermediate
A comprehensive data privacy law in the European Union that sets rules for collecting, processing, and storing personal information from individuals within the EU. Learn more on Wikipedia.

GEMM (General Matrix Multiply)

Performance
Advanced
A core linear algebra operation (matrix multiplication) that dominates compute cost in transformers. Highly optimized kernels are key for fast training and inference.

Generative Adversarial Network (GAN)

Fundamentals
Advanced
An AI architecture consisting of two neural networks—a generator and a discriminator—that compete against each other. The generator creates synthetic data, and the discriminator tries to distinguish it from real data, leading to increasingly realistic outputs, especially images.

Generative AI

Fundamentals
Beginner
A category of AI focused on creating new, original content, such as text (articles, poems), images, audio (music, speech), video, or code. It learns patterns from training data and generates novel outputs based on user prompts.

GGUF

Performance
Advanced
A model file format optimized for efficient CPU/GPU inference in the llama.cpp ecosystem.

GPT

LLM
Intermediate
Generative Pre-trained Transformer - decoder-only architecture optimized for text generation, foundation of models like ChatGPT.

GPTQ

Performance
Advanced
Alias for a post‑training quantization method that approximates weight updates for efficient inference.

GPU (Graphics Processing Unit)

Performance
Beginner
Specialized hardware optimized for parallel computation. GPUs are the primary accelerator for training and serving modern deep learning models.

GQA

Fundamentals
Advanced
Alias for Group‑Query Attention.

Gradient Accumulation

Training
Intermediate
Technique to simulate a larger batch size by accumulating gradients over multiple forward/backward passes before applying an optimizer step. Useful when GPU memory limits batch size.
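
A minimal PyTorch-style sketch, assuming model, optimizer, loss_fn, and data_loader are already defined:

accum_steps = 8  # effective batch size = per-step batch size x accum_steps

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(data_loader):
    loss = loss_fn(model(inputs), targets)
    (loss / accum_steps).backward()   # scale so gradients average over the accumulation window
    if (step + 1) % accum_steps == 0:
        optimizer.step()              # apply one optimizer update
        optimizer.zero_grad()         # clear the accumulated gradients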

Gradient Clipping

Training
Advanced
Technique to prevent exploding gradients by capping gradient values during backpropagation.

Graph Neural Networks

Fundamentals
Advanced
Neural networks designed to work with graph-structured data, learning representations of nodes, edges, and entire graphs.

Graph‑of‑Thought (GoT)

LLM
Advanced
Reasoning approach that structures intermediate steps as a graph of interconnected thoughts.

Gray Failure

Performance
Advanced
Partial, hard‑to‑detect degradation where systems appear up but misbehave for a subset of users or traffic.

Groundedness

Evaluation
Advanced
How well a response is anchored in verifiable sources or retrieved context. Grounded outputs can be traced to citations or documents.

Grounding (Knowledge Grounding)

LLM
Intermediate
Linking model outputs to verifiable sources or enterprise knowledge to improve factuality and trust. Typically implemented via RAG with citations.

Group‑Query Attention (GQA)

Fundamentals
Advanced
An attention variant that shares key/value across groups of heads to reduce memory while retaining quality.

Guardrails (AI)

Safety
Intermediate
Policies and technical controls that constrain model inputs/outputs to enforce safety and compliance. Examples include schema validation, content filtering, tool permissioning, and output sanitization.

Hallucination

Safety
Intermediate
When AI models generate plausible but incorrect or fabricated information, often with high confidence.

Related terms:

Reliability, Truthfulness, Evaluation

Hallucination (AI)

Safety
Intermediate
A phenomenon where a generative AI model, particularly an LLM, produces outputs that sound plausible and confident but are factually incorrect, nonsensical, or not based on the provided input. Critical evaluation of AI output is necessary due to this.

Hallucination (LLMs)

Safety
Intermediate
When a model produces confident but incorrect or fabricated information. Mitigate with retrieval grounding, validation, provenance, and human review for high‑stakes tasks.

Related terms:

RAG, Guardrails, Evaluation

HBM (High Bandwidth Memory)

Performance
Advanced
High-performance memory used on many data-center GPUs (e.g., H100). Higher bandwidth improves training and inference throughput for large models.

Health Checks

Deployment
Intermediate
Endpoints or probes indicating service liveness/readiness for orchestrators and load balancers.

HIPAA

Business
Intermediate
US regulation governing protected health information (PHI). For AI systems, HIPAA drives requirements like access controls, audit trails, BAAs, and data minimization.

Hit@k

Evaluation
Intermediate
Binary metric indicating whether at least one relevant item appears in the top k results. Useful when a single good context chunk is sufficient.

HNSW

Fundamentals
Advanced
Alias for Hierarchical Navigable Small World index.

HNSW (Hierarchical Navigable Small World)

Fundamentals
Advanced
A popular ANN index enabling fast approximate nearest neighbor search in high dimensions.

Homomorphic Encryption

Safety
Advanced
Cryptographic technique allowing computations on encrypted data without decryption, enabling privacy-preserving AI.

Related terms:

Differential Privacy, Privacy, Encrypted Computation

Hugging Face

Tools
Beginner
An ecosystem for model sharing, datasets, and tooling (Transformers, tokenizers, model hubs). Widely used for open-weight model distribution and deployment.

Human Evaluation

Evaluation
Intermediate
Manual evaluation by people using rubrics (correctness, helpfulness, safety, style). Often used to validate automatic metrics and detect edge-case failures.

HyDE (Hypothetical Document Embeddings)

LLM
Advanced
Retrieval technique that generates a hypothetical answer/document for a query and embeds it to retrieve more relevant passages. Often improves recall for complex questions.

Hyperparameter Tuning

Training
Intermediate
Process of finding optimal hyperparameters for machine learning models using techniques like grid search or Bayesian optimization.

Related terms:

Machine Learning (ML), Grid Search, Bayesian Optimization

Idempotency

API
Intermediate
Designing operations that produce the same result when retried, preventing duplicates during network failures or webhook retries.

Imbalanced Data

Training
Intermediate
Situation where classes in a dataset have unequal representation, requiring special handling techniques.

In-context Learning

LLM
Intermediate
The ability of a large language model to learn and perform a new task based solely on the examples and instructions provided within the prompt, without needing to be retrained or fine-tuned. This is the mechanism behind few-shot and zero-shot prompting.

Related terms:

Prompt, Few-shot Learning, Zero-shot Learning, LLM

In‑context Learning (ICL)

LLM
Intermediate
A model’s ability to learn patterns from examples in the prompt without weight updates. Enables fast adaptation to new tasks by demonstration.

Incident Severity (SEV Levels)

Deployment
Intermediate
Categorization of incident impact (e.g., SEV‑1 critical). Determines response urgency, comms, and escalation.

Indirect Prompt Injection

Safety
Advanced
A prompt injection attack where malicious instructions are embedded in external content (web pages, documents, emails) that the model later ingests during browsing or retrieval.

Inference

Fundamentals
Intermediate
The process of using a trained AI model to make predictions, generate content, classify data, or perform its designated task on new, previously unseen data. This is the 'live' operational phase of an AI model.

Inference Cost

Business
Intermediate
The computational and financial cost of running a trained model to generate predictions or outputs. Factors include model size, token count, hardware requirements, and API pricing. Optimizing inference cost is crucial for production AI applications.

Inference Time

Performance
Beginner
The duration from sending a request to receiving the complete response. Includes network latency, queue time, and actual model computation. Critical for user experience in real-time applications. Measured in milliseconds for simple tasks, seconds for complex ones.

Related terms:

Latency, Streaming Response, Throughput

Inpainting

Tools
Intermediate
A generative technique that fills in missing or masked regions of an image while matching surrounding context. Used for edits, object removal, and iterative refinement.

Instruction Tuning

Training
Advanced
Fine‑tuning models on instruction‑following datasets to improve helpfulness and adherence to prompts.

Related terms:

Fine‑tuning, RLHF, SFT

INT4

Performance
Advanced
4-bit quantized precision used for very memory-efficient inference. Can significantly reduce VRAM usage but may require careful calibration to preserve quality.

INT8

Performance
Intermediate
8-bit integer precision commonly used for quantized inference to reduce memory and improve throughput with minimal quality loss.

ISO/IEC 42001

Business
Advanced
An international standard for AI management systems (AIMS). It provides requirements for establishing, implementing, maintaining, and continually improving AI governance and controls.

Related terms:

Governance, Safety Policy, AI Legislation

IVF

Fundamentals
Advanced
Alias for Inverted File Index in vector search.

IVF (Inverted File Index)

Fundamentals
Advanced
A vector index that partitions vectors into coarse clusters (lists) and searches a subset for speed.

Jailbreak

Safety
Intermediate
An adversarial prompt that bypasses safety policies to elicit disallowed outputs.

Related terms:

Prompt Injection, Safety, Red Teaming

Jailbreaking (AI)

Safety
Intermediate
Techniques to bypass an AI model's safety guardrails and content policies, often through carefully crafted prompts that trick the model into generating prohibited content. AI companies continuously work to patch jailbreaks, but it remains an ongoing challenge.

Related terms:

Prompt Injection, Guardrails, AI Safety, Red Teaming

JSON Mode

LLM
Intermediate
A generation mode or constraint where the model is instructed (or enforced) to output valid JSON. Often paired with schema validation to improve reliability for downstream automation.

JSON Schema

API
Intermediate
A specification for defining the structure and validation of JSON data, often used for data exchange and API documentation.

Related terms:

Structured Output, Function Calling (Tool Use), APIs

K-Fold Cross-Validation

Evaluation
Intermediate
Technique that divides data into k subsets, using each as validation set once while others serve as training data.
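
A minimal sketch with scikit-learn, assuming arrays X, y and an estimator model are already defined:

from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, val_idx in kf.split(X):
    model.fit(X[train_idx], y[train_idx])                 # train on k-1 folds
    scores.append(model.score(X[val_idx], y[val_idx]))    # evaluate on the held-out fold
print(sum(scores) / len(scores))                          # average validation score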

Kernel Fusion

Performance
Advanced
Combining multiple GPU operations into a single kernel to reduce memory bandwidth overhead and improve throughput.

Knowledge Cutoff

Fundamentals
Beginner
The most recent date of data used to train a model. Facts after this date may be unknown to the model unless provided via retrieval or browsing.

Knowledge Distillation

Training
Advanced
Training a smaller "student" model to replicate the behavior of a larger "teacher" model, improving efficiency by reducing size and inference cost while retaining performance.

Related terms:

Fine‑tuning, Compression, Evaluation, Model Compression, Pruning, Quantization

Knowledge Graph

Fundamentals
Intermediate
Structured representation of knowledge as entities and relationships that supports semantic search, reasoning, and retrieval. Used for question answering, entity linking, and enterprise knowledge systems.

KV Cache (Key‑Value Cache)

Performance
Advanced
Cached attention key/value tensors reused across decoding steps to avoid recomputation and reduce latency and cost.

KV Cache Eviction

Performance
Advanced
Policies for discarding old key/value attention states in long sessions to manage memory.

Related terms:

KV Cache (Key-Value Cache), Long‑context Models, Latency

Large Language Model (LLM)

LLM
Beginner
An advanced AI model based on the Transformer architecture trained on large corpora to understand and generate natural language. LLMs power tasks like summarization, coding, and agents. Example: GPT-4, Claude 3, Llama 3.

Latency (AI Systems)

Performance
Beginner
The time between sending a request and receiving a response. Optimizations span model choice, prompt size, retrieval efficiency, batching, and parallelization.

Latent Diffusion

Fundamentals
Advanced
A diffusion approach that performs the denoising process in a compressed latent space (rather than pixel space), improving efficiency while preserving quality. Stable Diffusion is a prominent example.

Latent Space

Fundamentals
Advanced
A compressed, abstract, multi-dimensional representation of data learned by an AI model. In this space, similar data points are closer together. Generative models like GANs and Diffusion Models operate in this latent space to create new data by manipulating these compressed representations.

Related terms:

Embeddings, Generative AI, Diffusion Model, GAN, Autoencoder, Dimensionality Reduction

Layer Normalization (LayerNorm)

Fundamentals
Advanced
A normalization technique that stabilizes training by normalizing activations within a layer. Commonly used throughout transformer blocks.

Leaderboard

Evaluation
Beginner
A ranking of AI models based on standardized benchmark performance. Popular leaderboards include Chatbot Arena (LMSYS), HuggingFace Open LLM Leaderboard, and MMLU. Helps compare models objectively, though real-world performance may vary.

Related terms:

Benchmark, Evaluation, MMLU

Learning Rate Scheduling

Training
Intermediate
Techniques to dynamically adjust the learning rate during training, often decreasing it over time.

Related terms:

Learning Rate, Training (AI Model), Optimization

Learning Rate Warmup

Training
Intermediate
Training schedule where the learning rate ramps up from a small value over the first steps/epochs to stabilize optimization, especially for transformers.

Lip‑sync (AI)

Tools
Advanced
Aligning mouth movements in video with target speech for dubbing or avatars.

Related terms:

Text‑to‑Video, TTS, NeRF

LLM

Fundamentals
Beginner
Alias for Large Language Model (LLM), a transformer-based model trained on large text/code corpora.

LLM-as-a-Judge

Evaluation
Advanced
Using a language model to evaluate other model outputs (or its own outputs) against criteria like correctness, safety, relevance, and style. It’s widely used for scalable evals but requires careful prompt design, calibration, and spot-checking to avoid bias and false confidence.

LLMOps

Deployment
Advanced
Operational practices for deploying, monitoring, evaluating, and iterating on LLM‑powered systems.

Local Differential Privacy

Safety
Advanced
Differential privacy applied at the individual level before data sharing, providing stronger privacy guarantees.

Logit Bias

API
Advanced
A technique to bias token probabilities during decoding by adding positive or negative offsets to specific token logits. Used to encourage or forbid certain tokens.

Logprobs

API
Advanced
Log probabilities of tokens under the model during generation. Logprobs enable confidence scoring, debugging, and more controlled decoding.

Related terms:

Logit Bias, Decoding, Calibration

Long Short-Term Memory (LSTM)

Fundamentals
Advanced
Type of recurrent neural network cell designed to remember information for long periods, solving the vanishing gradient problem.

Related terms:

Recurrent Neural Network, Vanishing Gradient Problem, Sequence Modeling

LoRA (Low‑Rank Adaptation)

Training
Advanced
A parameter‑efficient fine‑tuning method that injects low‑rank adapters into a frozen model, drastically reducing compute and cost while achieving strong task performance.

Low-Rank Factorization

Performance
Advanced
Matrix decomposition technique to reduce model parameters by approximating weight matrices with lower-rank representations.

Related terms:

Model Compression, Pruning, Quantization

Machine Learning (ML)

Fundamentals
Beginner
A subset of AI where systems learn from data to improve their performance on a specific task over time, without being explicitly programmed for every single scenario. It relies on algorithms and statistical models to find patterns in data.

Map‑Reduce RAG

LLM
Advanced
A pipeline pattern that answers sub‑questions per chunk (map) then aggregates into a final response (reduce).

Related terms:

RAG, Summarization, Query Planning

Mean Average Precision (mAP)

Evaluation
Advanced
Metric for evaluating object detection models, averaging precision across different recall levels.

Membership Inference Attacks

Safety
Advanced
Privacy attacks that determine whether a particular data sample was used in training a model.

Metadata Filtering (RAG)

LLM
Advanced
Restricting retrieval to documents matching constraints (e.g., language, date range, product, access level). Improves relevance and prevents leakage across tenants.

Mixture of Experts (MoE)

Fundamentals
Advanced
A neural network architecture where multiple specialized sub-models (experts) handle different aspects of the input, with a gating mechanism deciding which experts to activate. This allows for larger model capacity while keeping inference costs manageable. Used in models like Mixtral and Grok.

MLOps (Machine Learning Operations)

Deployment
Advanced
A set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently. It combines ML, DevOps, and Data Engineering principles to manage the ML lifecycle.

MMR (Maximal Marginal Relevance)

LLM
Advanced
Diversification technique that balances relevance and novelty to reduce redundancy in retrieved passages.

Model Card

Business
Beginner
A documentation standard that provides essential information about an AI model, including its intended use, training data, limitations, biases, performance metrics, and ethical considerations. Helps users make informed decisions about model selection and deployment.

Related terms:

Benchmark, Bias, Ethical AI

Model Collapse

Safety
Advanced
A potential long-term problem where generative AI models, trained on data that itself was generated by other AIs, begin to lose information, forget less common patterns, and produce less diverse and more homogenous outputs over successive generations. It's a form of degenerative feedback loop.

Related terms:

Training, Data Set, Generative AI, Deep Learning

Model Context Protocol (MCP)

LLM
Intermediate
An open protocol for connecting LLM applications to tools, data sources, and "context" providers in a standardized way. MCP helps make tool integration more portable across models and runtimes, similar to how HTTP standardizes web communication.

Related terms:

Function Calling (Tool Use), Agentic Workflow, Structured Output, RAG

Model Distillation (Knowledge Distillation)

Training
Advanced
A technique where a smaller 'student' model is trained to mimic the behavior of a larger 'teacher' model. This creates faster, cheaper models that retain much of the teacher's performance. Used to create efficient models for edge devices or cost-sensitive applications.

Model Drift

Evaluation
Advanced
Degradation or changes in model behavior/performance over time due to data or concept drift, data pipeline changes, prompt updates, or system modifications. Requires monitoring and periodic re-evaluation.

Model Endpoint

API
Beginner
A URL or API address where a deployed AI model can be accessed for inference. Endpoints can be hosted by AI providers (like OpenAI's API), on your own infrastructure, or on specialized platforms. Each endpoint has specific authentication, rate limits, and pricing.

Related terms:

API, Inference, Deployment

Model Extraction (Model Stealing)

Safety
Advanced
Attacks that attempt to replicate a proprietary model by querying it and training a substitute model on the outputs. Mitigations include rate limits, watermarking, and output controls.

Model Inversion

Safety
Advanced
Attack technique that reconstructs sensitive training data from model outputs or parameters.

Model Monitoring

Deployment
Intermediate
Monitoring deployed models and pipelines for quality, safety, latency, cost, and drift. Often includes alerting, evaluation on live traffic, and rollback strategies.

Model Registry

Deployment
Intermediate
A system that stores, versions, and governs models and their artifacts (weights, configs, lineage, evaluation results) for deployment and auditing.

Model Routing

Deployment
Advanced
Selecting which model (or pipeline) should handle a request based on intent, complexity, cost, latency targets, or risk level. Routing can combine fast/cheap models with fallback to stronger ones.

Model Versioning

Deployment
Intermediate
Tracking and managing model versions (weights, prompts, retrieval configs, eval results) so changes are auditable, reproducible, and safely deployable.

Model Weights

Deployment
Beginner
The learned parameters of a trained model. Weights are typically stored as checkpoint files (e.g., safetensors) and determine the model’s behavior during inference.

Moderation Classifier

Safety
Advanced
A model that detects policy‑violating content (e.g., hate, self‑harm, sexual content) in inputs or outputs to enforce safety policies.

MQA

Fundamentals
Advanced
Alias for Multi‑Query Attention.

MRR (Mean Reciprocal Rank)

Evaluation
Intermediate
Average reciprocal rank of the first relevant result; emphasizes early precision.
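
Over a query set Q, where rank_i is the position of the first relevant result for query i:

\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}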

MTBF (Mean Time Between Failures)

Deployment
Intermediate
Average time between service failures. Improved by testing, redundancy, and robust change management.

MTEB

Evaluation
Intermediate
Alias for Massive Text Embedding Benchmark.

MTEB (Massive Text Embedding Benchmark)

Evaluation
Intermediate
A benchmark suite for evaluating embedding model quality across many tasks.

MTTR (Mean Time to Recovery)

Deployment
Intermediate
Average time to restore service after an incident. Lower via clear ownership, runbooks, and rollbacks.

Multi-Head Attention

LLM
Advanced
Attention mechanism that runs multiple attention operations in parallel, allowing models to focus on different aspects simultaneously.

Multi‑Query Attention (MQA)

Fundamentals
Advanced
An attention optimization that shares key/value across heads for lower memory and faster decoding.

Multimodal AI

LLM
Intermediate
AI systems capable of processing, understanding, and generating information from multiple types of data (modalities) simultaneously, such as text, images, audio, and video (e.g., GPT-4o, Gemini).

Related terms:

Foundation Model, Generative AI, Text‑to‑Image, Text‑to‑Video, LLM

Multimodal Learning

Fundamentals
Intermediate
AI systems that process and understand multiple types of data simultaneously (text, images, audio, video).

Named Entity Recognition (NER)

LLM
Intermediate
NLP task that identifies and classifies named entities in text (persons, organizations, locations, etc.).

Related terms:

NLP, Information Extraction, Entity Linking

Natural Language Processing (NLP)

Fundamentals
Intermediate
A subfield of AI focused on enabling computers to understand, interpret, process, and generate human language (both written and spoken). LLMs are a key technology driving advancements in NLP.

Natural Language Understanding (NLU)

LLM
Advanced
A component of NLP focused on the more challenging task of enabling machines to comprehend the meaning, intent, sentiment, and context of human language, beyond just syntactic parsing.

NCCL

Performance
Advanced
NVIDIA Collective Communications Library used for fast multi-GPU communication (all-reduce, broadcast), critical for distributed training.

NDCG (Normalized Discounted Cumulative Gain)

Evaluation
Advanced
A ranking metric that accounts for graded relevance and position in the list.
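
A minimal Python sketch using a log2 position discount (the relevance grades are illustrative):

    import math

    def dcg(relevances):
        # Discounted cumulative gain: results lower in the list contribute less.
        return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevances, start=1))

    def ndcg(relevances):
        ideal = dcg(sorted(relevances, reverse=True))  # best possible ordering
        return dcg(relevances) / ideal if ideal > 0 else 0.0

    # Graded relevance of results in the order the system returned them (3 = highly relevant).
    print(round(ndcg([3, 1, 0, 2]), 3))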

Negative Prompt

LLM
Intermediate
A text prompt describing what should be avoided in generated content (common in text-to-image). It steers sampling away from unwanted artifacts or styles.

Negative Sampling

Training
Advanced
Efficient training technique for large vocabularies by sampling negative examples instead of computing all possibilities.

NeRF

Fundamentals
Advanced
Alias for Neural Radiance Fields.

NeRF (Neural Radiance Fields)

Fundamentals
Advanced
A 3D representation learned from images to render novel views, used in graphics and video.

Neural Network

Fundamentals
Intermediate
A computational model inspired by the structure and function of the human brain, composed of interconnected processing units called 'neurons' organized in layers. Neural networks are the core of Deep Learning.

Neuro-Symbolic AI

Fundamentals
Advanced
Hybrid approach combining neural networks (for pattern recognition) with symbolic reasoning (for logic and rules).

NIST AI RMF (AI Risk Management Framework)

Business
Intermediate
A framework from NIST to help organizations manage AI risks across governance, mapping, measuring, and managing. Often used to structure safety, privacy, and reliability programs.

Related terms:

Safety (AI Safety), Governance, Evaluation

NLG

Fundamentals
Intermediate
Alias for Natural Language Generation (NLG), creating text from structured or unstructured inputs.

NLP

Fundamentals
Beginner
Alias for Natural Language Processing (NLP), enabling machines to process and generate human language.

NLU

Fundamentals
Intermediate
Alias for Natural Language Understanding (NLU), focusing on intent, meaning, and context.

Observability

Performance
Intermediate
End‑to‑end tracing, logging, and metrics to understand behavior, debug incidents, and improve quality.

OCR

Tools
Beginner
Alias for Optical Character Recognition.

OCR (Optical Character Recognition)

Tools
Beginner
Extracting text from images or PDFs.

Off-policy

Training
Advanced
In reinforcement learning, learning from data collected by a different (behavior) policy, such as replay buffers. Often more sample-efficient but can be less stable.

On-policy

Training
Advanced
In reinforcement learning, learning from data collected by the current policy being optimized. Often more stable but sample-inefficient.

On‑Prem vs Cloud

Business
Intermediate
Hosting models: on‑premises provides isolation and control; cloud offers elasticity and managed services. Many enterprises use hybrid patterns.

One-Hot Encoding

Fundamentals
Beginner
Method to convert categorical variables into binary vectors, where each category becomes a separate dimension (see the sketch below).

Related terms:

Data Preprocessing, Categorical Variables, Machine Learning (ML)
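
A minimal Python sketch (the category list is illustrative):

    def one_hot(value, categories):
        # Each category becomes one dimension; only the matching dimension is set to 1.
        return [1 if value == category else 0 for category in categories]

    colors = ["red", "green", "blue"]
    print(one_hot("green", colors))  # [0, 1, 0]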

ONNX

Deployment
Intermediate
Open Neural Network Exchange — a model format for exporting and running models across frameworks and runtimes.

ONNX Runtime

Deployment
Intermediate
A high-performance runtime for executing ONNX models across CPU and GPU backends.

Open Source AI

Business
Beginner
AI models, datasets, and tools whose source code, design, or data is made publicly available, often under a license that permits use, modification, and distribution. This fosters collaboration and innovation.

OPQ

Fundamentals
Advanced
Alias for Optimized Product Quantization.

OPQ (Optimized Product Quantization)

Fundamentals
Advanced
A rotation applied before PQ to reduce quantization error and improve search accuracy.

Optimizer

Training
Intermediate
Algorithms that adjust model parameters (weights) during training to minimize a loss function. Common optimizers include SGD, Adam, AdaGrad, and AdaDelta.

Out-of-Distribution (OOD)

Evaluation
Advanced
Inputs that differ significantly from the training distribution. OOD detection and robustness testing are important for safe deployment.

Output Validation

Safety
Advanced
Validating model outputs against expected formats or rules (e.g., JSON Schema, regex, allowlists). Output validation reduces silent failures and blocks unsafe or malformed responses.
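
A minimal Python sketch assuming a hypothetical expected format with "answer" and "confidence" fields (not a specific product's schema):

    import json
    import re

    def validate_output(raw):
        # Reject anything that is not valid JSON with the expected fields and types.
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            return None
        if not isinstance(data.get("answer"), str):
            return None
        if not isinstance(data.get("confidence"), (int, float)):
            return None
        # Example rule-based check: block answers that contain email addresses.
        if re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", data["answer"]):
            return None
        return data

    print(validate_output('{"answer": "Use the export menu.", "confidence": 0.9}'))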

Overfitting

Training
Intermediate
A common problem in machine learning where a model learns the training data too well, including its noise and random fluctuations. An overfit model performs exceptionally well on the data it was trained on but fails to generalize and make accurate predictions on new, unseen data.

Related terms:

Training, Machine Learning, Deep Learning, Data Augmentation

Overfitting vs. Underfitting

Fundamentals
Intermediate
Overfitting occurs when a model learns training data too well but fails on new data; underfitting when it fails to capture patterns.

Related terms:

Bias-Variance Tradeoff, Training (AI Model), Generalization

PagedAttention

Performance
Advanced
A KV-cache management approach (popularized by vLLM) that allocates KV blocks like a paging system, reducing fragmentation and improving memory efficiency for many concurrent sequences.

Pairwise Preference Evaluation

Evaluation
Advanced
An evaluation method where raters choose between two outputs (A vs B). This supports win-rate metrics and is commonly used to build preference datasets.

Parameter (in AI models)

Fundamentals
Intermediate
Internal variables or 'weights' within an AI model, especially neural networks, that are learned and adjusted during the training process. Models with more parameters (e.g., billions in LLMs) can often learn more complex patterns and store more information.

PEFT (Parameter‑Efficient Fine‑Tuning)

Training
Advanced
Methods like LoRA, prefix‑tuning, and adapters that update a small subset of parameters to adapt models.

Related terms:

LoRA, Adapters, Prompt Tuning

Penalties (Frequency/Presence)

API
Intermediate
Decoding parameters that discourage repetition by lowering probabilities of previously generated tokens.

Percentiles (p95/p99)

Performance
Intermediate
Latency distribution cutoffs indicating worst‑case experience. Track p95/p99 for endpoints, TTFB, and decode latency to detect tail issues.

Perplexity

Evaluation
Advanced
A metric measuring how well a language model predicts text. Lower perplexity indicates better prediction. Calculated as the exponential of the average negative log-likelihood. While useful for comparing models, it doesn't always correlate with human-perceived quality. A small calculation sketch follows below.

Related terms:

Evaluation, Benchmark, LLM
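
A minimal Python sketch (the log-probabilities are illustrative, not real model output):

    import math

    def perplexity(token_log_probs):
        # token_log_probs: natural-log probabilities the model assigned to each observed token.
        avg_negative_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
        return math.exp(avg_negative_log_likelihood)

    print(round(perplexity([-0.1, -2.3, -0.7, -1.2]), 2))  # lower is better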

PHI (Protected Health Information)

Safety
Intermediate
Health-related identifiable information subject to stricter controls in many jurisdictions (e.g., under HIPAA in the US). Requires strong access controls, auditing, and vendor agreements.

PII (Personally Identifiable Information)

Safety
Beginner
Information that can identify a person directly or indirectly (e.g., name, email, device identifiers). Handling PII requires privacy controls, access management, and retention policies.

Pipeline Parallelism

Deployment
Advanced
Distributing different layers/stages of a model across devices and streaming micro-batches through the pipeline.

Plan-and-Execute

LLM
Advanced
Agent pattern that first creates a high-level plan (steps/subtasks) and then executes the steps, often re-planning when new information arrives.

Playground

Tools
Beginner
An interactive web interface provided by AI companies (like OpenAI, Anthropic) where you can test models, adjust parameters (temperature, max tokens), and experiment with prompts without writing code. Useful for prototyping and learning how models behave.

Related terms:

API, Temperature, System Prompt

Poisoning Attacks

Safety
Advanced
Adversarial attacks that corrupt training data to manipulate model behavior during the learning phase.

Policy Optimization

Training
Advanced
Algorithms that improve reinforcement learning policies by maximizing expected rewards.

Positional Encoding

Fundamentals
Advanced
Techniques to encode token positions for attention (sinusoidal, learned, RoPE, ALiBi).

PPO

Training
Advanced
Proximal Policy Optimization - a policy gradient method for reinforcement learning that ensures stable training.

PQ

Fundamentals
Advanced
Alias for Product Quantization.

PQ (Product Quantization)

Fundamentals
Advanced
A compression technique that splits vectors into subvectors and quantizes them to reduce memory.

Precision vs. Recall

Evaluation
Intermediate
Precision measures accuracy of positive predictions; recall measures how many actual positives were found.

Precision@k

Evaluation
Intermediate
Fraction of retrieved items in the top k that are relevant. Higher precision@k means fewer irrelevant results among top hits.
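
A minimal Python sketch computing precision@k (and the companion recall@k) for one query, with illustrative document IDs:

    def precision_at_k(retrieved, relevant, k):
        return sum(1 for doc in retrieved[:k] if doc in relevant) / k

    def recall_at_k(retrieved, relevant, k):
        return sum(1 for doc in retrieved[:k] if doc in relevant) / len(relevant)

    retrieved = ["d3", "d7", "d1", "d9"]   # ranking produced by the retriever
    relevant = {"d1", "d2", "d3"}          # ground-truth relevant documents
    print(precision_at_k(retrieved, relevant, 3))  # 2 of the top 3 are relevant
    print(recall_at_k(retrieved, relevant, 3))     # 2 of the 3 relevant docs were found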

Prefill vs Decode

Performance
Advanced
Two phases of transformer inference: prefill computes attention over the prompt (often heavy compute), while decode generates tokens step-by-step (often memory/KV-cache bound). Optimizing both phases is key for low latency and high throughput.

Program‑of‑Thought (PoT)

LLM
Advanced
Reasoning technique where the model writes small programs (often Python) to compute accurate answers.

Prompt

LLM
Beginner
The instruction, question, text, or other input provided by a user to an AI model (especially a generative AI model) to guide its response or content generation. Crafting effective prompts is a skill known as 'Prompt Engineering'.

Prompt Caching

Performance
Advanced
Reusing precomputed prompt or prefix representations to reduce latency and cost for repeated or shared context.

Related terms:

KV Cache (Key-Value Cache), Latency, Throughput

Prompt Engineering

LLM
Intermediate
The practice of carefully designing and refining the input text (prompts) given to an AI model to elicit the desired output. Effective prompt engineering involves understanding the model's capabilities, using clear instructions, providing examples, and iterating on the prompt structure.

Related terms:

Prompt, LLM, Few-shot Learning, Zero-shot Learning

Prompt Injection

Safety
Intermediate
Adversarial inputs that try to override system instructions or misuse tools/APIs. Defend with input sanitization, strict tool scopes, allow/deny-lists, and output validation.

Prompt Leakage

Safety
Advanced
Unintended disclosure of hidden instructions (system prompts), policies, or private context. Can occur via prompt injection, tool misuse, or insufficient isolation.

Prompt Library

LLM
Beginner
A collection of pre-defined prompts or templates used for generating text or other content.

Prompt Linting

LLM
Intermediate
Automated or manual checks that flag problems in prompts (missing variables, conflicting instructions, formatting issues, excessive length) before they are used, analogous to linting source code.

Prompt Logging

Safety
Intermediate
Storing prompts, retrieved context, tool calls, and model outputs for debugging and evaluation. Must be paired with redaction, retention controls, and access restrictions.

Prompt Template

LLM
Beginner
A reusable prompt pattern with placeholders for variables; standardizes instructions and improves consistency.
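
A minimal Python sketch using the standard library's string.Template (the template text and variables are hypothetical):

    from string import Template

    SUPPORT_PROMPT = Template(
        "You are a support assistant for $product.\n"
        "Answer the customer question below in a $tone tone.\n\n"
        "Question: $question"
    )

    prompt = SUPPORT_PROMPT.substitute(
        product="Acme Notes", tone="friendly", question="How do I export my data?"
    )
    print(prompt)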

Provenance

Safety
Intermediate
The origin, history, and ownership of data or content, often tracked for transparency and accountability.

Pruning

Performance
Intermediate
Technique to reduce model size by removing unnecessary parameters or connections while maintaining performance.

PyTorch

Tools
Beginner
Popular open-source deep learning framework developed by Facebook, known for dynamic computation graphs and Python integration.

Related terms:

TensorFlow, Deep Learning Framework, Machine Learning

QLoRA

Training
Advanced
A PEFT approach that quantizes the base model and trains low‑rank adapters, enabling resource‑efficient fine‑tuning.

Related terms:

LoRA, Quantization, PEFT

QLoRA (Quantized Low-Rank Adaptation)

Training
Advanced
An extremely memory-efficient fine-tuning method that combines quantization with LoRA, enabling fine-tuning of large models (like 65B parameter models) on consumer GPUs. Reduces memory requirements by up to 10x compared to standard fine-tuning.

Quantization

Performance
Advanced
Reducing numerical precision of model weights/activations (e.g., FP16 → INT8) to lower memory footprint and increase inference speed, often with minimal quality loss (see the sketch below).

Related terms:

Inference, Latency, Throughput
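
A minimal NumPy sketch of symmetric int8 weight quantization (toy weights; real pipelines typically use per-channel scales and calibration):

    import numpy as np

    def quantize_int8(weights):
        # Map float weights onto the signed int8 range [-127, 127] with a single scale factor.
        scale = np.abs(weights).max() / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.array([0.02, -0.5, 0.31, 1.2], dtype=np.float32)
    q, scale = quantize_int8(w)
    print(q, dequantize(q, scale))  # small rounding error relative to the originals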

Query Expansion

Fundamentals
Intermediate
Retrieval technique that adds related terms/synonyms to a query to improve recall, often used in lexical or hybrid search.

Query Rewriting

LLM
Intermediate
Transforming a user query into a better retrieval query (e.g., adding context, disambiguation, or keywords) to improve search relevance in RAG.

RAG

LLM
Intermediate
Alias for Retrieval‑Augmented Generation (RAG), a technique that retrieves relevant context from a knowledge source and injects it into the prompt to improve factuality and grounding.

RAG (Retrieval Augmented Generation)

LLM
Intermediate
A technique that enhances Large Language Models by allowing them to retrieve relevant information from external knowledge sources (like databases or documents) before generating a response. This helps ground the model's output in factual, up-to-date information, reducing hallucinations and improving accuracy.

RAG (Retrieval-Augmented Generation)

LLM
Intermediate
Framework combining information retrieval with text generation - retrieves relevant documents then uses them to generate more accurate responses.

RAG Fusion

LLM
Advanced
A retrieval pattern that issues multiple query variants (e.g., rewritten queries), retrieves results for each, and fuses/merges them (often with rank fusion) to improve recall and robustness.

RAGAS

Evaluation
Advanced
An evaluation framework for Retrieval-Augmented Generation that measures dimensions like faithfulness and answer relevance using LLM-assisted scoring. Useful for scalable RAG quality monitoring.

Random Seed

API
Beginner
A value used to initialize a random number generator, allowing for reproducibility and consistency in randomized processes.

Rate Limiting

API
Beginner
Restrictions on how many API requests you can make within a time period (e.g., 60 requests per minute). Rate limits prevent abuse and ensure fair resource allocation. Exceeding limits typically results in HTTP 429 errors. Important for planning application architecture.

Related terms:

API, API Key, Throughput, Throughput (TPS), Batching (LLM Serving), APIs

RBAC (Role‑Based Access Control)

Safety
Intermediate
Authorization model mapping roles to permissions, reducing ad‑hoc grants. Use least privilege for APIs, webhooks, and admin tools.

Re-ranker

LLM
Advanced
A component that reorders initial search results using a stronger model (often a cross-encoder) for better relevance.

ReAct (Reasoning + Acting)

LLM
Advanced
A prompting framework that combines reasoning (thinking through a problem) with acting (taking actions via tools). The model alternates between reasoning steps and tool use, creating a traceable chain of thought and actions. Improves reliability of AI agents.

Related terms:

Agentic AI, Chain of Thought, Function Calling

React Prompting

LLM
Intermediate
Commonly used as shorthand for ReAct-style prompting, in which the model interleaves reasoning steps with actions (such as tool calls) and uses the resulting observations to refine its answer.

Reader

LLM
Intermediate
A component that reads retrieved passages and extracts or synthesizes an answer. In modern RAG, the generator model often also acts as the reader.

Recall@k

Evaluation
Intermediate
Fraction of relevant items retrieved within the top k results; key metric for retrieval quality.

Related terms:

MRR, NDCG, Reranking

Receiver Operating Characteristic (ROC)

Evaluation
Intermediate
Curve plotting true positive rate against false positive rate to evaluate binary classification models.

Red Teaming (AI)

Safety
Advanced
Structured adversarial testing of AI systems to uncover vulnerabilities (prompt injection, jailbreaks, unsafe tool actions) and improve defenses.

Redaction

Safety
Intermediate
Removing or masking sensitive information (PII, secrets) from logs, prompts, or outputs. Redaction reduces leakage risk but must preserve enough context for debugging.

Reflexion

LLM
Advanced
An agent technique where the model reflects on failures, critiques its behavior, and updates future actions based on feedback (self-improvement loop).

Regression (in ML)

Fundamentals
Intermediate
A supervised machine learning task where the model learns to predict a continuous numerical value. For example, predicting the price of a house based on its features (size, location) or forecasting future sales based on historical data.

Regularization

Training
Intermediate
Techniques to prevent overfitting by adding penalties to model complexity (L1, L2 regularization).

Related terms:

Overfitting vs. Underfitting, L1 Regularization, L2 Regularization

Reinforcement Learning

Training
Advanced
A type of Machine Learning where an AI agent learns by interacting with an environment. It receives rewards for desirable actions and penalties for undesirable ones, gradually optimizing its strategy or 'policy' to maximize cumulative reward. Learn more on Wikipedia.

Reinforcement Learning from Human Feedback (RLHF)

Training
Advanced
A technique used to align AI models, especially LLMs, more closely with human preferences and instructions. It involves collecting human feedback on model outputs and using this feedback to further train or fine-tune the model, often to improve helpfulness and reduce harmful or biased responses.

Reranking

LLM
Advanced
A second‑stage scoring step that re‑orders initial search results using a more powerful model (e.g., cross‑encoder), improving relevance in RAG pipelines.

Residual Networks (ResNets)

Fundamentals
Advanced
Deep neural network architecture using skip connections to ease gradient flow and enable very deep networks.

Retries with Jitter

Performance
Advanced
Retry strategy that randomizes delays to avoid thundering herds and contention during partial outages.
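
A minimal Python sketch of exponential backoff with "full" jitter (the retried call is any function you supply):

    import random
    import time

    def retry_with_jitter(call, max_attempts=5, base=0.5, cap=8.0):
        for attempt in range(max_attempts):
            try:
                return call()
            except Exception:
                if attempt == max_attempts - 1:
                    raise
                # Sleep a random amount between 0 and the exponential backoff ceiling.
                time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))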

Retrieval Augmented Generation (RAG)

LLM
Intermediate
Connects a model to your knowledge base so it can cite and ground answers in your content—improving accuracy and reducing hallucinations. Example: Search Confluence → retrieve pages → generate a cited summary.

Retrieval-Augmented Generation

LLM
Intermediate
See RAG (Retrieval-Augmented Generation) - hybrid approach combining retrieval and generation for more accurate AI responses.

Retriever

LLM
Intermediate
A component in a RAG system that searches a knowledge base (lexical, vector, or hybrid) and returns candidate passages/documents for a query.

Reward Model

Training
Advanced
Component in RLHF that learns to predict human preferences and assigns scores to different AI outputs.

ReWOO

LLM
Advanced
Reasoning Without Observation — an agent pattern that plans tool calls and reasoning steps while minimizing intermediate observations, aiming to reduce error accumulation and cost.

RLAIF (Reinforcement Learning from AI Feedback)

Training
Advanced
A variant of RLHF where feedback/rankings are generated by AI models instead of (or in addition to) humans. It can scale preference data collection but needs robust safety and bias controls.

RLHF

Training
Advanced
See Reinforcement Learning from Human Feedback - training approach using human feedback to align AI with human preferences.

RLHF (Reinforcement Learning from Human Feedback)

Training
Advanced
A training technique where human evaluators rank or rate model outputs, and the model is fine-tuned using reinforcement learning to prefer outputs that humans rate highly. This is how ChatGPT and Claude were trained to be helpful, harmless, and honest.

RMSNorm

Fundamentals
Advanced
Root Mean Square Normalization — a variant of LayerNorm that normalizes by RMS without mean-centering. Used in several modern LLM architectures for efficiency.

Robustness

Evaluation
Advanced
A system’s ability to maintain performance under distribution shifts, adversarial inputs, noise, and edge cases.

ROC-AUC Score

Evaluation
Intermediate
Area under the ROC curve, measuring a model's ability to distinguish between classes.

ROCm

Performance
Intermediate
AMD’s open software stack for GPU computing. Enables training and inference on AMD GPUs.

Rollback

Deployment
Intermediate
Reverting to a previous stable version when a release degrades SLOs or causes incidents.

RoPE (Rotary Positional Embeddings)

Fundamentals
Advanced
A positional method that rotates queries/keys in complex space to encode relative positions, aiding long context.

Related terms:

Positional Encoding, ALiBi, Long‑context Models

ROUGE Score

Evaluation
Advanced
Recall-Oriented Understudy for Gisting Evaluation - metrics for evaluating text summarization by measuring overlap between generated and reference summaries. Includes ROUGE-N (n-grams), ROUGE-L (longest common subsequence), and ROUGE-W (weighted sequences).

Related terms:

Evaluation, BLEU Score, Summarization

RPO (Recovery Point Objective)

Deployment
Advanced
Maximum acceptable data loss measured in time (how far back you can restore). Influences backup cadence and replication.

RTO (Recovery Time Objective)

Deployment
Advanced
Maximum acceptable downtime after an incident before service must be restored. Plan via DR strategies, runbooks, and canary rollouts.

Runbook

Deployment
Intermediate
Step‑by‑step operational guide to diagnose and remediate incidents. Include rollback steps and verification checks.

safetensors

Deployment
Intermediate
A safe and fast tensor serialization format (commonly used with Hugging Face) designed to avoid arbitrary code execution risks present in some pickle-based formats.

Sandboxing

Safety
Advanced
Isolating untrusted code, tools, or browsing in restricted environments to limit damage and prevent access to secrets or sensitive resources.

SBOM (Software Bill of Materials)

Safety
Intermediate
A machine-readable inventory of components and dependencies used to build software (and sometimes ML systems). SBOMs help assess vulnerabilities and supply-chain risk.

Related terms:

Supply Chain Security, Dependencies, Vulnerabilities

ScaNN

Fundamentals
Advanced
A Google ANN library for efficient vector similarity search with quantization and reordering.

SDK (Software Development Kit)

API
Beginner
A collection of software development tools, libraries, and documentation provided by hardware or software vendors to help developers build applications for a specific platform or service (e.g., an AI tool might offer an SDK for easier API integration).

Secrets Management

Safety
Intermediate
Practices and tooling for storing and accessing secrets (API keys, tokens, credentials) securely. Prevents leaks to logs, prompts, client-side code, or third parties.

Self-Attention

LLM
Advanced
Attention mechanism that computes relationships between elements within the same sequence, enabling contextual understanding.

Self‑Consistency

LLM
Advanced
A reasoning technique where the model samples multiple solution paths and selects the most consistent answer (e.g., via majority vote). Often improves accuracy on reasoning tasks.

Semantic Chunking

LLM
Intermediate
Splitting text into chunks along semantic boundaries (sentences, sections, topic shifts) rather than fixed character counts, so each chunk stays coherent for retrieval or question answering.

Semantic Memory (Agents)

LLM
Intermediate
Long-term factual knowledge a system stores (documents, notes, structured facts). Often implemented via vector databases and retrieval.

Semantic Role Labeling

LLM
Advanced
NLP task that identifies roles played by entities in relation to predicates (who did what to whom, when, where, etc.).

Related terms:

NLP, Dependency Parsing, Information Extraction

SentencePiece

Fundamentals
Advanced
A tokenization library and algorithm family (often BPE or unigram) that operates directly on raw text (including whitespace), commonly used in many open-weight LLMs.

Service Credits

Business
Intermediate
Remedies offered under an SLA when uptime or SLOs are missed, typically as bill credits. Review exclusions, calculation, and cap.

SLA (Service Level Agreement)

Business
Intermediate
Contractual guarantees on uptime, response times, and remedies. Review scope, exclusions, and credit calculations.

SLI (Service Level Indicator)

Performance
Intermediate
Measured metric of service quality (e.g., availability, latency percentiles, TTFB). Drives SLOs and error budgets.

SLO (Service Level Objective)

Performance
Intermediate
Target reliability/latency goals for a service. Paired with SLIs and enforced via error budgets and incident response.

SOC 2

Safety
Intermediate
An independent audit framework assessing controls for Security, Availability, Processing Integrity, Confidentiality, and Privacy.

Related terms:

Safety (AI Safety), Governance, System Card

Source Attribution

Fundamentals
Beginner
The practice of acknowledging and crediting the original sources of information, data, or content.

Sparse Model

Performance
Advanced
A neural network where many weights are zero or inactive, reducing computational requirements. Mixture of Experts (MoE) is a type of sparse model where only some experts activate for each input. Enables larger model capacity with manageable inference costs.

Related terms:

Mixture of Experts, Quantization, Inference Cost

Sparse Models

Performance
Advanced
Neural networks with mostly zero weights, reducing computation and memory requirements while maintaining performance.

Related terms:

Pruning, Model Compression, Efficiency

Speculative Decoding

Performance
Advanced
A decoding technique that uses a small, fast "draft" model to propose tokens which a larger "target" model then verifies in batches. When proposals are accepted, it reduces end-to-end latency while preserving the target model’s output distribution.

SRE (Site Reliability Engineering)

Deployment
Advanced
Engineering discipline combining software and systems thinking to achieve reliable, scalable services using SLOs, error budgets, automation, and blameless postmortems.

SSE

API
Intermediate
Alias for Server‑Sent Events streaming.

Related terms:

Server‑Sent Events (SSE), Streaming (Token Streaming)

Stable Diffusion

Tools
Intermediate
A widely used text-to-image model family based on latent diffusion. It generates images by denoising in a latent space conditioned on text embeddings, and is popular for customization via fine-tuning and control modules.

Stochastic Gradient Descent (SGD)

Fundamentals
Intermediate
Optimization algorithm that updates parameters using gradients from single training examples or small batches.

Related terms:

Gradient Descent, Optimization, Training (AI Model)

Stop Sequences

API
Intermediate
Decoding control where generation halts when the model outputs any of the specified stop strings. Used to prevent the model from continuing into unwanted sections.

Related terms:

Decoding, Structured Output, Temperature (Sampling)

Stratified Sampling

Training
Intermediate
Sampling technique that maintains class proportions from the original dataset in training/validation splits.

Related terms:

Data Set, Imbalanced Data, Cross-Validation

Streaming Response

API
Intermediate
An API response mode where the model's output is sent incrementally as it's generated, rather than waiting for the complete response. This provides a better user experience by showing progress in real-time, similar to how ChatGPT displays responses word-by-word.

Related terms:

API, Latency, Inference

STT (Speech‑to‑Text)

Tools
Beginner
Alias for ASR — converting speech to text.

Supervised Fine-Tuning (SFT)

Training
Intermediate
Initial training phase in RLHF where the model learns from labeled examples before preference learning.

Supervised Learning

Fundamentals
Intermediate
A type of Machine Learning where the model learns from labeled data. This means each input data point in the training set is paired with a known correct output or 'label,' allowing the model to learn the mapping between inputs and outputs. Learn more on Wikipedia.

Supply Chain Security

Safety
Advanced
Practices that reduce risks from third-party code, model artifacts, and dependencies (e.g., poisoned packages, compromised weights). Includes provenance, signature verification, and controlled publishing.

System Message

LLM
Beginner
Alias for the system prompt that sets behavior, tone, and constraints for the model.

System Prompt

LLM
Beginner
A hidden instruction that sets the model’s behavior, tone, and constraints. Keeping it stable improves consistency and makes changes auditable.

TCO (Total Cost of Ownership)

Business
Intermediate
True cost over time including subscription, infrastructure, integration, operations, and support.

Temperature (Sampling)

API
Beginner
A decoding parameter that controls randomness. Higher values increase diversity and creativity; lower values improve determinism and repeatability (see the sketch below).

Related terms:

Top‑p, Decoding, Hallucination
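
A minimal Python sketch showing how temperature rescales next-token probabilities (the logits are illustrative):

    import math

    def softmax_with_temperature(logits, temperature=1.0):
        # Lower temperature sharpens the distribution; higher temperature flattens it.
        scaled = [logit / temperature for logit in logits]
        peak = max(scaled)
        exps = [math.exp(s - peak) for s in scaled]
        total = sum(exps)
        return [e / total for e in exps]

    logits = [2.0, 1.0, 0.1]
    print(softmax_with_temperature(logits, temperature=0.5))  # more deterministic
    print(softmax_with_temperature(logits, temperature=1.5))  # more diverse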

Tensor Cores

Performance
Advanced
Specialized GPU units that accelerate mixed-precision matrix operations (e.g., FP16/BF16) used heavily in deep learning.

Tensor Parallelism

Deployment
Advanced
Splitting a single model’s layers/weights across multiple devices to fit large models and increase compute throughput.

TensorFlow

Tools
Beginner
Google's open-source machine learning framework, known for production deployment and graph-based computation.

Related terms:

PyTorch, Deep Learning Framework, Machine Learning

TensorRT

Performance
Advanced
NVIDIA’s inference optimization SDK that compiles and optimizes neural networks for fast GPU execution.

Text-to-Image

Tools
Beginner
A type of generative AI that creates images from textual descriptions (prompts).

Text-to-Video

Tools
Intermediate
A type of generative AI that creates video clips from textual descriptions or image inputs.

Throughput (TPS)

Performance
Intermediate
The number of requests or tokens processed per second. Critical for scaling and cost efficiency in production systems.

Related terms:

Latency, Batching

Token

Fundamentals
Beginner
The basic unit of text that an AI model processes. A token can be a word, part of a word, or even a character. For example, 'ChatGPT' might be split into 'Chat' and 'GPT' as two tokens. Token count affects API costs, context limits, and processing time.

Token (in LLMs)

LLM
Beginner
In Large Language Models, text is often broken down into smaller units called tokens for processing. Tokens can be whole words, parts of words (subwords), or even individual characters and punctuation. Model context windows and pricing are often measured in tokens.

Token Pricing

Business
Beginner
The cost structure for using AI APIs, typically charged per 1,000 tokens (1K tokens). Pricing often differs between input tokens (prompt) and output tokens (completion). Understanding token pricing is crucial for budgeting AI applications.
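
A minimal Python sketch of a per-request cost estimate (the prices are hypothetical; check your provider's current rates):

    PRICE_PER_1K_INPUT = 0.0005   # USD per 1,000 input (prompt) tokens - hypothetical
    PRICE_PER_1K_OUTPUT = 0.0015  # USD per 1,000 output (completion) tokens - hypothetical

    def request_cost(input_tokens, output_tokens):
        return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
             + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

    # A 1,200-token prompt with a 400-token completion:
    print(round(request_cost(1200, 400), 6))  # 0.0012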

Tokenization

Fundamentals
Intermediate
The process of splitting text into tokens (subwords/characters). Tokenization affects context limits, latency, and cost calculations.

Tokenizer

Fundamentals
Intermediate
A component that converts text into tokens (IDs) for a model and back into text. Tokenizer choice (BPE, SentencePiece, etc.) affects token counts, costs, and how well a model handles languages, whitespace, and rare words.

Tool Permissions

Safety
Advanced
Rules that define which tools an agent can call, with what arguments, and under which conditions. Strong tool permissions prevent unsafe actions and limit blast radius.

Tool Poisoning

Safety
Advanced
An attack where a tool’s output, documentation, or retrieved content is manipulated to steer an agent into unsafe actions (e.g., exfiltrating secrets, running harmful commands).

Top-k Sampling

API
Intermediate
Decoding strategy that considers only the k most likely tokens, balancing diversity and quality in text generation.

Related terms:

Temperature, Nucleus Sampling, Text Generation

Top‑k

API
Intermediate
Sampling from the top k most probable tokens at each step to control randomness and quality.

Top‑k Retrieval

LLM
Intermediate
Selecting the k most similar passages to a query based on vector similarity or hybrid scoring.

Top‑p (Nucleus Sampling)

API
Intermediate
A decoding method that samples from the smallest set of tokens whose cumulative probability is at least p, balancing quality and diversity (see the sketch below).

Related terms:

Temperature, Decoding
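
A minimal Python sketch of selecting the nucleus (the token probabilities are illustrative):

    def top_p_filter(token_probs, p=0.9):
        # Keep the smallest set of most-likely tokens whose cumulative probability reaches p.
        ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
        kept, cumulative = [], 0.0
        for token, prob in ranked:
            kept.append((token, prob))
            cumulative += prob
            if cumulative >= p:
                break
        total = sum(prob for _, prob in kept)
        return {token: prob / total for token, prob in kept}  # renormalized nucleus

    print(top_p_filter({"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}, p=0.9))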

TPU (Tensor Processing Unit)

Performance
Intermediate
Google’s accelerator for matrix-heavy ML workloads. TPUs are commonly used for training and serving large models in Google’s ecosystem.

Training (AI Model)

Training
Intermediate
The process of exposing an AI model to a dataset, allowing it to learn patterns, relationships, and features from the data by adjusting its internal parameters (weights). The goal is to enable the model to perform a specific task accurately on new, unseen data.

Transfer Learning

Training
Advanced
A machine learning method where a model developed for a specific task is reused as the starting point for a model on a second, different but related task. This is the core principle behind using pre-trained foundation models and then fine-tuning them, saving significant time and computational resources.

Transformer Architecture

LLM
Advanced
A neural network architecture, introduced in the paper 'Attention Is All You Need,' that relies heavily on 'self-attention' mechanisms to process sequential data like text. It's the foundation for most modern Large Language Models (LLMs) due to its effectiveness in capturing long-range dependencies and contextual relationships.

Transformers

Fundamentals
Intermediate
Plural alias for the Transformer architecture used in modern LLMs. Transformers rely on attention mechanisms to model long‑range dependencies in sequences.

Tree of Thoughts

LLM
Advanced
Advanced prompting technique that explores multiple reasoning paths simultaneously, like a decision tree for complex problem-solving.

Tree of Thoughts (ToT)

LLM
Advanced
An advanced prompting technique where the model explores multiple reasoning paths simultaneously, like a tree search. Each branch represents a different approach to solving the problem. The model evaluates and selects the most promising paths, enabling more complex problem-solving.

Related terms:

Chain of Thought, Reasoning, Prompt Engineering

TTFB (Time to First Byte)

Performance
Intermediate
Latency from request start until the first byte is received. Impacted by network, cold starts, and model prefill.

Related terms:

Latency (AI Systems), Prefill vs Decode, Streaming (Token Streaming)

TTI (Time to Interactive)

Performance
Advanced
Time until a page or app becomes reliably interactive for users. Optimize assets, hydration, and streaming.

Related terms:

TTFB (Time to First Byte), Streaming (Token Streaming), Observability

TTS (Text‑to‑Speech)

Tools
Beginner
Synthesis of natural‑sounding speech from text.

Related terms:

ASR, Speech Synthesis, Multimodal AI

U-Net

Fundamentals
Advanced
A neural network architecture commonly used as the denoiser in diffusion models. It predicts noise (or related targets) at each step, conditioned on text or other signals.

Unsupervised Learning

Fundamentals
Intermediate
A type of Machine Learning where the model learns from unlabeled data, identifying hidden patterns, structures, or relationships within the data without predefined correct answers (e.g., clustering similar customers, anomaly detection). Learn more on Wikipedia.

VAD

Fundamentals
Intermediate
Alias for Voice Activity Detection.

VAD (Voice Activity Detection)

Fundamentals
Intermediate
Detecting speech segments in audio streams to segment or trigger recognition.

VAE (Variational Autoencoder)

Fundamentals
Advanced
A generative model that learns a probabilistic latent representation. In latent diffusion pipelines, a VAE encodes images into a latent space and decodes latents back into pixels.

Vanishing Gradient Problem

Fundamentals
Advanced
Issue in deep networks where gradients become extremely small, preventing effective weight updates in early layers.

Vector Database

LLM
Intermediate
A specialized database designed to efficiently store and query high-dimensional vectors, which are numerical representations of data like text, images, or audio (known as embeddings). They are essential for applications like semantic search, recommendation systems, and Retrieval Augmented Generation (RAG).
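
A minimal NumPy sketch of the core operation, cosine-similarity search over stored embeddings (the vectors are toy values; real systems use embedding models and ANN indexes):

    import numpy as np

    def cosine_similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    index = {  # document ID -> embedding (toy 3-dimensional vectors)
        "doc_refunds": np.array([0.9, 0.1, 0.0]),
        "doc_shipping": np.array([0.2, 0.8, 0.1]),
    }
    query = np.array([0.85, 0.2, 0.05])
    best_id = max(index, key=lambda doc_id: cosine_similarity(query, index[doc_id]))
    print(best_id)  # nearest stored document for the query embedding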

Vector Store

Fundamentals
Intermediate
Alias used in practice for a vector database used to store embeddings and run vector similarity search.

Vendor Lock‑in

Business
Intermediate
High switching costs due to proprietary APIs, models, or data formats. Mitigate with standards and abstraction layers.

Vision Transformer

Fundamentals
Advanced
Transformer architecture adapted for computer vision tasks by treating images as sequences of patches.

Vision-Language Model (VLM)

LLM
Intermediate
AI models that can process and understand both images and text, enabling tasks like image captioning, visual question answering, and multimodal reasoning. Examples include GPT-4V (Vision), Claude 3, and Gemini Pro Vision.

Vision-Language Models

LLM
Intermediate
AI models that understand and generate both visual content (images) and textual content (language).

vLLM

Performance
Advanced
An open-source LLM inference engine optimized for high throughput and efficient KV cache management. Commonly used for serving open-weight models.

VQA (Visual Question Answering)

Fundamentals
Advanced
Answering natural‑language questions about images or video.

VRAM

Performance
Intermediate
Video RAM on a GPU. VRAM capacity is a key constraint for model size, batch size, KV cache, and context length during inference.

Warm Path vs Cold Path

Performance
Advanced
Warm path serves cached/ready resources for low latency; cold path performs full initialization. Architect for warm hits and graceful cold behavior.

Related terms:

Cold Start, Caching, Streaming (Token Streaming)

Watermarking (AI Outputs)

Safety
Intermediate
Techniques that embed imperceptible markers in AI-generated text, images, or audio so the content can later be identified as machine-generated and its provenance verified.

Related terms:

AI Watermarking, Content Authenticity, Deepfake Detection

Web Browsing

API
Intermediate
Navigating and reading web pages programmatically, often by an AI agent or automation tool that fetches up-to-date information during a task (search, scraping, or tool-augmented browsing).

Related terms:

Webhooks, APIs, Streaming (Token Streaming)

Webhooks

API
Advanced
HTTP callbacks that deliver events to subscribers. Validate signatures, implement retries idempotently, and design for back‑pressure.

Weight Decay

Training
Intermediate
A regularization technique (often L2-like) that penalizes large weights during training. In AdamW, weight decay is applied decoupled from the gradient update.

Word Embeddings

Fundamentals
Intermediate
Dense vector representations of words that capture semantic meaning and relationships between words.

Related terms:

Embeddings, NLP, Vector Space

Working Memory (Agents)

LLM
Intermediate
Short-lived context used for the current task (current goals, temporary variables). Often implemented as in-context state within the prompt.

Zero-shot Learning/Prompting

LLM
Intermediate
An AI model's ability to perform a task it hasn't been explicitly trained on, by leveraging its general knowledge and understanding the instructions provided in the prompt, without any specific examples of that task.

About this AI Glossary

A world‑class reference designed for engineers, product teams, and decision‑makers. Scan crisp definitions, follow related concepts, and jump straight to the details that matter—like LLM, RAG, and Transformers.

How to use it

  • Search fast: Find terms and synonyms instantly via the search box.
  • Connect ideas: Use related terms to traverse concepts and build intuition.
  • Go deeper: Follow references for primary sources and best-practice guides.

Master the Language, Master the Tech

Now that you're familiar with the key terms, apply your knowledge by exploring our learning paths and discovering the tools that use this technology.

Explore All Learning Paths