AI Glossary

Enterprise AI Terms Defined.

Plain-English definitions of the terms that matter most in modern enterprise AI — from LLM orchestration and RAG to guardrails and agentic systems.

A

AI Agent

An autonomous software system powered by a large language model that can perceive inputs, reason over them, and execute multi-step actions — calling APIs, querying databases, writing code, or triggering workflows — to complete a goal without step-by-step human instruction.

AI Governance

The policies, controls, and monitoring systems that define what an AI system is permitted to do, who can access it, and how its decisions are audited. Enterprise AI governance covers role-based access, execution policies, audit trails, and environment separation across development and production.

AI Guardrails

Real-time constraints applied to AI model outputs to enforce safe, compliant, and on-brand behaviour. Guardrails scan outputs for PII leakage, toxicity, hallucinations, off-topic responses, and policy violations before content reaches end users, intercepting unsafe responses without changing the underlying model.

AI Hallucination

When a large language model generates factually incorrect, fabricated, or unsupported content stated with apparent confidence. Hallucinations occur because LLMs predict plausible token sequences rather than retrieving verified facts. Enterprise systems mitigate hallucinations through retrieval-augmented generation (RAG), grounding checks, and output guardrails.

AI Observability

The practice of capturing, storing, and analysing every interaction in an AI system — prompts sent, model versions used, tool calls made, latencies, costs, and outputs generated — to enable debugging, auditing, cost management, and continuous quality improvement at scale.

Agentic AI

AI systems designed to operate with extended autonomy over multi-step tasks, making decisions and taking actions across tool calls, memory retrieval, and sub-agent coordination to complete complex workflows. Agentic AI goes beyond single-turn question answering into sustained, goal-directed execution.

Audit Trail

A tamper-evident, time-stamped log of every input, model call, tool invocation, and output in an AI system — associated with a user, session, and policy version. Audit trails are required for enterprise compliance, incident investigation, and demonstrating responsible AI use to regulators.
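One common way to make a log tamper-evident is to chain entries by hash, so that editing any earlier entry breaks every later link. The sketch below is illustrative only: the field names and the `append_entry` and `verify_chain` helpers are assumptions, not part of any particular product.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry body deterministically (keys sorted before hashing)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, user, session, policy_version, event):
    """Append an entry whose hash covers its content and the previous entry's hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "session": session,
        "policy_version": policy_version,
        "event": event,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    log.append({**body, "hash": _digest(body)})
    return log[-1]


def verify_chain(log):
    """Return True if every entry's hash matches its content and links to the previous entry."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits to stored entries; it does not by itself stop someone from truncating the end of the log, which is why production systems usually also anchor the latest hash somewhere external.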

C

Chain of Thought

A prompting technique where a large language model is instructed to reason through a problem step by step before producing a final answer. Chain-of-thought prompting can significantly improve performance on multi-step reasoning, arithmetic, and logic tasks by making intermediate reasoning explicit.

Context Window

The maximum number of tokens (sub-word units of text) an LLM can process in a single interaction, spanning the system prompt, conversation history, retrieved documents, and the model's response. Larger context windows accommodate longer documents and richer conversational memory, but increase inference cost and latency.
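Because everything shares one budget, applications often trim older conversation turns so that the prompt plus the reserved response still fits. A minimal sketch, assuming token counts per turn are already known; `trim_history` and its parameters are hypothetical names.

```python
def trim_history(history_tokens, fixed_tokens, reserved_output, context_window):
    """Drop the oldest turns until the remaining history fits the budget.

    history_tokens:  token count of each conversation turn, oldest first
    fixed_tokens:    system prompt plus retrieved documents
    reserved_output: tokens set aside for the model's response
    """
    budget = context_window - fixed_tokens - reserved_output
    kept = list(history_tokens)
    while kept and sum(kept) > budget:
        kept.pop(0)  # discard the oldest turn first
    return kept


# Three turns, a 1,000-token window, 500 fixed tokens, 200 reserved for output.
kept_turns = trim_history([300, 200, 100], fixed_tokens=500,
                          reserved_output=200, context_window=1000)
```

Dropping the oldest turns is only one policy; summarising old turns is a common alternative.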

E

Embeddings

Numerical vector representations of text (or other data) that capture semantic meaning. Similar concepts are placed close together in vector space, enabling semantic search, clustering, and retrieval. Embeddings are the foundation of most RAG systems, where relevant documents are retrieved by similarity before being passed to an LLM.
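"Close together" is usually measured with cosine similarity. The sketch below uses hand-made three-dimensional vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example (not produced by a real embedding model).
invoice = [0.9, 0.1, 0.0]
bill = [0.8, 0.2, 0.1]
holiday = [0.0, 0.2, 0.9]

related = cosine_similarity(invoice, bill)       # semantically close pair
unrelated = cosine_similarity(invoice, holiday)  # semantically distant pair
```

With these toy values, `related` comes out far higher than `unrelated`, which is the property semantic search relies on.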

Enterprise AI

AI systems designed and deployed for large-organisation use — characterised by requirements for governance, security, scalability, auditability, role-based access, integration with existing enterprise systems (ERP, CRM, ITSM), and compliance with industry regulations. Enterprise AI differs from consumer AI in its non-negotiable reliability and control requirements.

F

Few-shot Learning

A prompting approach where a small number of input-output examples are included in the prompt to guide an LLM's response format or reasoning style. Few-shot learning allows models to perform new tasks without retraining, leveraging in-context examples to steer output structure and quality.
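In practice a few-shot prompt is just a string assembled from an instruction, the examples, and the new input. A minimal sketch; the `Input:`/`Output:` layout and the `build_few_shot_prompt` helper are illustrative assumptions, not a required format.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # Leave the final output blank for the model to complete.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each support ticket as positive or negative.",
    [("The new dashboard is great.", "positive"),
     ("I have been waiting three days for a reply.", "negative")],
    "Login keeps failing after the update.",
)
```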

Fine-tuning

Adapting a pre-trained foundation model to a specific domain, task, or style by continuing its training on a curated dataset. Fine-tuning adjusts model weights to improve performance on targeted use cases, such as legal document drafting or medical coding, while aiming to preserve the model's general capabilities.

Foundation Model

A large-scale AI model trained on broad datasets that serves as a general-purpose base for downstream tasks. Foundation models (GPT-4, Claude, Gemini, Llama) are adapted through prompting, fine-tuning, or RAG rather than trained from scratch for each application. They form the inference core of modern enterprise AI systems.

G

Grounding

The process of anchoring an LLM's responses to verified, external facts or documents rather than relying solely on the model's parametric knowledge. Grounding reduces hallucination by providing the model with retrieved source material and instructing it to base its answer on that material.

L

LLM (Large Language Model)

A deep neural network trained on massive text corpora to understand and generate human language. LLMs encode statistical patterns of language in billions of parameters, enabling them to summarise, translate, reason, code, and converse. Examples include GPT-4, Claude 3, Gemini 1.5, Llama 3, and Mistral Large.

LLM Orchestration

The coordination layer that routes requests across multiple LLMs, manages prompt construction, chains model calls, handles tool use, and aggregates outputs within a workflow. LLM orchestration decouples application logic from specific model providers, enabling model swapping, fallback routing, cost optimisation, and multi-step agent pipelines.
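Fallback routing, one of the capabilities listed above, can be sketched as trying providers in order until one succeeds. The provider callables here are stand-in stubs, not real client libraries, and `call_with_fallback` is a hypothetical helper.

```python
def call_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return (name, output) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real model clients (assumptions for this example).
def flaky_provider(prompt):
    raise TimeoutError("timed out")


def stable_provider(prompt):
    return prompt.upper()


used, output = call_with_fallback(
    "hi", [("primary", flaky_provider), ("backup", stable_provider)]
)
```

Because the application only sees `call_with_fallback`, swapping or reordering providers does not touch application logic.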

M

Model Routing

Dynamically directing AI requests to different models based on task type, cost threshold, latency requirement, or capability match. A routing layer might send simple summarisation tasks to a fast, cheap model and complex reasoning tasks to a more capable model — optimising cost and quality across a request mix.
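A routing rule can be as simple as a function from request attributes to a model name. In this sketch the model names, the task categories, and the 2,000-token threshold are all placeholders chosen for illustration.

```python
SIMPLE_TASKS = {"summarisation", "classification"}  # hypothetical task categories


def route(task_type, prompt_tokens, max_cheap_tokens=2000):
    """Pick a model tier from the task type and prompt size."""
    if task_type in SIMPLE_TASKS and prompt_tokens <= max_cheap_tokens:
        return "fast-cheap-model"  # placeholder name for a low-cost model
    return "capable-model"         # placeholder name for a stronger model


choice_small_summary = route("summarisation", 500)
choice_reasoning = route("reasoning", 500)
choice_large_summary = route("summarisation", 5000)
```

Production routers often add latency targets and compliance constraints to the same decision.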

Multi-Agent System

An architecture where multiple specialised AI agents collaborate on a complex task — each agent handling a subset of the work (research, writing, review, execution) and passing outputs to the next. Multi-agent systems increase parallelism and specialisation but require coordination, conflict resolution, and governance at the system level.

Multi-LLM

An architecture that integrates multiple large language models within a single platform or workflow, allowing different models to be invoked for different tasks based on cost, capability, or compliance requirements. Multi-LLM platforms give enterprises model flexibility and vendor independence rather than lock-in to a single provider.

P

PII Protection

Controls that detect and redact Personally Identifiable Information — names, email addresses, phone numbers, financial data, health records — from prompts sent to LLMs and from AI-generated outputs. PII protection is a mandatory guardrail in regulated industries to prevent sensitive data from being transmitted to external model APIs or logged in audit systems.
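For structured identifiers such as email addresses and phone numbers, redaction can be sketched with regular expressions applied before a prompt leaves the organisation. The patterns below are deliberately simple assumptions; names and health records generally need entity-recognition models rather than regexes.

```python
import re

# Simplified patterns for illustration; real deployments use broader, tested rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


safe_prompt = redact("Contact jane.doe@example.com or +44 20 7946 0958 today")
```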

Prompt Engineering

The practice of designing and optimising natural language instructions given to an LLM to reliably produce desired outputs. Effective prompt engineering defines the model's persona, task scope, output format, and constraints. In enterprise deployments, system prompts encode governance rules, brand voice, and access boundaries for every interaction.

R

RAG (Retrieval-Augmented Generation)

An architecture that combines real-time document retrieval with LLM generation. When a query arrives, relevant documents are fetched, typically from a vector database, and injected into the model's context, grounding the response in current, organisation-specific knowledge rather than the model's training data alone. RAG reduces hallucination and enables AI to answer questions about private or up-to-date information.
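The retrieve-then-inject flow can be sketched in a few lines. This example assumes documents have already been embedded; the two-dimensional vectors, the in-memory `store`, and the prompt wording are invented for illustration, and the final LLM call is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vec, store, k=2):
    """Return the k documents whose vectors are most similar to the query vector."""
    ranked = sorted(store, key=lambda doc: cosine(query_vec, doc["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, docs):
    """Inject the retrieved text into the prompt that would be sent to the model."""
    context = "\n".join(f"- {doc['text']}" for doc in docs)
    return (
        "Answer using only the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Toy store with made-up vectors; a real system would use an embedding model.
store = [
    {"text": "Refunds are processed within 14 days.", "vector": [0.9, 0.1]},
    {"text": "The office is closed on public holidays.", "vector": [0.1, 0.9]},
    {"text": "Refund requests need an order number.", "vector": [0.8, 0.3]},
]
top_docs = retrieve([1.0, 0.0], store, k=2)  # query vector is also invented
rag_prompt = build_prompt("How long do refunds take?", top_docs)
```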

RLHF (Reinforcement Learning from Human Feedback)

A training technique used to align LLM outputs with human preferences. Human annotators rate model responses, and those preferences are used to train a reward model that guides further fine-tuning via reinforcement learning. RLHF is a key technique behind the instruction-following and safety behaviours of modern models like ChatGPT and Claude.

T

Token

The basic unit of text processed by an LLM — roughly 0.75 words or 4 characters in English. LLMs read and generate text token by token. Tokens determine cost (most model APIs charge per token), context window limits, and generation speed. Understanding tokenisation is essential for prompt budgeting in enterprise deployments.
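The 4-characters-per-token rule of thumb is enough for rough prompt budgeting. In the sketch below the per-1,000-token prices are made-up numbers, and exact counts would require the tokeniser of the specific model.

```python
def estimate_tokens(text):
    """Rough token estimate for English text (about 4 characters per token)."""
    return max(1, round(len(text) / 4))


def estimate_cost(prompt_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one call when input and output tokens are priced per 1,000 tokens."""
    return (prompt_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


# Hypothetical prices: 0.01 per 1k input tokens, 0.03 per 1k output tokens.
call_cost = estimate_cost(2000, 500, price_in_per_1k=0.01, price_out_per_1k=0.03)
```

Here the 2,000 input tokens contribute 0.02 and the 500 output tokens contribute 0.015, so output can dominate cost even when it is much shorter than the prompt.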

V

Vector Database

A database optimised for storing and querying high-dimensional embedding vectors. Vector databases enable fast approximate nearest-neighbour search, finding the documents most semantically similar to a query, typically in milliseconds. Examples include Pinecone, Weaviate, Qdrant, and pgvector (a PostgreSQL extension). They are the retrieval backbone of most RAG systems.

Z

Zero-shot Learning

Asking an LLM to perform a task without showing it any examples, relying entirely on the model's pre-trained knowledge and the task description in the prompt. Zero-shot prompting is the simplest approach; for complex tasks, adding few-shot examples or chain-of-thought instructions often improves results.