Deconstructing AI
A No-Nonsense Guide to the Terminology Explosion
Welcome. If you feel overwhelmed by the daily avalanche of AI buzzwords—Agents, RAG, MCP, Function Calling—you are in the right place. Today, we are going to strip away the marketing hype and look at the "undergarments" of these intimidating concepts.
You will discover that many of these "revolutionary" ideas are often just "old wine in new bottles." Ultimately, an "Agent" is often just the parts of a workflow that don't require intelligence, wrapped around a model that does.
Clear your mind, forget what you think you know, and let's rebuild the AI landscape from first principles.
Large Language Model (LLM)
The chaos begins here. Decades ago, language models were simple statistical predictors of the next word. However, as researchers increased the parameter size (the number of variables in the model), a critical threshold was crossed and intelligence seemed to "emerge." To distinguish these powerful giants from their primitive ancestors, we added "Large" to the name. At its core, though, even an LLM is still an autocomplete engine: given a sequence of tokens, it repeatedly predicts the most probable next token.
How LLMs Predict the Next Token
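Next-token prediction can be illustrated with a toy bigram model. This is a deliberately crude sketch, nothing like a real transformer, but the principle is the same: learn which token tends to follow which, then repeatedly pick the most probable continuation.

```python
from collections import Counter, defaultdict

# Train a toy bigram "language model" on a tiny corpus.
corpus = "the cat sat on the mat the cat sat on the rug".split()
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the statistically most likely next token (greedy decoding)."""
    return following[token].most_common(1)[0][0]

# Generate by feeding each prediction back in as the new input.
token, sentence = "the", ["the"]
for _ in range(3):
    token = predict_next(token)
    sentence.append(token)
print(" ".join(sentence))   # → "the cat sat on"
```

A real LLM does exactly this, except the "table" is replaced by billions of learned parameters conditioning on the entire preceding context, not just one word.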
Prompt & Context
To make this "autocomplete engine" useful, we assign roles. Imagine you are the Boss, and the LLM is your employee.
- Prompt: This is the specific instruction or query you give the employee.
- Context: This is the background information the employee needs to know to answer the prompt.
For example, instead of just saying "Write an email" (Prompt), you might provide: "You are a polite customer support agent. The customer is angry about a late delivery" (Context). By explicitly separating instructions from background data, you guide the probability engine to a specific, useful outcome.
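In practice, most chat-style APIs encode this separation as distinct messages: the context travels as a "system" message, the prompt as a "user" message. The sketch below uses the common role/content dictionary shape; it is vendor-neutral, so adapt the field names to your provider.

```python
# Context rides in a system message; the prompt is the user message.
messages = [
    {"role": "system",
     "content": "You are a polite customer support agent. "
                "The customer is angry about a late delivery."},   # Context
    {"role": "user",
     "content": "Write an email apologizing for the delay."},      # Prompt
]

def render(messages: list[dict]) -> str:
    """Flatten the structured messages into the single token sequence
    the model actually sees — it is all one context window underneath."""
    return "\n".join(f"[{m['role']}] {m['content']}" for m in messages)

print(render(messages))
```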
Memory
Here is the catch: The basic LLM has no brain to store your conversation. It answers one question and immediately forgets you exist.
To create the illusion of a continuous conversation, engineers developed a trick. Before you ask your second question, the system secretly pastes your first question and the model's first answer back into the Context.
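That paste-back trick can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for any real completion API; the point is that the model is stateless, so the wrapper replays the full history on every turn.

```python
# Conversational "memory": replay everything so far on each call.
def call_llm(messages: list[dict]) -> str:
    return f"(reply to {len(messages)} messages)"   # placeholder model

history: list[dict] = []

def chat(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = call_llm(history)              # the model sees EVERYTHING so far
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Ada.")
chat("What is my name?")    # only answerable because turn 1 was pasted back in
print(len(history))         # → 4 (two turns, each user + assistant)
```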
Agent
Eventually, you realize the LLM has a flaw: it is trapped in a box. It cannot access the internet, check the weather, or query your database. It hallucinates answers because it relies only on its training data.
To fix this, we wrap the LLM in a program. If the LLM doesn't know an answer, we tell it: "Ask for help." The wrapper program then performs the action (like a Google search) and feeds the result back to the LLM.
Agent = LLM + Tools + Memory Loop
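The formula above, as a minimal runnable sketch. The `NEED_TOOL:` text convention and the tool registry are invented here purely for illustration; real systems use structured function calling, covered below.

```python
# Minimal agent wrapper: if the (fake) LLM asks for help, run the tool
# and feed the observation back into the prompt.
def fake_llm(prompt: str) -> str:
    if "weather" in prompt and "RESULT" not in prompt:
        return "NEED_TOOL: get_weather"
    return "It is sunny today."

TOOLS = {"get_weather": lambda: "sunny, 22°C"}

def agent(prompt: str) -> str:
    answer = fake_llm(prompt)
    while answer.startswith("NEED_TOOL:"):      # the loop part of the formula
        tool_name = answer.split(":", 1)[1].strip()
        result = TOOLS[tool_name]()             # act in the outside world
        prompt += f"\nRESULT: {result}"         # feed the result back
        answer = fake_llm(prompt)
    return answer

print(agent("What's the weather?"))   # → "It is sunny today."
```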
Retrieval-Augmented Generation (RAG)
Giving an Agent access to the entire internet is messy. Sometimes you want it to access your private company data.
- The Problem: You can't paste 10,000 PDF pages into the prompt (it's too expensive and exceeds the model's context limit).
- The Solution: You use a Vector Database. This converts your text into numbers (vectors). When you ask a question, the database finds the text snippets that are mathematically similar to your query.
- The Acronym: This process—retrieving relevant data and inserting it into the context to guide the answer—is called Retrieval-Augmented Generation (RAG).
Think of it as an "Open Book Exam." The model doesn't need to memorize the textbook; it just needs to know how to look up the relevant page before answering.
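The retrieval step can be sketched with toy bag-of-words vectors and cosine similarity. A real pipeline would use a trained embedding model and a vector database, but the lookup logic is the same: embed the query, rank documents by similarity, stuff the winner into the prompt.

```python
import math
from collections import Counter

docs = [
    "Refunds are processed within 14 days of the return.",
    "Our office is open Monday through Friday.",
    "Shipping to Europe takes five business days.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts stand in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# "Open book exam": look up the relevant page, then answer from it.
snippet = retrieve("how long do refunds take")[0]
prompt = f"Answer using this context:\n{snippet}\n\nQuestion: how long do refunds take?"
print(snippet)
```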
RAG Pipeline — The "Open Book Exam"
Function Calling
When an Agent needs to use a tool (like a calculator or a calendar API), relying on natural language is risky. If the model says, "I think I should check the calendar," a computer program cannot execute that sentence.
The Solution: We force the model to output a strict data format, usually JSON. This capability is Function Calling. It is simply a protocol (agreement) that allows the vague, artistic brain of an LLM to interface with the rigid, logical world of software code.
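The software side of that protocol is a dispatcher: parse the model's JSON, look up the named function in a registry, and call it with the remaining fields as arguments. The `calendar_add` function and registry below are illustrative, not part of any particular API.

```python
import json

# The model's side of the contract: strict JSON, not prose.
model_output = '{"function": "calendar_add", "date": "2026-02-11", "time": "14:00"}'

# The software side: a registry mapping function names to real code.
def calendar_add(date: str, time: str) -> str:
    return f"event created on {date} at {time}"

REGISTRY = {"calendar_add": calendar_add}

def dispatch(raw: str) -> str:
    call = json.loads(raw)                 # fails loudly on malformed output
    fn = REGISTRY[call.pop("function")]    # remaining keys become kwargs
    return fn(**call)

print(dispatch(model_output))   # → "event created on 2026-02-11 at 14:00"
```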
Function Calling — Bridging Language & Code
"Schedule a meeting tomorrow at 2 PM"
```json
{
  "function": "calendar_add",
  "date": "2026-02-11",
  "time": "14:00"
}
```
calendar_add() executed successfully
Model Context Protocol (MCP)
As we build more tools, we run into a scaling problem. How does the Agent know which tools are available? How does it connect to a Google Drive tool versus a Slack tool?
The Solution: MCP (Model Context Protocol). Think of MCP as a "USB standard" for AI tools. It is a universal specification that defines how an AI connects to data sources and tools.
- The LLM is the brain.
- The MCP Server provides the tools.
- The MCP Client (the Agent) acts as the bridge.
Instead of hard-coding every integration, the Agent asks the MCP server: "What tools do you have?" and the server replies with a list. It standardizes the connection.
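That discovery handshake can be sketched as follows. Note the hedge: the real MCP specification is a JSON-RPC protocol; the class names and method signatures here are invented to show the shape of the interaction, not the actual wire format.

```python
# Simplified sketch of MCP-style tool discovery (not the real wire format).
class ToolServer:
    """Plays the role of an MCP server: it owns the tools."""
    def __init__(self):
        self._tools = {
            "search_drive": {"description": "Search files in Drive"},
            "post_slack":   {"description": "Post a message to Slack"},
        }
    def list_tools(self) -> dict:
        return self._tools                       # "What tools do you have?"
    def call_tool(self, name: str, **kwargs) -> str:
        return f"ran {name} with {kwargs}"

class AgentClient:
    """Plays the role of the MCP client: discover first, call later."""
    def __init__(self, server: ToolServer):
        self.catalog = server.list_tools()       # discovery, not hard-coding
        self.server = server

client = AgentClient(ToolServer())
print(sorted(client.catalog))   # → ['post_slack', 'search_drive']
```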
MCP — The "USB Standard" for AI Tools
LangChain & Workflow
Let's look at automating complex tasks. Suppose you want to scrape a competitor's website, summarize their pricing, and save it to a spreadsheet.
You have two main ways to build this:
- LangChain: A code-heavy framework. It chains steps together programmatically. It is powerful but rigid—if the website structure changes, the code might break.
- Workflow: The low-code version. You drag and drop blocks on a canvas (e.g., "Input" → "Summarize" → "Save").
Both approaches share the same weakness: they are rigid pipelines, and you must pre-define if/else logic for every scenario.
Skill
To solve the rigidity of Workflows, we have the concept of a Skill. A Skill is essentially a directory containing a prompt file (often skill.md or similar) and some scripts.
- How it works: Instead of hard-coding the steps, you describe the capability in plain English in the text file. You tell the Agent: "Read this file to understand how to perform the task."
- The Benefit: It bridges the gap between total freedom (unreliable) and rigid coding (inflexible). It allows the Agent to dynamically decide when to use the script based on the instructions.
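A Skill is simple enough to sketch directly. The directory layout and file names below are illustrative (there is no single formal standard): a plain-English manual plus a script, where the manual, not hard-coded logic, tells the Agent when and how to use the script.

```python
import pathlib
import tempfile

# Build an illustrative skill directory: a manual (skill.md) plus a script.
skill_dir = pathlib.Path(tempfile.mkdtemp()) / "summarize_pricing"
skill_dir.mkdir()
(skill_dir / "skill.md").write_text(
    "To summarize pricing: run scrape.py, then list each plan and its price."
)
(skill_dir / "scrape.py").write_text("print('scraping...')")

def load_skill(path: pathlib.Path) -> str:
    """The agent reads the manual; *it* decides when to run the scripts."""
    return (path / "skill.md").read_text()

context = "You have this skill available:\n" + load_skill(skill_dir)
print("scrape.py" in context)   # → True
```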
Sub-Agent
For massive tasks, one Agent gets confused. The context becomes too long and "noisy."
The Solution: Break the task down.
- Master Agent: "I need to build a software app."
- Sub-Agent A (Coder): Writes the code.
- Sub-Agent B (Tester): Reviews the code.
A Sub-Agent is just a separate instance of an LLM with a specific prompt and a clean context history, dedicated to a niche task. It prevents "context pollution."
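In code, a Sub-Agent is nothing exotic: a fresh message list with its own system prompt, handed to the same model. `call_llm` is again a hypothetical stand-in for a real completion API.

```python
# A Sub-Agent = a fresh LLM call with its own system prompt and a CLEAN
# message history — no pollution from the master's long context.
def call_llm(messages: list[dict]) -> str:
    return f"handled: {messages[-1]['content']}"   # placeholder model

def run_subagent(role_prompt: str, task: str) -> str:
    messages = [                    # brand-new history, on purpose
        {"role": "system", "content": role_prompt},
        {"role": "user", "content": task},
    ]
    return call_llm(messages)

code   = run_subagent("You are a coder. Write code only.", "build the app")
review = run_subagent("You are a tester. Find bugs only.", code)
print(review)   # each sub-agent saw only its own two messages
```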
Multi-Agent System
A Sub-Agent handles one piece of work. But what happens when the entire system is designed around multiple agents working together from the start? That is a Multi-Agent System (MAS).
Instead of one brain trying to do everything, you architect a team:
- Orchestrator: A central planner that breaks the overall goal into tasks and assigns them to specialized agents.
- Specialized Agents: Each agent has its own LLM instance, system prompt, tool set, and memory—optimized for one role (research, coding, QA, writing, etc.).
- Communication Protocol: Agents pass messages through a shared bus, structured handoffs, or direct function calls. The format of these messages is critical—without a protocol the system devolves into chaos.
Real-world examples include Microsoft's AutoGen, CrewAI, and LangGraph—all frameworks that let you define agent teams, their roles, and how they collaborate.
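Stripped of framework machinery, the architecture looks like this. The agents are faked with lambdas and the plan is fixed; what matters is the structure: an orchestrator, specialized workers, and a shared message bus carrying the handoffs.

```python
# Minimal Multi-Agent System: orchestrator + specialists + message bus.
AGENTS = {
    "research": lambda task: f"notes on {task}",
    "write":    lambda task: f"draft using {task}",
}

def orchestrate(goal: str) -> list[dict]:
    bus: list[dict] = []                             # shared communication bus
    plan = [("research", goal), ("write", None)]     # fixed plan for the sketch
    prev_output = None
    for role, task in plan:
        payload = task if task is not None else prev_output
        prev_output = AGENTS[role](payload)          # hand off to a specialist
        bus.append({"from": role, "content": prev_output})
    return bus

bus = orchestrate("competitor pricing")
print(bus[-1]["content"])   # → "draft using notes on competitor pricing"
```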
Multi-Agent System — Specialized Agents Collaborating
Agentic AI
Now here is where confusion peaks. People use "Agentic AI" and "Multi-Agent System" interchangeably. They are not the same thing.
Agentic AI is a behavioral paradigm, not an architecture. It describes any AI system that exhibits autonomous, goal-directed behavior—the ability to independently perceive its environment, make plans, take actions, observe results, and adjust course without human hand-holding at every step.
Multi-Agent System = "How many brains, and how are they organized?" (architecture)
Agentic AI = "Does the AI act autonomously toward a goal?" (behavior)
A single agent can be "Agentic" if it runs an autonomous loop. A Multi-Agent System is often Agentic, but not necessarily—you could have a rigid multi-agent pipeline with zero autonomy.
The defining feature of Agentic AI is the autonomous loop:
- Perceive: Observe the current state of the world (read files, check API responses, parse errors).
- Plan: Decide what to do next based on the goal and current state.
- Act: Execute the plan (write code, call tools, send messages).
- Reflect: Evaluate the result. Did it work? What went wrong? Adjust the plan.
- Repeat until the goal is satisfied or a termination condition is hit.
This loop is what makes AI "agentic." Without it, you just have a chatbot that responds to one prompt at a time. With it, you have a system that can tackle open-ended tasks like "debug this codebase" or "write a research paper"—iterating until the job is done.
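The loop from the bullets above can be sketched directly. The "world" here is a trivial test counter so the example runs; in a real agent, each phase would involve model calls and tool use.

```python
# Perceive → Plan → Act → Reflect, repeated until the goal is met.
world = {"tests_passing": 0, "tests_total": 3}

def perceive(w):  return w["tests_passing"]
def plan(state):  return "fix next failing test" if state < world["tests_total"] else "stop"
def act(w):       w["tests_passing"] += 1          # pretend we fixed one test
def reflect(w):   return w["tests_passing"] >= w["tests_total"]

steps = 0
while True:
    state = perceive(world)          # 1. observe the current state
    if plan(state) == "stop":        # 2. decide what to do next
        break
    act(world)                       # 3. execute the plan
    steps += 1
    if reflect(world):               # 4. evaluate; terminate when satisfied
        break
print(steps, world["tests_passing"])   # → 3 3
```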
Why this matters practically: When someone says "we built an Agentic AI," they mean the system can reason, plan, and act in a loop. When someone says "we built a Multi-Agent System," they mean the system has multiple specialized AI workers. The best modern systems—like Devin, OpenAI's Operator, or Claude's computer use—are both: agentic behavior implemented through a multi-agent architecture.
Agentic AI vs. Multi-Agent System — Key Differences
| Dimension | Agentic AI | Multi-Agent System |
|---|---|---|
| Core Unit | Single autonomous agent | Multiple collaborating agents |
| Decision | Self-directed planning loop | Distributed / delegated |
| Architecture | Perceive → Plan → Act → Reflect | Orchestrator → Agents → Bus |
| When to Use | Open-ended, evolving goals | Complex, parallelizable tasks |
| Risk | Unpredictable actions | Coordination overhead |
The Agentic Loop — Perceive · Plan · Act · Reflect
A Unified Methodology: The Spectrum of Control
We can place all of these terms on a spectrum of Stability vs. Flexibility:
- Hard Code (LangChain / Code): Maximum Stability, Minimum Flexibility. You define every step. Good for repetitive, identical tasks.
- Workflow (Low-Code): High Stability, Low Flexibility. Easier to visualize, but still a rigid pipeline.
- Skill (Hybrid): Balanced. You provide the tools and a manual (prompt), but let the AI decide exactly how to execute the steps.
- Pure Agent (Autonomous): Minimum Stability, Maximum Flexibility. You give a goal and the Agent writes its own scripts and plans its own path.
The Spectrum of Control: Stability vs. Flexibility
The Future
Currently, we use Workflows and Skills because LLMs are expensive and prone to error. We need to constrain them. However, as "Token" costs drop to near zero and models become smarter, we will shift toward Pure Agents.
Just as software development moved from Assembly to Python/Spring Boot to maximize developer convenience (ignoring the massive increase in computing power required), AI interaction will move toward maximum user convenience.
We are heading toward a "Super Agent" future—where concepts like MCP, RAG, and Skills are hidden "implementation details." You won't configure a "Skill"; you will just speak, and the system will intuitively understand which tool to wield.
If this breakdown helped clarify the fog of AI terminology, please consider sharing this post.