- The New Execution Layer: What AI Agents Actually Do in 2026
- Three Attack Vectors That Didn't Exist Two Years Ago
- OWASP Top 10 for LLM Applications: What the Framework Actually Says
- Why Your Traditional SOC Contract Has a Blind Spot
- The Visibility Problem: You Can't Secure What You Can't See
- The Question Every Security Leader Should Be Asking in 2026
1. The New Execution Layer: What AI Agents Actually Do in 2026
There's a category error in how most organizations think about AI risk. They imagine a chatbot — something that answers questions and produces text. The attack surface of a chatbot is relatively narrow: you can ask it inappropriate things, and it might produce inappropriate output. That's a content moderation problem, not a security architecture problem.
The AI systems being deployed in 2026 are not chatbots. They are autonomous execution environments — software agents that use language models as their reasoning engine but whose real impact comes from what they do, not what they say. The distinction matters enormously for security.
Here is a non-exhaustive list of what production AI agents are doing in organizations right now:
Claude Code (Anthropic's CLI tool launched in 2025) is a real-world example: it reads your entire codebase, writes new files, runs shell commands, executes tests, makes git commits, and manages pull requests. It operates with the same filesystem permissions as the user running it. An attacker who can influence Claude Code's behavior can do anything that user can do.
The Model Context Protocol (MCP), released by Anthropic in November 2024 and rapidly adopted across the industry, goes further. MCP is an open standard that lets AI models connect to arbitrary external data sources and tools — databases, APIs, file systems, calendars, internal services. As of 2026, over 5,000 MCP-compatible integrations exist in the public ecosystem. Each integration is a new channel through which an AI model interacts with real systems.
The shift in one sentence: Previously, an attacker who wanted to modify your database needed to find a SQL injection vulnerability or steal credentials. In 2026, they might just need to inject a malicious instruction into content that your AI agent will process.
2. Three Attack Vectors That Didn't Exist Two Years Ago
The security research community identified AI-specific attack vectors well before enterprise AI agent adoption reached its current scale. These are not theoretical — they have been demonstrated against real AI systems in academic research and in the wild. Here are the three most significant vectors.
Prompt Injection
An attacker embeds malicious instructions in data that an AI agent will process. The model, unable to reliably distinguish between the instructions given by its legitimate operator and injected instructions, follows the attacker's commands instead.
Direct prompt injection occurs when a user directly inputs adversarial instructions. Indirect prompt injection — arguably more dangerous — occurs when the malicious instructions are embedded in external content the agent retrieves: a webpage it visits, an email it reads, a document it summarizes, a database record it looks up.
Agent Hijacking via Excessive Agency
Once an attacker has influenced an agent's behavior — through prompt injection or by compromising a tool the agent trusts — they can redirect the agent's entire execution path. The hijacked agent continues operating from inside the organization's trust boundary, using real credentials and real access, but working toward the attacker's goals.
Excessive Agency (OWASP LLM06) describes AI agents granted more capability than their task requires: write access when read-only would suffice, API permissions beyond what any single task needs, the ability to spawn and direct other AI agents without human review. The more an agent can do, the more valuable it is as a hijacking target.
MCP Server Compromise
The Model Context Protocol creates a mediation layer between AI models and real-world systems. An MCP server tells the AI model what tools are available, handles tool call requests, and returns results. If an MCP server is compromised — or if a malicious MCP server is introduced — the AI model trusts every response it receives.
From a network perspective, a compromised MCP server looks identical to a legitimate one: same endpoints, same authentication headers, same response format. The difference is in the content of the responses — which may include injected instructions, falsified data, or tool results that redirect agent behavior. This attack requires no malware, no CVE exploitation, and triggers no traditional network-layer alerts.
3. OWASP Top 10 for LLM Applications: What the Framework Actually Says
OWASP published the Top 10 for Large Language Model Applications in 2023 (v1.0) and updated it in 2025 (v2.0). It is the most widely cited framework for AI application security risk. The table below shows the most enterprise-relevant risks and how they map to traditional security tooling.
| OWASP ID | Risk Category | Traditional SOC Coverage |
|---|---|---|
| LLM01:2025 | Prompt Injection | None |
| LLM02:2025 | Sensitive Information Disclosure | Partial (DLP) |
| LLM03:2025 | Supply Chain | Partial (SCA) |
| LLM04:2025 | Data and Model Poisoning | None |
| LLM06:2025 | Excessive Agency | None |
| LLM07:2025 | System Prompt Leakage | None |
| LLM08:2025 | Vector and Embedding Weaknesses | None |
| LLM09:2025 | Misinformation | None |
| LLM10:2025 | Unbounded Consumption | Partial (rate limits) |
The coverage column reflects a structural reality: traditional security tools were designed to detect attack signatures in network packets, system calls, and process behavior. Prompt injection arrives as a string of natural language text in an API request body. It passes through WAF rules. It appears in SIEM logs as a normal application API call with HTTP 200 responses. Endpoint detection doesn't flag it. The attack exists in a layer that traditional tooling was never built to inspect.
LLM01:2025 — Prompt Injection in depth
OWASP describes two forms of prompt injection. The first — direct injection — is when an attacker sends adversarial instructions directly to the model. This is the form most people picture, and it's relatively constrained in enterprise deployments where user access to the model is controlled.
The second — indirect injection — is dramatically more dangerous for enterprise AI agents. The attacker doesn't need access to the AI system at all. They need access to any external content source the agent processes: a webpage, a document in a shared drive, a CRM note, a support ticket, an email. The injection rides in data, not in a direct conversation, and the agent processes it exactly as it would process legitimate content because from the model's perspective, text in context is text.
LLM06:2025 — Excessive Agency
Every security architect knows the principle of least privilege: give a service account only the permissions it needs for its specific task. This principle is almost entirely absent from current AI agent deployments. Agents are given broad file system access "because it's easier." They're given production database credentials "because read-only was too slow to set up." Each permission granted is an additional capability available to any attacker who successfully hijacks the agent.
The least-privilege failure for AI agents: When a developer sets up a traditional script's service account, they carefully scope permissions. When they set up an AI agent, they often give it their own credentials or a broad service account — because scoping AI agent permissions requires understanding exactly what the agent will do, which is hard to predict. This gap is the Excessive Agency vulnerability at scale.
4. Why Your Traditional SOC Contract Has a Blind Spot
A traditional SOC monitors a defined set of data sources and looks for a defined set of indicators of compromise. The scope was designed around a threat model that assumes attackers come from outside: they breach the perimeter, move laterally, escalate privileges, and exfiltrate data. Detection signatures were built to catch each phase of this chain.
The critical point is not that traditional SOC tooling is bad — it's that it was designed for a different threat model. When an attacker exploits prompt injection against your enterprise AI deployment, the sequence looks like this from the SOC's perspective:
- Normal API request from the AI application to an LLM provider — HTTP 200, expected latency
- Normal API call from the agent to an internal service — authenticated, expected endpoint
- Normal file write operation — agent has the permission, extension matches expected output
- Normal outbound API call — authenticated, to a known endpoint, within normal call frequency
Every individual action is indistinguishable from legitimate behavior. No CVE is triggered. No malware signature fires. No authentication anomaly occurs. The attacker never touches the network directly. The attack is invisible to traditional SOC monitoring because the entire attack surface exists inside the AI layer — a layer the SOC was not contracted to watch.
5. The Visibility Problem: You Can't Secure What You Can't See
Here is the practical question that gets to the heart of AI agent security: Do you know what your AI agents can reach?
Not in abstract terms — not "they have database access and API access." Specifically: which endpoints, which services, which data stores, what permissions. And for each of those: what vulnerabilities exist, what authentication weaknesses are present, what's exposed to the public internet that perhaps shouldn't be?
Most organizations deploying AI agents can't answer this question precisely. The deployment cycle was fast. The agent was given existing service account credentials. Whoever set it up wasn't a security architect — they were a developer trying to ship a feature. The agent's attack surface is roughly "everything the service account can touch," and nobody mapped that surface before the agent was deployed.
This creates a specific and exploitable condition: the AI agent is traversing an attack surface that is unknown even to the organization that deployed it.
Why Attack Surface Visibility Is the Starting Point
Consider the attack chain for indirect prompt injection via a compromised external service:
- Attacker identifies that your AI agent regularly retrieves data from a third-party API endpoint
- Attacker discovers that the endpoint has a misconfiguration — perhaps an API key is exposed in a JavaScript file indexed by Shodan, or the endpoint accepts unauthenticated requests from certain IP ranges
- Attacker gains the ability to inject content into the data that endpoint returns
- Injected content contains instructions that redirect the agent's behavior
- Agent executes attacker-directed actions using its legitimate credentials
Step 2 is where baseline attack surface scanning matters. If you know that your API endpoint has an exposed key or accepts unauthenticated requests, you can fix it before an attacker can exploit it. In 2026, automated scanning tools run against the entire internet continuously. The question isn't whether an attacker will scan your endpoints — it's whether you scanned them first.
Every misconfiguration in your external attack surface — every exposed API endpoint, every missing authentication header, every subdomain pointing to a deprecated service — is not just a risk to a human attacker. It's a risk amplified by the fact that a trusted AI agent may traverse it, and whatever an attacker can inject into that surface, the agent will process as instructions.
What Baseline Scanning Gives You
External attack surface scanning — the systematic, automated inventory and vulnerability assessment of all externally reachable services associated with your organization — gives you a map of what your AI agents (and your attackers) can see. That includes:
- All public-facing API endpoints and their authentication status
- Exposed credentials in JavaScript files, DNS records, or public repositories
- Subdomains pointing to abandoned or misconfigured services
- SSL/TLS configurations that enable interception
- Third-party services accessible from your environment with weak access controls
- Server version information that maps to known CVEs
This map is the prerequisite for AI agent security. You cannot scope an agent's permissions intelligently without knowing what it can reach. You cannot defend an attack surface you haven't mapped.
6. The Question Every Security Leader Should Be Asking in 2026
The ransomware era taught most organizations a painful lesson about assuming they were too small to be targeted. The AI agent era is teaching a parallel lesson about assuming that "AI security" means content moderation and model safety.
It doesn't. AI agent security is infrastructure security with a new execution layer added on top. The new layer has attack vectors that weren't in last year's threat model, that your current SOC contract doesn't cover, and that your existing tooling can't detect.
- What can our AI agents access, and have we applied least-privilege principles to those accesses?
- Do we have a way to detect anomalous agent behavior — not just anomalous network behavior?
- Have we reviewed our external attack surface with the understanding that AI agents now traverse it?
- If an agent was successfully prompt-injected today, what's the worst it could do with its current permissions?
- Does our incident response plan include AI agent compromise as a scenario?
If an attacker successfully injected instructions into content that your AI agent will process today — a webpage it visits, a document it summarizes, a database record it reads — what is the maximum impact? That impact is your current AI agent risk exposure. If you can't answer the question, you can't manage the risk.
The first step toward answering it is knowing what your agents can reach. That requires a baseline. Without it, you're securing an organization whose AI agents are operating in territory you've never mapped.
- OWASP, "OWASP Top 10 for Large Language Model Applications v2.0" (2025) — LLM01:2025 Prompt Injection, LLM06:2025 Excessive Agency, full risk taxonomy
- IBM Security, "Cost of a Data Breach Report 2024" — $4.88M average total cost of a data breach globally
- Anthropic, "Model Context Protocol" (November 2024) — Open standard for AI-to-tool connectivity; ecosystem growth data
- Anthropic, "Claude Code" (2025) — Agentic CLI tool with full filesystem and shell access; capability documentation
- Greshake et al., "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection" (arXiv, 2023) — Foundational research on indirect prompt injection
- NIST, "Artificial Intelligence Risk Management Framework (AI RMF 1.0)" (2023) — Federal AI governance baseline
- Shodan.io, Censys.io — Internet-wide scanning services that index exposed endpoints, credentials, and service fingerprints
Know What Your AI Agents Can Reach
ADCS continuously maps your external attack surface — the same surface your AI agents traverse. Start with a free baseline assessment to see what's exposed before someone else does.
Questions? Email info@avisail.com · Response within 1 business day