AI Sec Weekly: Friday, May 22, 2026

Friday digest. Three top stories, one regulatory item, one technical item, the reading list, corrections — the usual structure. Everything below is framed as a durable, verifiable class or framework rather than a breaking incident; verify any CVE, version, or date against the primary source before acting.

Top three

1. The dangerous agent surface is SSRF and tool-use, not the chat box. The throughline I keep coming back to: as agents gained tools, the risk moved from “what the model says” to “what the model can do.” Give an agent a “fetch this URL” or “browse” tool whose target is influenced by model output — which is influenced by untrusted input — and you’ve built a classic server-side-request-forgery surface. The agent can be steered to hit internal services or cloud metadata endpoints. This is an architecture class, not a single CVE. The control set is unglamorous and it works: allowlist outbound destinations for any model-triggered fetch, block internal and link-local ranges, and put a deterministic authorization check between model output and any consequential action. Red-team it by delivering the payload through a tool’s data — a fetched page, a returned record — not through the chat input.
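
A minimal sketch of the deterministic check described in story 1, assuming a fetch tool that receives a model-chosen URL. The names ALLOWED_HOSTS, is_public_ip, and check_fetch_target are hypothetical, and the example hosts are placeholders. It shows the shape of the control rather than a complete SSRF defense.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; a real deployment would load this from config.
ALLOWED_HOSTS = {"docs.example.com", "api.example.com"}


def is_public_ip(addr: str) -> bool:
    """Return True only if addr is not internal, loopback, link-local, or otherwise special."""
    ip = ipaddress.ip_address(addr.split("%")[0])  # drop any IPv6 zone id
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:  # unwrap ::ffff:a.b.c.d before checking
        ip = mapped
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def check_fetch_target(url: str, resolve=socket.getaddrinfo) -> str:
    """Deterministic gate between model output and the fetch tool.

    Returns the validated hostname, or raises PermissionError.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise PermissionError(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"host not on allowlist: {host!r}")
    # Check every address the name resolves to, not just the first one.
    for info in resolve(host, parts.port or 443):
        addr = info[4][0]
        if not is_public_ip(addr):
            raise PermissionError(f"{host} resolves to blocked address {addr}")
    return host
```

Two caveats on this sketch: the check and the later connection resolve DNS separately, so a production version would connect to the address it validated, and every redirect hop needs the same check. The resolve parameter exists only so the gate can be exercised without network access.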

2. The model supply chain is your widest under-watched surface, and safetensors closes the worst of it. You ship weights and data, not just code, which makes the AI supply chain broader than ordinary software’s. The highest-leverage, lowest-effort fix this week concerns how you load models: legacy object-serialization formats (Python pickle is the usual one) can reconstruct arbitrary objects on load, so loading an untrusted checkpoint in such a format is effectively running untrusted code. The safetensors format stores tensors and metadata only, with no code-execution path on load, removing that class by construction. If your pipeline still loads untrusted legacy-serialized checkpoints in a privileged context (and many do), that’s the thing to fix before anything else. In the PyTorch ecosystem, passing weights_only=True to torch.load and preferring safetensors are concrete, generally-correct defaults. Provenance still matters separately; this just closes the “I loaded a model and it ran code” door.
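
For story 2, a rough sketch of what "prefer safetensors, fall back to weights_only=True" can look like in a loader. The function name load_weights and the allow_legacy flag are hypothetical. The two library calls are safetensors.torch.load_file and torch.load, and both packages are assumed to be installed wherever a real load happens.

```python
from pathlib import Path


def load_weights(path: str, allow_legacy: bool = False):
    """Load a state dict, preferring safetensors over pickle-based checkpoints."""
    suffix = Path(path).suffix.lower()
    if suffix == ".safetensors":
        # Imported lazily so the refusal path below works without these libraries.
        from safetensors.torch import load_file
        return load_file(path)  # tensors and metadata only
    if not allow_legacy:
        raise ValueError(
            f"refusing to load {path!r}: not a .safetensors file "
            "(pass allow_legacy=True only for artifacts you have vetted)"
        )
    import torch
    # weights_only=True restricts unpickling to tensors and a small set of safe types.
    return torch.load(path, map_location="cpu", weights_only=True)
```

The design choice here is refusal by default: a legacy checkpoint loads only when the caller opts in explicitly, which makes the risky path visible in code review. The file extension is only a hint about format, so this complements provenance checks and does not replace them.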

3. Model extraction is a business risk people keep underweighting. Less glamorous than jailbreaks, more directly costly: an adversary systematically queries a deployed model to reconstruct a functional approximation, or to recover the proprietary scaffolding behind it. It’s a documented technique area in MITRE ATLAS and it’s an IP-exposure problem as much as a security one. Durable mitigations: rate-limit and monitor for extraction-pattern query volume, don’t return raw logits/probabilities you don’t need to, and treat the system prompt and proprietary scaffolding as recoverable rather than secret. The recurring theme connecting this to story #1: assume the attacker reaches what the system can reach, and design the blast radius down.
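
Two of the extraction mitigations in story 3 are easy to sketch: a per-client sliding-window query budget, and a response filter that drops logits and probabilities. QueryBudget, public_response, and the field names are hypothetical, and the thresholds are placeholders rather than recommendations.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class QueryBudget:
    """Sliding-window limit on queries per client."""

    def __init__(self, max_queries: int, window_seconds: float):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # client_id -> timestamps of recent queries

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        recent = self._seen[client_id]
        # Forget queries that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_queries:
            return False  # over budget: reject, and ideally raise an alert
        recent.append(now)
        return True


def public_response(raw: dict) -> dict:
    """Return only what callers need; raw logits and probabilities stay server-side."""
    return {"text": raw["text"]}
```

A budget like this catches volume only. Spotting extraction-pattern queries spread across many accounts needs monitoring on top of it, which this sketch does not attempt.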

Regulatory item

NIST AI RMF is the control map worth standardizing on. The NIST AI Risk Management Framework and its Generative AI Profile give security teams a Govern/Map/Measure/Manage structure that leadership and auditors already accept, and the GenAI Profile enumerates generative-specific risks (confabulation, data leakage, information integrity) with suggested actions. The action: map your existing LLM controls (input/output filtering, red-team coverage, logging) onto the Measure and Manage functions. The gap that surfaces is almost always measurement (you can’t show effectiveness), not the absence of controls. Voluntary and US-origin, but widely used as a structuring tool well beyond the US.
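
The mapping exercise can start as a small table in code. The sketch below is illustrative only: the Control type, the assignment of each control to an RMF function, and the example metric are assumptions for demonstration, not anything prescribed by NIST.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Control:
    name: str
    rmf_function: str  # "Govern", "Map", "Measure", or "Manage"
    effectiveness_metric: Optional[str] = None  # how you would show it works


def measurement_gaps(controls: List[Control]) -> List[str]:
    """Names of controls that exist but have no stated effectiveness metric."""
    return [c.name for c in controls if c.effectiveness_metric is None]


# Hypothetical inventory; the function assignments are a judgment call.
controls = [
    Control("input/output filtering", "Manage"),
    Control("red-team coverage", "Measure", "share of tool paths tested per quarter"),
    Control("logging", "Manage"),
]

print(measurement_gaps(controls))
```

With this inventory, the two controls that have no metric are the ones reported as gaps, which is the "controls exist, measurement does not" pattern described above.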

Technical item

Prefer safetensors and weights_only=True as a default, today. Restated from the top three because it’s the single most actionable technical item of the week: for any model-loading code path that can touch a third-party artifact, the safe defaults are concrete and free. Inventory where your deployment loads weights from an open hub, confirm the format and load path, and add provenance verification. This is the kind of fix that’s boring on a normal week and the headline on a bad one.
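
The inventory step can be sketched with the standard library alone. This assumes model artifacts sit under one directory and that file extensions are a usable first hint about format, which is not always true (.bin in particular is ambiguous). PICKLE_BASED, sha256_of, and inventory are hypothetical names.

```python
import hashlib
from pathlib import Path

# Extensions commonly used for pickle-based checkpoints; a hint, not proof of format.
PICKLE_BASED = {".pt", ".pth", ".bin", ".ckpt", ".pkl", ".pickle"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root: str) -> list:
    """List model artifacts under root with a format guess and a content hash."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".safetensors":
            fmt = "safetensors"
        elif suffix in PICKLE_BASED:
            fmt = "legacy (possibly pickle)"
        else:
            continue  # not a recognized weight file
        rows.append({"path": str(path), "format": fmt, "sha256": sha256_of(path)})
    return rows
```

The hashes are there for the provenance step: compare them against digests published by the source you trust, and treat any row marked legacy as a candidate for conversion or for a sandboxed load.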

The reading list

One line each — who, what, why bother.

MITRE ATLAS: adversarial technique knowledge base, including model-extraction and supply-chain technique areas. Why bother: the catalog behind two of this week’s top stories.
OWASP LLM Top 10: re-read the excessive-agency and insecure-tool-use entries. Why bother: directly maps to the SSRF/tool-use story.
NIST AI RMF: Govern/Map/Measure/Manage plus the GenAI Profile. Why bother: the governance wrapper for everything above.
NIST AI 600-1, Generative AI Profile: generative-specific risks and actions. Why bother: the most concrete official mapping of GenAI risk to controls.

Corrections

Nothing to correct from the prior digest. Errors get fixed here with a visible note, never quietly amended.

Tips — vendor advisories, regulator actions, CVE pre-disclosures — reply to the digest. Sources protected.

— Theo
