From guidance to wiring diagrams
The joint guidance is principle-led and technology-agnostic, which is exactly why executives and architects need translation. The patterns below are concrete starting points that satisfy the guidance while staying recognisable to existing engineering teams.
Agentic system anatomy
Before any pattern, agree on the components. An agentic system has, at minimum: an LLM reasoning core; a planning workflow that may spawn sub-agents; external tools the system can call; external data sources that flow into context; and short- and long-term memory. Every box is a potential attack surface.
Source: Careful adoption of agentic AI services, co-authored by ASD's ACSC, CISA, NSA, Canadian Cyber Centre, NCSC-NZ, NCSC-UK (Figure 1, p. 5).
Zero-trust agent runtime
The runtime is split into three planes: a control plane that decides who may do what, and when; a data-execution plane where the agent actually runs inside a sandboxed enclave; and a resource plane of approved tools and data. Every inter-plane call is authenticated, authorised, and logged.
Identity & policy
- Trusted agent registry — DIDs / mTLS certs, role bindings.
- Centralised policy decision point — per-request allow/deny.
- Just-in-time secrets — ephemeral, scoped, expiring.
- Code attestation — agent must prove unmodified code.
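As a rough illustration of the registry, policy decision point, and just-in-time secrets listed above, the sketch below uses in-memory dictionaries. The agent IDs, roles, resource names, and the `decide` / `issue_grant` helpers are all hypothetical; a real deployment would back them with a certificate-based registry and a dedicated policy engine.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    agent_id: str
    action: str
    resource: str


@dataclass(frozen=True)
class Grant:
    token: str
    action: str
    resource: str
    expires_at: float


# Hypothetical trusted agent registry: agent id -> role.
REGISTRY = {"reader-01": "reader", "actuator-01": "actuator"}

# Hypothetical role bindings: role -> permitted (action, resource) pairs.
ROLE_BINDINGS = {
    "reader": {("read", "crm")},
    "actuator": {("read", "crm"), ("write", "crm")},
}


def decide(req: Request) -> bool:
    """Per-request allow/deny: unknown agents and unbound actions are denied."""
    role = REGISTRY.get(req.agent_id)
    if role is None:
        return False
    return (req.action, req.resource) in ROLE_BINDINGS.get(role, set())


def issue_grant(req: Request, now: float, ttl_seconds: float = 60.0) -> Optional[Grant]:
    """Issue an ephemeral credential scoped to one action, only if policy allows."""
    if not decide(req):
        return None
    return Grant(secrets.token_urlsafe(16), req.action, req.resource, now + ttl_seconds)
```

The default-deny shape is the point here: anything not explicitly registered and bound is refused, and every credential carries its own scope and expiry.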
Agent in a sandbox
- Input manager: prompt-injection filtering, trust-tier on context.
- Single-principal agent with narrow tool allow-list.
- Output validator: schema, grounding, redundant-agent cross-check, DLP on egress.
- HITL gate for high-impact actions.
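The output validator and HITL gate in the list above might be sketched as follows. This assumes the agent emits its proposed tool call as a JSON object with `tool` and `args` fields; that wire format, the tool names, and the three return values are illustrative assumptions, and the grounding, cross-check, and DLP steps are left out.

```python
import json

# Hypothetical narrow tool allow-list for a single-principal agent.
ALLOWED_TOOLS = {"search_docs", "send_email"}
# Hypothetical subset of tools treated as high-impact.
HIGH_IMPACT_TOOLS = {"send_email"}


def validate_output(raw: str) -> str:
    """Return 'execute', 'needs_human', or 'reject' for a proposed tool call."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return "reject"  # not valid JSON at all
    if not isinstance(msg, dict):
        return "reject"  # valid JSON but wrong top-level shape
    tool = msg.get("tool")
    args = msg.get("args")
    if not isinstance(tool, str) or not isinstance(args, dict):
        return "reject"  # schema check: required fields and types
    if tool not in ALLOWED_TOOLS:
        return "reject"  # tool is outside the allow-list
    if tool in HIGH_IMPACT_TOOLS:
        return "needs_human"  # route to the HITL gate
    return "execute"
```

The validator runs outside the model, so a successful prompt injection can change what the agent proposes but not what the validator lets through.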
Approved tools & data
- Allow-list of pinned API versions.
- RAG corpus with provenance metadata.
- Read-only by default; writes via signed commands.
- Sub-agent invocation gated by the policy decision point.
- Use OAuth2 token exchange + short-lived workload identity (e.g. SPIFFE/SPIRE) for the agent principal; never share long-lived credentials across agents.
- Make every tool call go through an outbound proxy that enforces the allow-list and rate limits — the agent never reaches the public internet directly.
- Run the agent process in a hardened sandbox (gVisor, Firecracker microVM, or equivalent) with seccomp + read-only root.
- Log every prompt, tool call, and model output to an append-only ledger that sits outside the agent's write scope.
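Two of the controls above, the egress allow-list and the append-only ledger, can be sketched in a few lines. The host name, the `egress_allowed` helper, and the `Ledger` class are hypothetical. The hash chain only makes tampering detectable; genuine append-only behaviour still depends on storage the agent cannot write to, and a real proxy would enforce the allow-list at the network layer rather than in application code.

```python
import hashlib
import json
from urllib.parse import urlsplit

# Hypothetical allow-list of hosts the outbound proxy will forward to.
ALLOWED_HOSTS = {"api.internal.example"}


def egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests to explicitly listed hosts."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


class Ledger:
    """Hash-chained event log: each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self._entries.append({"prev": prev, "body": body, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev + entry["body"]).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

In use, the proxy would call `egress_allowed` before forwarding a request and append one ledger entry per prompt, tool call, and model output.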
Multi-agent isolation with mediated handoffs
Implicit trust between agents is the most common cause of cascading agentic failures. Replace it with mediation: separation of duties (Orchestrator / Reader / Actuator), a policy validator, and a secondary reviewer agent that has to co-sign actions. High-risk agents live in their own enclave with a HITL gate.
When to apply
- You have more than one agent collaborating on a task.
- Any agent can write to systems of record (CRM, ERP, email, code repos).
- Compromise of one agent could pivot through trusted handoffs.
Anti-patterns to avoid
- Agents sharing credentials or tokens.
- Tool descriptions containing persuasive language that an agent weighs when choosing between tools.
- Sub-agent spawning without expiry timers or recorded grant chains.
- Trusting another agent's output without schema validation or provenance.
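A mediated handoff that avoids the anti-patterns above could look roughly like this. The sketch assumes each agent holds its own HMAC key, so no credentials are shared, and that every handoff payload carries an `expires_at` timestamp and a `grant_chain` list. The key values, field names, and the `sign` / `mediate` helpers are illustrative only; a production system would more likely use asymmetric signatures tied to workload identity.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent demo keys; each agent has its own, none are shared.
KEYS = {"orchestrator": b"demo-key-orchestrator", "reviewer": b"demo-key-reviewer"}

# Agents whose signatures are required before an action is released.
REQUIRED_SIGNERS = ("orchestrator", "reviewer")


def sign(agent: str, payload: dict) -> str:
    """Sign a canonical JSON encoding of the payload with the agent's own key."""
    msg = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(KEYS[agent], msg, hashlib.sha256).hexdigest()


def mediate(payload: dict, signatures: dict, now: float) -> bool:
    """Release a handoff only if it is unexpired, traceable, and co-signed."""
    chain = payload.get("grant_chain")
    if not isinstance(chain, list) or not chain:
        return False  # no recorded grant chain
    if payload.get("expires_at", 0) <= now:
        return False  # missing or elapsed expiry timer
    for agent in REQUIRED_SIGNERS:
        supplied = signatures.get(agent, "")
        if not hmac.compare_digest(supplied, sign(agent, payload)):
            return False  # missing or invalid co-signature
    return True
```

Because the reviewer signs the same payload as the orchestrator, a compromised orchestrator cannot alter the action after review without invalidating the co-signature.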
HITL workflow sized to reversibility
Not every action needs a human signature. Classify actions by impact, likelihood, and reversibility; set the gate accordingly. The agent does not get to decide which class an action falls into.
Low risk · reversible
Read-only retrieval, draft generation, internal summarisation. Execute and log.
Medium
Internal write actions, automated outreach, workflow updates. Require multi-agent consensus before execution; log the consensus chain.
High · irreversible
External payments, deletions, sensitive customer comms, network egress, system resets. Human approval — always — plus cryptographic signing on the resulting command.
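The three tiers above can be expressed as a small gate that sits outside the agent. In this sketch the action names, the quorum of two, and the choice to treat unclassified actions as high risk are all assumptions. The classification table is static configuration, so the agent cannot reclassify its own actions.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical classification table, maintained by humans, not by the agent.
ACTION_CLASS = {
    "retrieve_document": Risk.LOW,
    "draft_summary": Risk.LOW,
    "update_workflow": Risk.MEDIUM,
    "send_outreach": Risk.MEDIUM,
    "external_payment": Risk.HIGH,
    "delete_record": Risk.HIGH,
}


def gate(action: str, consensus_votes: int = 0,
         human_approved: bool = False, quorum: int = 2) -> bool:
    """Return True if the action may proceed under its risk class."""
    # Assumption: anything not explicitly classified is treated as high risk.
    risk = ACTION_CLASS.get(action, Risk.HIGH)
    if risk is Risk.LOW:
        return True  # execute and log
    if risk is Risk.MEDIUM:
        return consensus_votes >= quorum  # multi-agent consensus
    return human_approved  # high risk: a human must approve
```

Logging, and signing the resulting command for the high tier, would wrap this function rather than live inside it.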
Defence in depth, agent-shaped
Layer controls so that no single mechanism is load-bearing. An attacker should have to defeat each layer below before reaching the next. Treat this list as a checklist for any agent that touches production data.