Tag: Safety

09 Apr 2026
Least Privilege and Capability Containment: Designing Agents That Cannot Exceed Their Mandate
How to apply the principle of least privilege to AI agents through tool allowlisting, permission tiers, and sandboxed execution environments that enforce safety at the architecture level.
30 Mar 2026
Indirect Prompt Injection: How Untrusted Content Hijacks AI Agents
How indirect prompt injection attacks compromise AI agents, why agentic systems are uniquely vulnerable, and the defense patterns that actually work.
05 Mar 2026
Safe Reinforcement Learning Teaches Agents to Optimize Without Violating Constraints
How constrained MDPs, Lagrangian methods, and safety critics enable agents to maximize reward while staying within hard operational boundaries.

Engineering Notes