Engineering Notes

Thoughts and Ideas on AI by Muthukrishnan
13 Mar 2026

Opponent Modeling and Theory of Mind in Multi-Agent Systems

How agents reason about other agents' beliefs, goals, and strategies — from k-level thinking to neural Theory of Mind and LLM-based recursive reasoning
12 Mar 2026

Curriculum Learning and Automatic Curriculum Generation in Reinforcement Learning

How agents learn faster and more robustly by training on the right task at the right time — from hand-crafted curricula to adversarial environment generation
11 Mar 2026

Mean Field Games and Mean Field Reinforcement Learning for Massive Agent Populations

How mean field theory lets you solve game-theoretic problems with millions of agents by replacing individual interactions with a statistical summary of the crowd
10 Mar 2026

Counterfactual Regret Minimization for Solving Imperfect Information Games

How CFR and its variants taught AI to master poker and navigate the fog of war in multi-agent decision making
09 Mar 2026

Maximum Entropy Reinforcement Learning and the Soft Actor-Critic Algorithm

Understand how entropy maximization leads to more robust, exploration-efficient agents and underpins the widely used Soft Actor-Critic algorithm
07 Mar 2026

Cooperative Game Theory and Shapley Values for Fair Credit Assignment in Multi-Agent Systems

Learn how cooperative game theory and Shapley values provide a mathematically principled way to assign credit among collaborating agents, with practical Python implementations and connections to modern LLM agent teams.
06 Mar 2026

Reward Machines and Automata-Based Task Specification for AI Agents

How to specify complex, multi-step tasks for AI agents using finite-state automata called reward machines, enabling non-Markovian rewards and compositional task structure
05 Mar 2026

Safe Reinforcement Learning Teaches Agents to Optimize Without Violating Constraints

How constrained MDPs, Lagrangian methods, and safety critics enable agents to maximize reward while staying within hard operational boundaries
04 Mar 2026

Model-Based Reinforcement Learning: How Agents Simulate Experience to Learn Faster

Explore how agents that build internal environment models can plan, simulate, and learn orders of magnitude faster than model-free approaches
03 Mar 2026

Successor Representations: A Map of Where You Are Going

Understand successor representations — the elegant middle ground between model-free and model-based RL that enables fast adaptation and transfer across tasks.