Engineering Notes

Thoughts and Ideas on AI by Muthukrishnan
17 Mar 2026

Anytime Heuristic Search: How Agents Find Good Plans Fast and Improve Them Over Time

Anytime heuristic search, including Weighted A* and ARA*, lets agents commit to a suboptimal plan immediately and refine it as time allows, with provable bounds on solution quality at every step.
16 Mar 2026

Classical Planning with STRIPS and the Delete Relaxation Heuristic

How STRIPS formalizes agent planning problems and how the delete relaxation trick produces powerful heuristics that guide search toward goals efficiently
15 Mar 2026

Working Memory Compression and Context Distillation in Long Horizon Agents

How long-running agents compress, distill, and selectively retain working memory to operate effectively within finite context windows
13 Mar 2026

Opponent Modeling and Theory of Mind in Multi-Agent Systems

How agents reason about other agents' beliefs, goals, and strategies, from k-level thinking to neural Theory of Mind and LLM-based recursive reasoning
12 Mar 2026

Curriculum Learning and Automatic Curriculum Generation in Reinforcement Learning

How agents learn faster and more robustly by training on the right task at the right time, from hand-crafted curricula to adversarial environment generation
11 Mar 2026

Mean Field Games and Mean Field Reinforcement Learning for Massive Agent Populations

How mean field theory lets you solve game-theoretic problems with millions of agents by replacing individual interactions with a statistical summary of the crowd
10 Mar 2026

Counterfactual Regret Minimization for Solving Imperfect Information Games

How CFR and its variants taught AI to master poker and navigate the fog of war in multi-agent decision making
09 Mar 2026

Maximum Entropy Reinforcement Learning and the Soft Actor-Critic Algorithm

Understand how entropy maximization leads to more robust agents that explore efficiently, and how it underpins the widely used Soft Actor-Critic algorithm
07 Mar 2026

Cooperative Game Theory and Shapley Values for Fair Credit Assignment in Multi-Agent Systems

Learn how cooperative game theory and Shapley values provide a mathematically principled way to assign credit among collaborating agents, with practical Python implementations and connections to modern LLM agent teams.
06 Mar 2026

Reward Machines and Automata-Based Task Specification for AI Agents

How to specify complex, multi-step tasks for AI agents using finite-state automata called reward machines, enabling non-Markovian rewards and compositional task structure