Engineering Notes

Thoughts and Ideas on AI by Muthukrishnan
02 Mar 2026

Goal-Conditioned Reinforcement Learning and Hindsight Experience Replay Turn Failures Into Training Opportunities

Learn how goal-conditioned RL and Hindsight Experience Replay allow agents to master hard tasks with sparse rewards by treating every failure as a lesson toward a different goal.
02 Mar 2026

Spec-Driven Development with spec-kit: Stop Vibe Coding, Start Specifying

A complete tutorial on using GitHub's spec-kit to bring structure to AI-assisted development — from install to your first specification.
01 Mar 2026

RLHF and Preference Learning: Teaching Agents What Humans Actually Want

Master reinforcement learning from human feedback — the algorithm behind ChatGPT and modern aligned agents — from reward modeling and PPO to Direct Preference Optimization.
28 Feb 2026

Conceptual AI Agents Universe — System Design Document

A plugin-based platform architecture where each AI agent system is an independently subscribable capability. One interface, one orchestrator, unlimited agents — added without touching existing code.
28 Feb 2026

Distributional Reinforcement Learning and Learning the Full Return Distribution

Move beyond expected returns: learn why modeling the full distribution of returns unlocks risk-aware agents, better exploration, and state-of-the-art performance.
27 Feb 2026

Grounded Language Agents: Connecting Words to Actions in the Physical World

How AI agents learn to connect language to perception and physical action, from the symbol grounding problem to modern vision-language-action models.
27 Feb 2026

Offline Reinforcement Learning: Training Agents from Fixed Datasets

Learn how agents can master complex tasks from pre-collected experience logs without ever touching a live environment, using conservative Q-learning, implicit Q-learning, and the Decision Transformer.
26 Feb 2026

Active Inference and the Free Energy Principle: How Agents Minimize Surprise Instead of Maximizing Reward

Explore Karl Friston's Free Energy Principle: a unified theory where agents minimize surprise through belief updating and action, offering an alternative foundation to reward-based reinforcement learning.
25 Feb 2026

Code Writing Agents and Program Synthesis: Teaching AI to Build Its Own Tools

How AI agents generate, execute, and refine code as a reasoning medium, from classical program synthesis to modern REPL-based agent loops and SWE-bench architectures.
24 Feb 2026

Continual Learning and Catastrophic Forgetting: How Agents Remember Without Forgetting

How AI agents can learn continuously across tasks and environments without overwriting what they already know: the science and practice of lifelong machine learning.