How layered LLM collaboration in the Mixture-of-Agents architecture can produce outputs that outperform any of its constituent models on their own, and how to build it.
Anytime heuristic search, including Weighted A* and ARA*, lets agents commit to a suboptimal plan immediately and refine it as time allows, with provable bounds on solution quality at every step.
How STRIPS formalizes agent planning problems, and how the delete relaxation yields informative heuristics that guide search efficiently toward goals.
How agents learn faster and more robustly by training on the right task at the right time, from hand-crafted curricula to adversarial environment generation.