Engineering Notes

Thoughts and Ideas on AI by Muthukrishnan

Teaching Agents to Learn from Mistakes Through Reflection and Self-Critique

11 Oct 2025

Concept Introduction

Reflection in AI agents is a meta-cognitive process where the agent:

  1. Executes an action (e.g., generates code, answers a question, makes a plan).
  2. Evaluates the outcome (Did it work? Why or why not?).
  3. Generates self-critique (uses an LLM or evaluator to analyze the failure and produce actionable feedback).
  4. Stores the reflection in memory (persists the lesson for future reference).
  5. Uses the reflection to improve (on the next attempt, the agent references past failures to avoid repeating mistakes).

The most well-known implementation is Reflexion (Shinn et al., 2023), which extends the ReAct (Reason + Act) pattern with an explicit reflection phase. Instead of blindly retrying after failure, the agent writes a natural language critique of its own performance and uses that critique as context for the next iteration.

graph TD
    A[Task] --> B[Agent attempts task]
    B --> C{Success?}
    C -- Yes --> D[Done]
    C -- No --> E[Reflection: Why did it fail?]
    E --> F[Store lesson in memory]
    F --> G[Retry with reflection context]
    G --> B
  

Historical & Theoretical Context

The idea of learning from mistakes is ancient, but in AI it draws on several traditions, most directly the trial-and-error loop of reinforcement learning.

The modern incarnation for LLM agents emerged in 2023 with the Reflexion paper by Noah Shinn, Federico Cassano, and others at Northeastern University and MIT. They showed that adding a reflection step dramatically improved performance on coding tasks, decision-making benchmarks, and knowledge-intensive QA.

Unlike traditional RL, which updates model weights, Reflexion operates at the semantic level: it stores lessons as text in the agent’s memory, making it interpretable and composable with existing LLM-based architectures.
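
Because the lessons are just strings, wiring them into the next attempt is a matter of prompt assembly. Here is a minimal sketch (function and variable names are illustrative, not from any particular framework):

```python
# Semantic-level "learning": lessons are plain strings prepended to the
# next prompt, rather than gradient updates to model weights.

def build_prompt(task: str, reflections: list[str]) -> str:
    """Assemble the actor prompt, injecting past reflections as text."""
    if reflections:
        lessons = "\n".join(f"- {r}" for r in reflections)
        context = f"Lessons from previous attempts:\n{lessons}\n"
    else:
        context = ""
    return f"{context}Task: {task}\nSolution:"

memory = ["Validate inputs before processing."]
print(build_prompt("Parse the config file", memory))
```

Because the memory is human-readable text, it can be inspected, edited, or shared between agents, which is what makes this approach interpretable and composable.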

Algorithms & Math

The core algorithm is a loop with an added reflection phase. Here’s the pseudocode:

function reflexion_agent(task, max_trials):
  memory = []  # Stores past reflections

  for trial in 1 to max_trials:
    # 1. ACTOR: Generate a solution
    solution = actor.generate(task, memory)

    # 2. EVALUATOR: Test the solution
    success, feedback = evaluator.evaluate(solution, task)

    if success:
      return solution

    # 3. REFLECTOR: Generate self-critique
    reflection = reflector.critique(task, solution, feedback, memory)

    # 4. MEMORY: Store the reflection
    memory.append(reflection)

  return "Failed after max trials"

Key Components:

  1. Actor: generates a candidate solution, conditioned on the task and any past reflections.
  2. Evaluator: tests the solution and returns a success flag plus feedback.
  3. Reflector: converts the failure feedback into a natural-language critique.
  4. Memory: accumulates reflections across trials and feeds them back to the actor.

No Complex Math: Unlike gradient-based learning, Reflexion relies on the LLM’s reasoning capabilities. The “learning” happens through prompt engineering and in-context learning, not parameter updates.

Design Patterns & Architectures

Reflection fits naturally into several agent architectures. In the Reflexion pattern, the original ReAct loop (Thought → Action → Observation) is extended so that after a failure, the agent generates a reflection instead of immediately retrying.

The Planner-Executor-Reflector loop is a three-phase cycle: the planner creates a high-level plan, the executor carries it out, and if the plan fails, the reflector analyzes why and suggests refinements. This is common in agent frameworks like LangGraph, where each phase is a node in a state machine graph.
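
The control flow of that cycle can be mimicked without any framework. The sketch below is a plain-Python stand-in for the three phases; all three functions are hypothetical stubs, not LangGraph APIs:

```python
# Planner-Executor-Reflector as a bare state loop. In LangGraph each
# phase would be a graph node; here they are plain functions (stubs).

def planner(task, critiques):
    # A real planner would be an LLM call conditioned on past critiques.
    hint = f" (avoiding: {critiques[-1]})" if critiques else ""
    return f"plan for {task}{hint}"

def executor(plan):
    # Stand-in: succeed only once the plan incorporates a critique.
    return "avoiding" in plan

def reflector(plan):
    return f"'{plan}' was too naive"

def run(task, max_rounds=3):
    critiques = []
    for _ in range(max_rounds):
        plan = planner(task, critiques)
        if executor(plan):
            return plan
        critiques.append(reflector(plan))  # feed failure back to planner
    return None

print(run("deploy service"))
```

The design point is that each phase has a narrow contract (plan in, result out, critique out), which is why the pattern maps cleanly onto state-machine frameworks.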

For complex tasks, reflection can be hierarchical: low-level reflection critiques individual actions (e.g., “This function call failed because the API key was missing”), while high-level reflection critiques the overall strategy (e.g., “My approach of trying to brute-force the solution won’t scale; I should use dynamic programming instead”).
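
A minimal sketch of the two levels (function names and the failure threshold are illustrative): the low-level reflector critiques a single action, while the high-level reflector fires only when a pattern of failures suggests the strategy itself is wrong:

```python
# Hierarchical reflection: per-action critiques feed a strategy-level
# critique. The threshold of 3 repeated failures is an assumption.

def reflect_action(action, error):
    """Low-level: why did this one step fail?"""
    return f"Action '{action}' failed: {error}"

def reflect_strategy(action_reflections):
    """High-level: repeated low-level failures question the plan itself."""
    if len(action_reflections) >= 3:
        return "Repeated failures suggest the overall approach is wrong; revise the plan."
    return None  # not enough evidence to abandon the strategy

log = [reflect_action(f"call_api(attempt={i})", "timeout") for i in range(3)]
print(reflect_strategy(log))
```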

Practical Application

Let’s build a simple self-reflective agent that solves math word problems.

import re

# Mock LLM functions
def actor_llm(task, reflections):
    context = "\n".join(reflections) if reflections else "No prior attempts."
    prompt = f"Task: {task}\nPast Reflections:\n{context}\nProvide your solution:"
    print(f"\n[ACTOR] Generating solution...\n{prompt}")

    # Simulate different responses based on reflections
    if "correct formula" in context.lower():
        return "The area of a circle is π * r^2. If radius is 5, area = 3.14159 * 25 = 78.54"
    return "The area is 2 * π * r. If radius is 5, area = 31.4"  # Wrong formula

def evaluator(solution, expected_answer):
    print(f"\n[EVALUATOR] Testing solution: {solution}")
    if str(expected_answer) in solution:
        print("✓ Correct!")
        return True, "Solution is correct."
    print("✗ Incorrect!")
    return False, f"Expected answer near {expected_answer}, but solution gave {solution}"

def reflector_llm(task, solution, feedback, reflections):
    prompt = f"Task: {task}\nYour solution: {solution}\nFeedback: {feedback}\nWhat went wrong?"
    print(f"\n[REFLECTOR] Analyzing failure...\n{prompt}")

    # Simulate reflection
    if "31.4" in solution:
        return "I used the circumference formula (2πr) instead of the area formula (πr²). I need to use the correct formula for area."
    return "Unknown error. Try a different approach."

def reflexion_agent(task, expected_answer, max_trials=3):
    memory = []

    for trial in range(1, max_trials + 1):
        print(f"\n{'='*60}\nTRIAL {trial}\n{'='*60}")

        solution = actor_llm(task, memory)
        success, feedback = evaluator(solution, expected_answer)

        if success:
            print(f"\n✓ Task solved in {trial} trial(s)!")
            return solution

        reflection = reflector_llm(task, solution, feedback, memory)
        memory.append(reflection)
        print(f"\n[MEMORY] Stored reflection: {reflection}")

    print("\n✗ Failed to solve after max trials.")
    return None

# Run the agent
task = "What is the area of a circle with radius 5?"
reflexion_agent(task, expected_answer=78.54)

Output:

============================================================
TRIAL 1
============================================================

[ACTOR] Generating solution...
Task: What is the area of a circle with radius 5?
Past Reflections:
No prior attempts.
Provide your solution:

[EVALUATOR] Testing solution: The area is 2 * π * r. If radius is 5, area = 31.4
✗ Incorrect!

[REFLECTOR] Analyzing failure...
Task: What is the area of a circle with radius 5?
Your solution: The area is 2 * π * r. If radius is 5, area = 31.4
Feedback: Expected answer near 78.54, but solution gave The area is 2 * π * r. If radius is 5, area = 31.4
What went wrong?

[MEMORY] Stored reflection: I used the circumference formula (2πr) instead of the area formula (πr²). I need to use the correct formula for area.

============================================================
TRIAL 2
============================================================

[ACTOR] Generating solution...
Task: What is the area of a circle with radius 5?
Past Reflections:
I used the circumference formula (2πr) instead of the area formula (πr²). I need to use the correct formula for area.
Provide your solution:

[EVALUATOR] Testing solution: The area of a circle is π * r^2. If radius is 5, area = 3.14159 * 25 = 78.54
✓ Correct!

✓ Task solved in 2 trial(s)!

In Modern Frameworks: agent frameworks such as LangGraph express the same loop explicitly, with actor, evaluator, and reflector as nodes in a state graph and the reflection memory carried in shared state.

Latest Developments & Research

2023: The Year of Reflexion

2024-2025: Advanced Self-Correction

Open Problems:

Cross-Disciplinary Insight

Reflection in AI agents mirrors Kolb’s Experiential Learning Cycle from education theory:

  1. Concrete Experience: The agent attempts a task.
  2. Reflective Observation: The agent analyzes what happened.
  3. Abstract Conceptualization: The agent formulates a lesson or rule (“I should always validate inputs before processing”).
  4. Active Experimentation: The agent applies the lesson in the next attempt.

This cycle is foundational in adult learning and professional development.

Daily Challenge / Thought Exercise

Exercise: Build a Self-Correcting Code Generator

Task (30 minutes):

  1. Write a Python function reflexive_code_generator(task_description, test_cases, max_trials=3) that:

    • Uses an LLM to generate Python code based on task_description.
    • Runs the code against test_cases.
    • If tests fail, generates a reflection on why the code failed.
    • Retries with the reflection in context.
  2. Test it with:

    • Task: “Write a function is_palindrome(s) that returns True if string s is a palindrome.”
    • Test Cases:
      • is_palindrome("racecar") → True
      • is_palindrome("hello") → False
      • is_palindrome("A man a plan a canal Panama") → True (case-insensitive, ignores spaces)
  3. Bonus: Make the first attempt deliberately naive (e.g., simple string reversal without handling case or spaces) to trigger a reflection loop.

Reflection prompt for the LLM:

The code failed on test case: {test_input}
Expected: {expected_output}
Got: {actual_output}
Error (if any): {error_message}

What went wrong with the code, and how should it be fixed?
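
One possible skeleton for the exercise, using a mock generator in place of a real LLM call (everything below is a starting-point sketch, not a reference solution):

```python
# Skeleton for reflexive_code_generator. mock_generate stands in for an
# LLM; its naive-then-fixed behavior simulates learning from reflection.

def run_tests(code, test_cases):
    """Exec the generated code, then check each (input, expected) pair."""
    ns = {}
    try:
        exec(code, ns)
        for arg, expected in test_cases:
            got = ns["is_palindrome"](arg)
            if got != expected:
                return False, f"is_palindrome({arg!r}) returned {got}, expected {expected}"
    except Exception as e:
        return False, f"Error: {e}"
    return True, "All tests passed."

def mock_generate(task, reflections):
    # Deliberately naive first attempt (bonus step); "fixed" once a
    # reflection is present in context.
    if reflections:
        return ("def is_palindrome(s):\n"
                "    s = ''.join(c.lower() for c in s if c.isalnum())\n"
                "    return s == s[::-1]\n")
    return "def is_palindrome(s):\n    return s == s[::-1]\n"

def reflexive_code_generator(task, test_cases, max_trials=3):
    reflections = []
    for _ in range(max_trials):
        code = mock_generate(task, reflections)
        ok, feedback = run_tests(code, test_cases)
        if ok:
            return code
        # A real implementation would ask the LLM the reflection prompt
        # above; here we store the feedback directly as the lesson.
        reflections.append(f"The code failed: {feedback}. Handle case and spaces.")
    return None

tests = [("racecar", True), ("hello", False),
         ("A man a plan a canal Panama", True)]
code = reflexive_code_generator("is_palindrome", tests)
print("Solved!" if code else "Failed.")
```

Swapping `mock_generate` for a real LLM client and the stored feedback for an actual reflection call completes the exercise.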

References & Further Reading

Foundational Papers:

  • Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K., & Yao, S. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023.
  • Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023.

