Programming Multi-Agent Conversations with AutoGen

06 Oct 2025

Concept Introduction

AutoGen is a framework for simplifying the orchestration, optimization, and automation of complex LLM workflows. Its central abstraction is the conversable agent: an independent unit that communicates via message passing and can be configured with different LLMs, prompts, and capabilities. Two key specializations are:

AssistantAgent: A standard AI agent that uses an LLM to reason, chat, and generate code. It acts as the “expert.”
UserProxyAgent: A proxy for a human user. It can solicit human input, but more importantly, it can execute code on behalf of the user. This ability to run code (e.g., in a local Python environment) and report back the results is AutoGen’s superpower.

The magic happens when you put them together. A UserProxyAgent can ask an AssistantAgent to write a script. The AssistantAgent sends back the code. The UserProxyAgent executes it, captures the output (or error), and sends that result back to the AssistantAgent for the next step. This loop continues until the task is complete.

Historical & Theoretical Context

AutoGen was developed by Microsoft Research, emerging from the need to move beyond single-agent, single-turn LLM interactions. It was released in late 2023.

Its intellectual roots lie in several fields:

Multi-Agent Systems (MAS): The classical AI field focused on how independent agents coordinate to solve problems beyond their individual capabilities.
Blackboard Systems: An early AI architecture where multiple “knowledge sources” (experts) would read and write to a shared workspace (the blackboard) to solve a problem collaboratively. The conversation history in AutoGen acts as a modern, sequential blackboard.
Human-in-the-Loop (HITL) Computing: AutoGen formalizes the human’s role as just another agent, making it seamless to integrate human feedback, correction, and guidance into the agentic workflow.

The Core Interaction Loop

There isn’t complex math, but the algorithmic flow is key. The conversation is the algorithm. A typical two-agent interaction looks like this:

sequenceDiagram
    participant User as User/Initiator
    participant Proxy as UserProxyAgent
    participant Assistant as AssistantAgent

    User->>Proxy: Initiate task (e.g., "Plot stock prices for NVDA")
    Proxy->>Assistant: "Plot stock prices for NVDA"
    Assistant-->>Proxy: Propose plan & Python code
    Note over Proxy: Receives code, does not execute yet.
    Proxy->>Proxy: Execute code in local environment
    alt Code Succeeds
        Proxy-->>Assistant: Send back result (e.g., "Plot saved to nvda_chart.png")
    else Code Fails
        Proxy-->>Assistant: Send back error message (e.g., "Error: yfinance not installed")
    end
    Assistant-->>Proxy: Analyze result/error, propose next step (e.g., "pip install yfinance")
    Note over Proxy,Assistant: Loop continues until task is complete...
    Proxy->>User: Final confirmation (e.g., "Task complete. Chart is saved.")

This turn-based, stateful conversation allows agents to correct mistakes, install dependencies, and iteratively refine their approach.

Design Patterns & Architectures

AutoGen embodies the Conversational Programming pattern. The control flow is not a rigid script but an emergent property of the conversation. Each message is an event that triggers a response from the next agent. Agents maintain conversation history as state, and the UserProxyAgent acts as a proxy for the user, abstracting away the details of code execution and environment management.

Practical Application

Here’s a small Python example of the interaction described above.

# pip install pyautogen
import autogen

# Configure the LLM provider
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4"]},
)

# Create the AssistantAgent (the coder)
assistant = autogen.AssistantAgent(
    name="Coder",
    llm_config={"config_list": config_list},
)

# Create the UserProxyAgent (the code executor)
user_proxy = autogen.UserProxyAgent(
    name="Executor",
    human_input_mode="NEVER",  # Never ask for human input
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "coding",  # Directory to save files
        "use_docker": False,  # Set to True for sandboxed execution
    },
)

# Start the conversation
user_proxy.initiate_chat(
    assistant,
    message="""
    What is the current date? Compare the year-to-date performance
    of NVIDIA (NVDA) and Tesla (TSLA). Plot the results and save to a file.
    """,
)

In this setup, the Executor will run any Python code the Coder generates in the ./coding directory, sending back the results until the chart is created and the task is complete.

Latest Developments & Research

The AutoGen ecosystem is evolving rapidly. Key recent developments include:

GroupChatManager: A more complex agent to orchestrate conversations between multiple agents (e.g., a Coder, a Critic, and a Product Manager all in one chat). This moves from a two-agent dialogue to a true multi-agent collaboration.
Enhanced Tool Use: Better integration with function calling, allowing agents to reliably use external tools beyond just writing code.
AutoGen Studio: A UI for building and configuring AutoGen agent workflows, lowering the barrier to entry.

Research papers are now using AutoGen as a baseline for studying multi-agent behavior, automated software development, and even scientific discovery.

Cross-Disciplinary Insight

AutoGen’s conversational model has a parallel in Speech Act Theory from the philosophy of language. The theory posits that utterances are not just statements of fact but are themselves actions. When an agent sends a message like {"code": "print('hello')"}, it’s not just sharing information: it’s performing the act of requesting execution. The UserProxyAgent’s response is the act of reporting a result. This action-oriented view of language is what makes conversational programming tractable.

Daily Challenge / Thought Exercise

In under 30 minutes, do the following:

Sketch a multi-agent system for writing and improving documentation.
Define the agents: Who are they? (e.g., DocWriter, CodeScanner, ExampleGenerator, Reviewer).
Define their roles: What is the specific instruction/prompt for each agent?
Map the conversation flow: Who talks to whom and in what order? Use a Mermaid diagram if it helps. How does the process start and when does it end?

This exercise will force you to think about task decomposition and agent design, which are the core skills for building with AutoGen.

References & Further Reading

Paper: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Official Documentation: Microsoft AutoGen Docs
GitHub Repository: microsoft/autogen
Blog Post Tutorial: AutoGen 101: A First Look

● Intelligence at Every Action

AI Native
Project Management

Stop using tools that bolt on AI as an afterthought. Jovis is built AI-first — smart routing, proactive monitoring, and intelligent workflows from the ground up.

Get early access → See how it works

Engineering Notes