foundationagents/metagpt

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

67,258 stars · Python · 10 components

Orchestrates multi-agent teams of LLM-powered roles to collaborate on software development projects

A user requirement enters through Team.run_project(), gets processed by specialized roles (ProductManager creates PRD, Architect designs system, Engineer writes code) where each role executes actions that call LLMs through providers, store outputs in memory/documents, and send messages to trigger downstream roles. The cycle continues until all deliverables are complete.

Under the hood, the system uses 4 feedback loops, 4 data pools, and 6 control points to manage its runtime behavior.

A 10-component ML inference system. 913 files analyzed. Data flows through 6 distinct pipeline stages.

How Data Flows Through the System

  1. Requirement Intake — Team.run_project() receives user requirement string and publishes UserRequirement message to all hired roles via Environment.publish_message() (config: llm.model, llm.api_key)
  2. Role Activation — Each Role checks if incoming Message.cause_by matches their _watch set, adds matching messages to memory via Memory.add(), and queues appropriate Actions in _rc.todo [Message → RoleContext]
  3. Action Execution — Role._act() pops Action from todo queue, calls Action.run() which formats prompts using PROMPT_TEMPLATE and calls LLMProvider.aask() or LLMProvider.acompletion() [RoleContext → ActionOutput] (config: llm.model, llm.base_url, llm.api_type)
  4. Output Processing — Action.run() processes LLM response, structures it using Pydantic models if defined, stores in ProjectRepo or DocumentStore, and returns ActionOutput [ActionOutput → Document]
  5. Message Broadcasting — Role._react() converts ActionOutput to Message, publishes via Environment.publish_message() to roles watching for this message type based on Message.cause_by [ActionOutput → Message]
  6. Team Coordination — Team.run() orchestrates n_round iterations, checking if all roles have completed their actions via Role.get_memories(), and manages team-wide termination conditions [Message]
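
The six stages above can be sketched as one publish/watch/act loop. This is a minimal stand-in, not MetaGPT's real API: the names (`Environment.publish_message`, the `watch` sets, the `n_round` loop) follow the descriptions above, but messages are simplified to plain dicts and the LLM call is replaced by string formatting.

```python
import asyncio
from collections import deque

class Environment:
    """Simplified message bus: deliver each message to every role whose
    watch set contains the message's cause_by (stage 2 above)."""
    def __init__(self):
        self.roles = []
        self.trace = []  # every published message, in order

    def publish_message(self, msg):
        self.trace.append(msg)
        for role in self.roles:
            if msg["cause_by"] in role.watch:
                role.todo.append(msg)

class Role:
    """Toy role: consumes queued messages, publishes one output per input."""
    def __init__(self, name, watch, produces, env):
        self.name, self.watch, self.produces, self.env = name, watch, produces, env
        self.todo = deque()
        env.roles.append(self)

    async def run(self):
        while self.todo:
            msg = self.todo.popleft()
            # Stand-in for Action.run() + LLM call (stage 3-4 above).
            self.env.publish_message(
                {"cause_by": self.produces,
                 "content": f"{self.name} -> {msg['content']}"}
            )

async def run_project(requirement, n_round=3):
    env = Environment()
    Role("ProductManager", {"UserRequirement"}, "WritePRD", env)
    Role("Architect", {"WritePRD"}, "WriteDesign", env)
    Role("Engineer", {"WriteDesign"}, "WriteCode", env)
    # Stage 1: publish the requirement, then iterate n_round times (stage 6).
    env.publish_message({"cause_by": "UserRequirement", "content": requirement})
    for _ in range(n_round):
        for role in env.roles:
            await role.run()
    return [m["cause_by"] for m in env.trace]

stages = asyncio.run(run_project("build a CLI todo app"))
```

Running this yields the cause_by chain `UserRequirement → WritePRD → WriteDesign → WriteCode`, mirroring the PRD-design-code handoff described above.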

Data Models

The data structures that flow between stages — the contracts that hold the system together.

Message metagpt/schema.py
Pydantic model with content: str, role: str (user/assistant/system), cause_by: Action class, send_to: Role or str, restricted_to: str, tag: str, and metadata fields
Created when agents communicate, routed through team message bus, stored in agent memory, consumed by receiving agents
ActionOutput metagpt/actions/__init__.py
Generic container with content: Any (text, code, or structured data), instruct_content: BaseModel for structured outputs
Generated by Action.run(), consumed by Role._act(), can be converted to Message for inter-agent communication
Context metagpt/context.py
Container with config: Config object, git_repo: GitRepository, src_workspace: Path, project_repo: ProjectRepo
Created during team setup, shared across all agents in a team, provides access to workspace and configuration
Document metagpt/document.py
Pydantic model with name: str, n_docs: int, n_chars: int, symbols: list for tracking document metadata and content statistics
Created by document-generating actions, stored in document_store, retrieved for context in subsequent actions
Node metagpt/ext/aflow/scripts/operator_an.py
Workflow graph node with id: str, operation: str, inputs: list[str], outputs: list[str] representing atomic operations in AFlow
Constructed during workflow generation, optimized through genetic algorithms, executed in dependency order
RoleContext metagpt/roles/role.py
Dict containing role_id: str, watch: set[Action types], news: list[Message], memory: Memory, todo: deque[Action]
Maintained throughout role lifespan, updated on each message, drives role's action decisions and execution
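
As an illustration of the central contract, here is a simplified stand-in for the `Message` model described above, using dataclasses rather than Pydantic; the field names follow the description of `metagpt/schema.py`, but the types and defaults are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    """Simplified sketch of metagpt/schema.py's Message (the real model
    is Pydantic). Field names follow the Data Models table above."""
    content: str
    role: str = "user"            # user / assistant / system
    cause_by: str = ""            # name of the Action class that produced it
    send_to: str = ""             # target role; empty means broadcast
    restricted_to: str = ""
    tag: str = ""
    metadata: dict = field(default_factory=dict)

msg = Message(content="Write a PRD for a todo app", cause_by="UserRequirement")
```

A receiving role would compare `msg.cause_by` against its watch set to decide whether to enqueue it, as stage 2 of the pipeline describes.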

Hidden Assumptions

Things this code relies on but never validates. These are the things that cause silent failures when the system changes.

critical Environment unguarded

Assumes a Minecraft server is running on localhost with admin privileges allowing /clear, /kill, /give, and /item commands without authentication or permission checks

If this fails: Bot fails to execute setup commands if server lacks required permissions or plugins, causing silent initialization failure with equipment/inventory not matching expected state

metagpt/environment/minecraft/mineflayer/index.js:bot.chat()
critical Resource weakly guarded

Assumes LLM API has sufficient rate limits and quota to handle team.run(n_round=5) with multiple roles making concurrent API calls without hitting budget or rate limits

If this fails: Team execution silently fails or produces partial results when API quota exhausted, leaving some roles unable to complete their actions while others succeed

metagpt/team.py:company.invest()
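
One way to guard against this assumption is to cap concurrency and retry with exponential backoff around provider calls. This is an illustrative sketch, not MetaGPT's implementation: `call_llm` is a hypothetical awaitable standing in for a provider's `aask()`, and `RuntimeError` stands in for the provider's rate-limit exception.

```python
import asyncio
import random

async def aask_with_backoff(call_llm, prompt, sem, retries=4):
    """Retry wrapper with bounded concurrency and exponential backoff.
    `call_llm` and the RuntimeError rate-limit signal are stand-ins."""
    async with sem:
        for attempt in range(retries):
            try:
                return await call_llm(prompt)
            except RuntimeError:
                # back off 2^attempt with jitter (scaled down for the demo)
                await asyncio.sleep(0.01 * 2 ** attempt + random.random() * 0.01)
        raise RuntimeError("rate limit: retries exhausted")

async def main():
    sem = asyncio.Semaphore(3)  # at most 3 in-flight calls across all roles
    calls = {"n": 0}

    async def flaky_llm(prompt):
        calls["n"] += 1
        if calls["n"] < 3:       # fail twice, then succeed
            raise RuntimeError("429 Too Many Requests")
        return f"ok: {prompt}"

    return await aask_with_backoff(flaky_llm, "write a PRD", sem)

result = asyncio.run(main())
```

The semaphore keeps a whole team's concurrent calls under the provider's limit, while the backoff turns transient 429s into delayed successes instead of silent partial results.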
critical Temporal unguarded

Assumes team.run_project() completes synchronously before await company.run(n_round=5) begins, but run_project only publishes initial message without waiting for processing completion

If this fails: Race condition where roles may not have processed the initial requirement message when the n_round execution loop starts, causing first round to execute with empty todo queues

examples/ui_with_chainlit/app.py:company.run_project()
warning Domain unguarded

Assumes equipment array has exactly 6 elements corresponding to [head, chest, legs, feet, mainhand, offhand] armor slots, but skips index 4 (mainhand) without validation

If this fails: Array index out of bounds or incorrect equipment assignment if client sends equipment array with different length or ordering than expected

metagpt/environment/minecraft/mineflayer/index.js:equipment array
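
A defensive version of the slot mapping would fail loudly on unexpected input instead of silently mis-assigning slots. Sketched here in Python for illustration (the real code is the mineflayer `index.js`); the slot order follows the description above.

```python
EQUIP_SLOTS = ("head", "chest", "legs", "feet", "mainhand", "offhand")

def parse_equipment(equipment):
    """Validate the six-slot equipment layout before assigning slots.
    Raises instead of indexing out of bounds on a malformed array."""
    if len(equipment) != len(EQUIP_SLOTS):
        raise ValueError(
            f"expected {len(EQUIP_SLOTS)} equipment slots, got {len(equipment)}"
        )
    return dict(zip(EQUIP_SLOTS, equipment))

gear = parse_equipment(["helmet", "plate", "legs", "boots", "sword", "shield"])
```
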
warning Scale unguarded

Assumes genetic algorithm population size and generation limits are sufficient for dataset complexity, with hardcoded operators list per experiment type

If this fails: Optimization may converge to suboptimal solutions for complex datasets or fail to explore solution space adequately if population/generation limits too low

examples/aflow/optimize.py:Optimizer
warning Contract unguarded

Assumes LLM response contains valid Python code wrapped in ```python ``` markdown blocks that can be extracted and executed without syntax validation

If this fails: Generated agent code may contain syntax errors, security vulnerabilities, or malformed class definitions that cause runtime failures when instantiated

examples/agent_creator.py:CreateAgent.run()
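
The missing syntax check is cheap to add: extract the fenced block, then parse it with `ast.parse` (which raises `SyntaxError` without executing anything) before handing the code to anything that instantiates it. A sketch of that guard, not the repo's actual extraction code:

```python
import ast
import re

def extract_python_block(llm_response: str) -> str:
    """Pull the first ```python fenced block from an LLM response and
    reject it unless it parses. ast.parse validates syntax only -- it
    does not execute the code and does not catch security issues."""
    match = re.search(r"```python\n(.*?)```", llm_response, re.DOTALL)
    if match is None:
        raise ValueError("no ```python block found in LLM response")
    code = match.group(1)
    ast.parse(code)  # raises SyntaxError on malformed code
    return code
```
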
warning Resource unguarded

Assumes Android device has sufficient storage space in /sdcard/Pictures/Screenshots and /sdcard directories for continuous screenshot and XML file generation

If this fails: Assistant fails when device storage full, causing screenshot capture to fail and breaking the observation-action loop without graceful degradation

examples/android_assistant/run_assistant.py:AndroidEnv
warning Environment weakly guarded

Assumes pathfinder and tool plugins load successfully within the setTimeout delay, but uses arbitrary 0ms timeout without checking load completion

If this fails: CollectBlock functionality may fail if dependent plugins haven't finished loading when bot tries to use pathfinder or tool capabilities

metagpt/environment/minecraft/mineflayer/mineflayer-collectblock/src/index.ts:setTimeout
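
The standard fix for this class of problem is to poll for readiness with a deadline rather than trust a fixed delay. Sketched here in asyncio Python for illustration (the flagged code is TypeScript); `is_loaded` is a hypothetical callable reporting plugin readiness.

```python
import asyncio

async def wait_for_plugin(is_loaded, timeout=5.0, interval=0.05):
    """Poll until is_loaded() returns True instead of trusting a fixed
    setTimeout delay. Raises TimeoutError if the plugin never loads."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not is_loaded():
        if loop.time() >= deadline:
            raise TimeoutError("plugin did not finish loading in time")
        await asyncio.sleep(interval)
```
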
warning Ordering weakly guarded

Assumes qa list contains dictionaries with 'question' and 'answer' keys, but strips whitespace from string conversion without validating dictionary structure

If this fails: AttributeError when qa items are not dictionaries or missing expected keys, causing template save to fail and losing user's optimization configuration

metagpt/ext/spo/app.py:save_yaml_template()
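
The missing structure check can precede the whitespace stripping. An illustrative guard (not the repo's `save_yaml_template()` itself) that rejects malformed items before any data is lost:

```python
def normalize_qa(qa):
    """Validate that each item is a dict with 'question' and 'answer'
    keys before stripping whitespace, so a malformed item raises a
    clear error instead of an AttributeError mid-save."""
    cleaned = []
    for i, item in enumerate(qa):
        if not isinstance(item, dict) or not {"question", "answer"} <= item.keys():
            raise ValueError(f"qa[{i}] must be a dict with 'question' and 'answer'")
        cleaned.append({
            "question": str(item["question"]).strip(),
            "answer": str(item["answer"]).strip(),
        })
    return cleaned
```
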
info Domain weakly guarded

Assumes all files in company workspace are text-based project deliverables suitable for display, filtering only .git files but not binary files, images, or system files

If this fails: UI may attempt to display binary files or large media files as text, causing display corruption or memory issues in the Chainlit interface

examples/ui_with_chainlit/app.py:files iteration

System Behavior

How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.

Data Pools

Agent Memory (in-memory)
Each agent maintains message history, experiences, and learned knowledge with similarity search capabilities
Document Store (database)
Vector database storing generated documents, code, and knowledge with embedding-based retrieval
Project Workspace (file-store)
File system workspace where generated code, documentation, and project artifacts are stored and versioned
Message Bus (queue)
Environment maintains message queue for inter-agent communication with role-based routing
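
The agent-memory pool can be pictured as an append-only message history indexed by originating action. A toy sketch: `add` and `get_by_action` are named after the behavior described above, but the implementation is illustrative (the real store also supports similarity search, which is omitted here).

```python
from collections import defaultdict

class Memory:
    """Toy agent-memory pool: append-only history plus lookup by the
    action that produced each message. Illustrative, not MetaGPT's code."""
    def __init__(self):
        self.storage = []                  # full message history, in order
        self._by_action = defaultdict(list)

    def add(self, message):
        self.storage.append(message)
        self._by_action[message["cause_by"]].append(message)

    def get_by_action(self, action):
        return list(self._by_action[action])
```
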

Technology Stack

Pydantic (serialization)
Provides data validation and structured outputs for agent messages and action results
asyncio (runtime)
Enables concurrent execution of multiple agents and asynchronous LLM API calls
ChromaDB/Qdrant/Milvus (database)
Vector databases for storing and retrieving documents with semantic similarity search
OpenAI/Anthropic/etc APIs (compute)
LLM providers that power agent reasoning, text generation, and structured output parsing
GitPython (infra)
Git operations for managing code repositories and version control in generated projects
Chainlit/Streamlit (framework)
Web UI frameworks for interactive agent interfaces and real-time conversation displays
Playwright/Selenium (infra)
Web automation tools enabling agents to interact with web browsers and scrape content
Mineflayer (library)
JavaScript Minecraft bot framework for creating agents that can interact with Minecraft servers

Frequently Asked Questions

What is MetaGPT used for?

MetaGPT orchestrates multi-agent teams of LLM-powered roles to collaborate on software development projects. foundationagents/metagpt is a 10-component ML inference system written in Python; data flows through 6 distinct pipeline stages, and the codebase contains 913 files.

How is MetaGPT architected?

MetaGPT is organized into 4 architecture layers: Framework Core, Specialized Roles, Environment & Tools, Configuration & Infrastructure. Data flows through 6 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.

How does data flow through MetaGPT?

Data moves through 6 stages: Requirement Intake → Role Activation → Action Execution → Output Processing → Message Broadcasting → Team Coordination. A user requirement enters through Team.run_project(), gets processed by specialized roles (ProductManager creates PRD, Architect designs system, Engineer writes code) where each role executes actions that call LLMs through providers, store outputs in memory/documents, and send messages to trigger downstream roles. The cycle continues until all deliverables are complete. This pipeline design reflects a complex multi-stage processing system.

What technologies does MetaGPT use?

The core stack includes Pydantic (data validation and structured outputs for agent messages and action results), asyncio (concurrent execution of multiple agents and asynchronous LLM API calls), ChromaDB/Qdrant/Milvus (vector databases for storing and retrieving documents with semantic similarity search), OpenAI/Anthropic/etc APIs (LLM providers that power agent reasoning, text generation, and structured output parsing), GitPython (Git operations for managing code repositories and version control in generated projects), Chainlit/Streamlit (web UI frameworks for interactive agent interfaces and real-time conversation displays), Playwright/Selenium (web automation tools enabling agents to interact with browsers and scrape content), and Mineflayer (a JavaScript Minecraft bot framework). A focused set of dependencies that keeps the build manageable.

What system dynamics does MetaGPT have?

MetaGPT exhibits 4 data pools (Agent Memory, Document Store, Project Workspace, Message Bus), 4 feedback loops, 6 control points, and 4 delays. The feedback loops handle convergence and recursion. These runtime behaviors shape how the system responds to load, failures, and configuration changes.

What design patterns does MetaGPT use?

5 design patterns detected: Agent-Action Composition, Message-Driven Architecture, Provider Pattern for LLMs, SOP (Standard Operating Procedure), Context Injection.

Analyzed on April 20, 2026 by CodeSea.