Hidden Assumptions in MetaGPT
15 assumptions this code never checks · 3 critical · spanning Environment, Resource, Temporal, Domain, Scale, Contract, Ordering
Every codebase relies on things it never checks, and most of them are routine. CodeSea analyzed foundationagents/metagpt and picked out the few most likely to cause trouble. The full list is just below.
These 3 are the ones most likely to cause trouble here. The rest are minor and are listed under "Show everything".
Bot fails to execute setup commands if server lacks required permissions or plugins, causing silent initialization failure with equipment/inventory not matching expected state
Team execution silently fails or produces partial results when API quota exhausted, leaving some roles unable to complete their actions while others succeed
Race condition where roles may not have processed the initial requirement message when the n_round execution loop starts, causing first round to execute with empty todo queues
Show everything (12 more)
Assumes the equipment array has exactly 6 elements corresponding to the [head, chest, legs, feet, mainhand, offhand] slots, but skips index 4 (mainhand) without validating the array
If this fails: Undefined items or incorrect equipment assignment if the client sends an equipment array with a different length or ordering than expected (JavaScript returns undefined for an out-of-range index rather than raising an error)
metagpt/environment/minecraft/mineflayer/index.js:equipment array
Assumes genetic algorithm population size and generation limits are sufficient for dataset complexity, with hardcoded operators list per experiment type
If this fails: Optimization may converge to suboptimal solutions for complex datasets or fail to explore solution space adequately if population/generation limits too low
examples/aflow/optimize.py:Optimizer
Assumes the LLM response contains valid Python code wrapped in a Markdown python code fence that can be extracted and executed without syntax validation
If this fails: Generated agent code may contain syntax errors, security vulnerabilities, or malformed class definitions that cause runtime failures when instantiated
examples/agent_creator.py:CreateAgent.run()
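One way to narrow this gap is to parse the extracted code before anything runs it. The sketch below is a minimal illustration, not MetaGPT's implementation: extract_valid_python and its regular expression are hypothetical names introduced here, and the check covers syntax only. It does nothing about security or about whether the class definition is meaningful.

```python
import ast
import re

# The fence marker is built at runtime so this example contains no literal fence.
FENCE = "`" * 3

# Hypothetical pattern: the first python-tagged fenced block in an LLM reply.
_CODE_BLOCK = re.compile(FENCE + r"python\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_valid_python(reply: str) -> str:
    """Return the first fenced Python block, or raise ValueError if it is
    missing or does not parse. Nothing is executed here."""
    match = _CODE_BLOCK.search(reply)
    if match is None:
        raise ValueError("no python code fence found in reply")
    code = match.group(1)
    try:
        ast.parse(code)  # syntax check only
    except SyntaxError as exc:
        raise ValueError(f"generated code does not parse: {exc}") from exc
    return code
```

A caller could catch the ValueError and re-prompt the model instead of instantiating a broken agent.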
Assumes Android device has sufficient storage space in /sdcard/Pictures/Screenshots and /sdcard directories for continuous screenshot and XML file generation
If this fails: Assistant fails when device storage full, causing screenshot capture to fail and breaking the observation-action loop without graceful degradation
examples/android_assistant/run_assistant.py:AndroidEnv
Assumes pathfinder and tool plugins load successfully within the setTimeout delay, but uses arbitrary 0ms timeout without checking load completion
If this fails: CollectBlock functionality may fail if dependent plugins haven't finished loading when bot tries to use pathfinder or tool capabilities
metagpt/environment/minecraft/mineflayer/mineflayer-collectblock/src/index.ts:setTimeout
Assumes the qa list contains dictionaries with 'question' and 'answer' keys, converting each value to a string and stripping whitespace without validating the dictionary structure
If this fails: AttributeError when qa items are not dictionaries or are missing expected keys, causing the template save to fail and losing the user's optimization configuration
metagpt/ext/spo/app.py:save_yaml_template()
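A structural check before saving could turn this into a clear, early error. The following is a sketch under assumptions: normalize_qa is a hypothetical helper, not a function in metagpt/ext/spo/app.py, and raising ValueError is just one possible policy.

```python
from typing import Any


def normalize_qa(qa: Any) -> list[dict[str, str]]:
    """Hypothetical validator: return cleaned question/answer pairs,
    or raise ValueError naming the first malformed entry."""
    if not isinstance(qa, list):
        raise ValueError("qa must be a list")
    cleaned = []
    for i, item in enumerate(qa):
        if not isinstance(item, dict):
            raise ValueError(f"qa[{i}] is not a dict")
        missing = {"question", "answer"} - item.keys()
        if missing:
            raise ValueError(f"qa[{i}] is missing keys: {sorted(missing)}")
        # Same normalization the assumption describes: stringify, then strip.
        cleaned.append({k: str(item[k]).strip() for k in ("question", "answer")})
    return cleaned
```

Validating up front means the user sees which entry is wrong before any file is written.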
Assumes all files in company workspace are text-based project deliverables suitable for display, filtering only .git files but not binary files, images, or system files
If this fails: UI may attempt to display binary files or large media files as text, causing display corruption or memory issues in the Chainlit interface
examples/ui_with_chainlit/app.py:files iteration
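A cheap heuristic can separate text deliverables from binaries before the UI tries to render them. This is a sketch, not the Chainlit example's code: looks_like_text, the 4096-byte sample, and the 1 MB size cap are all assumptions chosen for illustration.

```python
import codecs
from pathlib import Path


def looks_like_text(path: Path, sample_size: int = 4096, max_bytes: int = 1_000_000) -> bool:
    """Heuristic: a regular file, not too large, with no NUL bytes,
    whose first bytes decode as UTF-8."""
    if not path.is_file() or path.stat().st_size > max_bytes:
        return False
    with path.open("rb") as fh:
        sample = fh.read(sample_size)
    if b"\x00" in sample:
        return False
    try:
        # An incremental decoder tolerates a multi-byte character that was
        # cut off at the end of the sample.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Files that fail the check could be listed by name only rather than displayed inline.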
Assumes mineflayer-tool plugin is compatible with the mineflayer version and available in node_modules, but doesn't handle plugin loading failures
If this fails: Bot lacks tool functionality if plugin missing or incompatible, silently degrading capabilities without error notification
metagpt/environment/minecraft/mineflayer/index.js:bot.loadPlugin
Assumes 30-pixel minimum distance between UI elements is appropriate for all Android screen densities and resolutions without considering DPI variations
If this fails: Element labeling may overlap on high-DPI devices or leave excessive gaps on low-resolution devices, affecting touch target accuracy
examples/android_assistant/run_assistant.py:min_dist parameter
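Android defines density-independent pixels against a 160 dpi baseline (px = dp * dpi / 160), so one option is to scale the threshold by the device density. The sketch below assumes the 30-pixel value was tuned for a 160 dpi screen, which the source does not state; scaled_min_dist and reference_dpi are hypothetical names.

```python
def scaled_min_dist(dpi: float, base_px: int = 30, reference_dpi: float = 160.0) -> int:
    """Scale a pixel distance tuned at reference_dpi to another screen density.

    Assumption: base_px was chosen for a reference_dpi screen.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return max(1, round(base_px * dpi / reference_dpi))
```

On a 480 dpi device this yields 90 pixels instead of 30, keeping the physical spacing between labels roughly constant.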
Assumes each /give and /item command takes exactly one server tick to process, incrementing itemTicks linearly without considering server lag or command queuing
If this fails: Bot may attempt next command before previous inventory modification completes, causing commands to fail or produce inconsistent inventory state
metagpt/environment/minecraft/mineflayer/index.js:itemTicks counter
Assumes build_customized_agent.py file exists at METAGPT_ROOT/examples/ and is readable with valid UTF-8 encoding
If this fails: FileNotFoundError or encoding error when loading example code template, causing agent creation to fail without fallback example
examples/agent_creator.py:EXAMPLE_CODE_FILE.read_text()
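If a fallback is wanted here, it could look like the sketch below. load_example and FALLBACK_EXAMPLE are hypothetical names, and whether a silent fallback is preferable to failing loudly is a design choice the source does not settle.

```python
from pathlib import Path

# Hypothetical placeholder used when the example file cannot be loaded.
FALLBACK_EXAMPLE = "# example unavailable\n"


def load_example(path: Path, fallback: str = FALLBACK_EXAMPLE) -> str:
    """Read the example file as UTF-8, returning a fallback string if the
    file is missing, unreadable, or not valid UTF-8."""
    try:
        return path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        # FileNotFoundError and PermissionError are subclasses of OSError.
        return fallback
```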
Assumes result_data contains consistent structure with 'round', 'succeed', 'prompt', 'tokens', and 'answers' keys for each optimization result
If this fails: KeyError when displaying results if optimization process changes result structure, breaking the Streamlit UI display
metagpt/ext/spo/app.py:display_optimization_results()
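The display step could check for the five keys named above before rendering. This is an illustrative sketch: split_results is a hypothetical helper, and setting malformed entries aside (rather than raising) is only one possible policy.

```python
# Keys the display code is described as relying on.
REQUIRED_KEYS = ("round", "succeed", "prompt", "tokens", "answers")


def split_results(result_data):
    """Separate results that have every required key from those that do not,
    so the UI can render the former and report the latter."""
    valid, malformed = [], []
    for item in result_data:
        if isinstance(item, dict) and all(key in item for key in REQUIRED_KEYS):
            valid.append(item)
        else:
            malformed.append(item)
    return valid, malformed
```

The UI could then show a short warning with the count of malformed entries instead of crashing with a KeyError.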
Frequently Asked Questions
What does MetaGPT assume that could break in production?
The one most likely to cause trouble: it assumes a Minecraft server is running on localhost with admin privileges allowing /clear, /kill, /give, and /item commands without authentication or permission checks. If this fails, the bot cannot execute its setup commands when the server lacks the required permissions or plugins, causing a silent initialization failure in which equipment and inventory do not match the expected state.
How many hidden assumptions does MetaGPT have?
CodeSea found 15 assumptions MetaGPT relies on but never validates, 3 of them critical, spanning Environment, Resource, Temporal, Domain, Scale, Contract, and Ordering. Most are routine; the analysis flags the three most likely to actually bite.
What is a hidden assumption?
Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.