Hidden Assumptions in dspy

12 assumptions this code never checks · 3 critical · spanning Domain, Contract, Environment, Shape, Ordering, Resource, Scale, Temporal

Every codebase relies on things it never checks. Most of them are routine. CodeSea looked at stanfordnlp/dspy and picked out the few most likely to cause trouble. The full list is just below.

Most of what this code assumes is routine. These 3 are the ones most likely to cause trouble here. The rest are minor; they're under "Show everything".

Worth your attention first

If the LM includes these patterns in its response text, the parser will incorrectly split the content at those points, breaking field extraction and potentially losing data

Worth your attention first

If JSONAdapter is misconfigured, unavailable, or fails on the same input, the fallback chain breaks and parsing permanently fails with no recovery mechanism

Worth your attention first

If a custom type returns unexpected formats like nested lists or non-dict objects, the adapter formatting pipeline will crash when trying to serialize the content for LM APIs

Show everything (9 more)
Environment

The LITELLM_LOCAL_MODEL_COST_MAP environment variable can be safely set to 'True' without conflicting with user's existing environment configuration

If this fails: If user has explicitly set this variable to a different value or path, DSPy silently overrides it, potentially breaking their cost tracking or billing integration

dspy/__init__.py:os.environ.setdefault
Ordering

Field processing order from signature input_fields and output_fields dictionaries is deterministic and consistent across Python versions and executions

If this fails: If dict iteration order changes between runs, field headers appear in different orders in prompts, causing LMs trained on specific formats to generate inconsistent responses

dspy/adapters/chat_adapter.py:FieldInfoWithName
Domain

Field values can be successfully parsed from string representations back to their original types without precision loss or format ambiguity

If this fails: Complex types like datetime objects, custom classes, or floating-point numbers with specific precision requirements may lose information during string round-trip, silently corrupting data

dspy/adapters/utils.py:parse_value
Resource

The disk cache directory is writable and has sufficient space for storing LM responses across all concurrent DSPy programs

If this fails: If cache directory becomes full or unwritable, all LM calls will fail silently or fall back to uncached mode, drastically increasing API costs and latency without user awareness

dspy/clients/cache.py:DSPY_CACHE
Scale

Chat-formatted prompts with field headers and demonstrations will fit within the language model's context window limit

If this fails: Large signatures with many fields or extensive few-shot examples will exceed context limits, causing ContextWindowExceededError but only after expensive prompt formatting has been completed

dspy/adapters/chat_adapter.py:ChatAdapter
Contract

All signature field types have meaningful string representations and can be serialized/deserialized through the adapter pipeline

If this fails: Custom types without proper __str__ methods or non-serializable objects will cause cryptic errors during prompt formatting, making debugging difficult

dspy/signatures/signature.py:Signature
Temporal

Language model API responses arrive in reasonable time and don't timeout during streaming or batch processing

If this fails: Long-running optimizations like GEPA or BootstrapFewShot may fail partway through if individual LM calls timeout, losing all progress and requiring complete restart

dspy/clients/base_lm.py:BaseLM
Domain

The json_repair library can successfully fix malformed JSON from language models without changing semantic meaning

If this fails: If json_repair 'fixes' JSON by altering actual content (e.g., removing quotes from intended string values), parsed results will be semantically incorrect while appearing syntactically valid

dspy/adapters/types/base_type.py:json_repair
Environment

Global settings state is thread-safe and won't cause race conditions when multiple DSPy programs run concurrently

If this fails: Concurrent programs sharing the same process may experience configuration bleeding where one program's LM settings affect another's execution, leading to inconsistent results

dspy/dsp/utils/settings.py:settings

See the full structural analysis of dspy: the pipeline, data models, and system behavior that put these assumptions in context.

Full analysis of stanfordnlp/dspy →

Compare dspy

Frequently Asked Questions

What does dspy assume that could break in production?

The one most likely to cause trouble: Language models will treat field headers like [[ ## field_name ## ]] as meaningful delimiters and not generate them as part of actual content If this fails, If the LM includes these patterns in its response text, the parser will incorrectly split the content at those points, breaking field extraction and potentially losing data

How many hidden assumptions does dspy have?

CodeSea found 12 assumptions dspy relies on but never validates, 3 of them critical, spanning Domain, Contract, Environment, Shape, Ordering, Resource, Scale, Temporal. Most are routine — the analysis flags the two or three most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.