Hidden Assumptions in guidance

12 assumptions this code never checks · 4 critical · spanning Contract, Temporal, Environment, Ordering, Resource, Shape, Domain, Scale

Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at guidance-ai/guidance and picked out the 3 most likely to cause trouble here. The rest are minor; they're under "Show everything".

Worth your attention first

When using a custom tokenizer without _ll_tokenizer, the parser initialization fails with AttributeError instead of a clear validation error
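A fail-fast check is one way to turn that AttributeError into a readable message. The sketch below is illustrative only: `require_ll_tokenizer`, the stand-in tokenizer classes, and the error wording are hypothetical and not part of guidance. Only the `_ll_tokenizer` attribute name comes from the finding above.

```python
from typing import Any


def require_ll_tokenizer(tokenizer: Any) -> Any:
    """Return tokenizer._ll_tokenizer or raise a descriptive TypeError.

    Hypothetical helper for illustration; not part of guidance.
    """
    ll_tokenizer = getattr(tokenizer, "_ll_tokenizer", None)
    if ll_tokenizer is None:
        raise TypeError(
            f"{type(tokenizer).__name__} has no usable '_ll_tokenizer' attribute; "
            "the parser needs an llguidance-compatible tokenizer"
        )
    return ll_tokenizer


class BareTokenizer:
    """Stand-in for a custom tokenizer that lacks the attribute."""


class WrappedTokenizer:
    """Stand-in for a tokenizer that carries the attribute."""

    def __init__(self) -> None:
        self._ll_tokenizer = object()
```

Calling `require_ll_tokenizer(BareTokenizer())` raises a TypeError that names the missing attribute. A type check on the returned object could be added in the same place if the expected class is known.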

If model initialization hangs, the widget polls indefinitely, consuming CPU and never displaying content, which makes notebooks unresponsive

Mismatched llguidance versions can cause silent failures or wrong constraint enforcement behavior without clear error messages
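A version gate at import time is one way to make a mismatch loud. The sketch below uses `importlib.metadata` from the standard library. The bounds `MIN_VERSION` and `MAX_VERSION_EXCLUSIVE` are made-up placeholders, since the supported llguidance range is not stated here, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple

# Placeholder bounds for illustration only; the real supported range is an assumption.
MIN_VERSION: Tuple[int, int] = (1, 0)
MAX_VERSION_EXCLUSIVE: Tuple[int, int] = (2, 0)


def major_minor(version: str) -> Tuple[int, int]:
    """Parse 'X.Y...' into (X, Y); raises ValueError on non-numeric parts."""
    parts = (version.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])


def is_supported(version: str) -> bool:
    """True when the version falls inside the assumed supported range."""
    return MIN_VERSION <= major_minor(version) < MAX_VERSION_EXCLUSIVE


def installed_version(package: str = "llguidance") -> Optional[str]:
    """Return the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_installed(package: str = "llguidance") -> None:
    """Raise RuntimeError when the package is missing or out of range."""
    version = installed_version(package)
    if version is None:
        raise RuntimeError(f"{package} is not installed")
    if not is_supported(version):
        raise RuntimeError(f"{package} {version} is outside the supported range")
```

Pre-release strings such as `1.0rc1` would need a fuller parser; this sketch deliberately handles plain numeric versions only.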

Show everything (9 more)

Ordering

Tag pool entries are populated before they are referenced in f-strings, but there's no guarantee that tagged functions are defined before use

If this fails: Using {{G|tag_name|G}} syntax before the tagged function is decorated results in KeyError during string parsing

guidance/_ast.py:_parse_tags
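To illustrate the ordering problem, here is a minimal sketch of a tag lookup that reports unregistered names up front instead of surfacing a bare KeyError. The `tag_pool` dictionary, `TAG_PATTERN`, and `resolve_tags` are hypothetical stand-ins; the real `_parse_tags` may work quite differently. Only the `{{G|` and `|G}}` delimiters come from the report.

```python
import re
from typing import Callable, Dict, List

# Matches {{G|name|G}} and captures the name between the delimiters.
TAG_PATTERN = re.compile(r"\{\{G\|(.+?)\|G\}\}")

# Hypothetical registry standing in for the tag pool.
tag_pool: Dict[str, Callable[[], str]] = {}


def resolve_tags(text: str) -> List[Callable[[], str]]:
    """Look up every tag found in text, naming any that are not registered."""
    names = TAG_PATTERN.findall(text)
    missing = [name for name in names if name not in tag_pool]
    if missing:
        raise LookupError(
            f"tags referenced before registration: {missing}; "
            f"known tags: {sorted(tag_pool)}"
        )
    return [tag_pool[name] for name in names]
```

Listing the known tags in the message makes it easier to tell a typo from a function that was simply decorated too late.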

Resource

ThreadPoolExecutor with default settings can handle concurrent parser compilation without memory limits or cleanup

If this fails: Heavy grammar compilation workloads can exhaust memory or file descriptors, causing the entire application to become unresponsive

guidance/_parser.py:_parser_cache
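One common mitigation is to bound both the worker count and the cache size, and to provide an explicit shutdown path. The sketch below shows that pattern with the standard library. `compile_grammar` is a placeholder for the expensive step, and `MAX_WORKERS` and `MAX_CACHED_GRAMMARS` are illustrative values, not settings taken from guidance.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import lru_cache

# Both limits are illustrative assumptions, not values from guidance.
MAX_WORKERS = 4
MAX_CACHED_GRAMMARS = 128

_executor = ThreadPoolExecutor(max_workers=MAX_WORKERS, thread_name_prefix="compile")


def compile_grammar(grammar: str) -> str:
    """Placeholder for an expensive compilation step."""
    return grammar.upper()


@lru_cache(maxsize=MAX_CACHED_GRAMMARS)
def submit_compile(grammar: str) -> "Future[str]":
    """Submit a compilation and cache its future; least recently used entries are evicted."""
    return _executor.submit(compile_grammar, grammar)


def shutdown() -> None:
    """Drop cached futures and release the worker threads."""
    submit_compile.cache_clear()
    _executor.shutdown(wait=True)
```

Repeated calls with the same grammar return the same cached future. Under concurrent first calls `lru_cache` may still run the function more than once, so this bounds resource use rather than guaranteeing exactly one compilation.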

Contract

Messages from the iframe contain valid JSON that can be parsed, but event.data structure is not validated

If this fails: Malformed messages from the sandboxed iframe can crash the widget with JSON parse errors or access undefined properties

packages/python/stitch/src/widget.ts:recvFromClient

Shape

GenData.mask is assumed to be a bytes object whose length matches the tokenizer vocabulary size, but the code never validates its length or type

If this fails: Mask size mismatches cause numpy indexing errors or wrong tokens being masked, leading to invalid generation results

guidance/_parser.py:TokenParser.process_token
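A shape check before the mask is used would catch both failure modes early. This sketch assumes one byte per vocabulary entry, as the finding describes; `validate_mask` is a hypothetical helper and not part of guidance.

```python
def validate_mask(mask: object, vocab_size: int) -> bytes:
    """Return the mask as bytes after checking its type and length.

    Assumes one byte per vocabulary entry. Hypothetical helper.
    """
    if not isinstance(mask, (bytes, bytearray)):
        raise TypeError(f"mask must be bytes, got {type(mask).__name__}")
    if len(mask) != vocab_size:
        raise ValueError(
            f"mask has {len(mask)} entries but the vocabulary has {vocab_size}"
        )
    return bytes(mask)
```

For example, `validate_mask(b"\x01\x00\x01", 3)` passes the mask through unchanged, while a four-entry vocabulary would raise a ValueError naming both sizes.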

Domain

The tag delimiters {{G| and |G}} will never appear naturally in user strings as literal text to be generated

If this fails: User content containing these exact delimiters gets incorrectly parsed as function tags, breaking generation or causing undefined tag errors

guidance/_ast.py:tag_start and tag_end
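A caller who passes untrusted text could screen it for the delimiters before interpolation. The sketch below is one possible guard; `find_literal_delimiters` and `check_user_text` are hypothetical names, and whether rejecting or escaping is the right policy depends on the application.

```python
from typing import List

# Delimiters as given in the report.
TAG_START = "{{G|"
TAG_END = "|G}}"


def find_literal_delimiters(user_text: str) -> List[str]:
    """Return whichever tag delimiters occur verbatim in the text."""
    return [delim for delim in (TAG_START, TAG_END) if delim in user_text]


def check_user_text(user_text: str) -> str:
    """Return the text unchanged, or raise if it contains a tag delimiter."""
    found = find_literal_delimiters(user_text)
    if found:
        raise ValueError(f"user text contains reserved tag delimiters: {found}")
    return user_text
```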

Temporal

ContextVar state persists correctly across async boundaries and thread switches without corruption

If this fails: In async environments, stateless flag may leak between different guidance function executions, causing unexpected state sharing

guidance/_guidance.py:_in_stateless_context
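The standard-library behavior behind this assumption can be seen in a small sketch. An asyncio task runs in a copy of the caller's context, so a flag set inside the task is not visible to the parent, and resetting with the token restores the previous value on exit. The `in_stateless` variable and `stateless` context manager here are hypothetical stand-ins, not the objects guidance defines.

```python
import asyncio
import contextvars
from contextlib import contextmanager
from typing import Iterator, Tuple

# Hypothetical stand-in for the stateless flag.
in_stateless = contextvars.ContextVar("in_stateless", default=False)


@contextmanager
def stateless() -> Iterator[None]:
    """Set the flag for the duration of the block, then restore the old value."""
    token = in_stateless.set(True)
    try:
        yield
    finally:
        in_stateless.reset(token)


async def child() -> bool:
    """Read the flag inside a task after yielding to the event loop."""
    with stateless():
        await asyncio.sleep(0)
        return in_stateless.get()


async def parent() -> Tuple[bool, bool]:
    """Return the flag as seen inside the child task and in the parent."""
    inside = await asyncio.create_task(child())
    return inside, in_stateless.get()
```

Running `asyncio.run(parent())` shows the flag set inside the task and unset in the parent. Work handed to `loop.run_in_executor` does not carry the caller's context, whereas `asyncio.to_thread` copies it, so thread hand-offs are where leaks or lost state are most likely to appear.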

Scale

100ms polling interval is appropriate for all model initialization speeds and doesn't cause performance issues

If this fails: Fast models waste CPU cycles with unnecessary polling; very slow models appear frozen to users who expect faster feedback

packages/python/stitch/src/widget.ts:refreshTimeMs

Environment

document.body is available as the mount target when the script loads, but the DOM may not be ready

If this fails: If the script loads before the DOM is ready, the Svelte app fails to mount and visualization features silently stop working

client/graphpaper-inline/src/main.js

Contract

The decorated function f is callable and accepts the kwargs dictionary parameters, but no signature validation occurs

If this fails: Functions with incompatible signatures cause runtime TypeError when executed rather than failing early during decoration

guidance/_ast.py:Function dataclass
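Early validation of this kind is possible with `inspect.signature(...).bind`, which applies the same argument-matching rules as a real call without executing the function. The sketch below is a hypothetical `check_call` helper, not how guidance's Function dataclass behaves.

```python
import inspect
from typing import Any, Callable, Dict


def check_call(f: Callable[..., Any], kwargs: Dict[str, Any]) -> None:
    """Raise TypeError if f is not callable or cannot accept kwargs.

    Hypothetical helper; some builtins expose no signature, in which
    case inspect.signature raises ValueError instead.
    """
    if not callable(f):
        raise TypeError(f"{f!r} is not callable")
    try:
        inspect.signature(f).bind(**kwargs)
    except TypeError as exc:
        name = getattr(f, "__name__", repr(f))
        raise TypeError(f"{name} cannot accept {sorted(kwargs)}: {exc}") from exc


def greet(name: str, punctuation: str = "!") -> str:
    """Example function used to exercise the check."""
    return f"hello {name}{punctuation}"
```

With this in place, `check_call(greet, {"name": "Ada"})` passes silently, while `check_call(greet, {"nom": "Ada"})` fails at decoration time rather than when the function finally runs.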


Frequently Asked Questions

What does guidance assume that could break in production?

The one most likely to cause trouble: the code assumes the tokenizer parameter has a _ll_tokenizer attribute that is compatible with llguidance.LLInterpreter, but it never checks that this attribute exists or is the right type. If this fails: when a custom tokenizer without _ll_tokenizer is used, parser initialization fails with an AttributeError instead of a clear validation error.

How many hidden assumptions does guidance have?

CodeSea found 12 assumptions guidance relies on but never validates, 4 of them critical, spanning Contract, Temporal, Environment, Ordering, Resource, Shape, Domain, Scale. Most are routine — the analysis flags the two or three most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.