Hidden Assumptions in dbt-core
13 assumptions this code never checks · 6 critical · spanning Environment, Shape, Domain, Resource, Contract, Temporal
Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at dbt-labs/dbt-core and picked out the 3 assumptions most likely to cause trouble; they are listed first. The remaining 10 are minor and appear under "Show everything".
KeyError crashes the process when any required environment variable is missing; an empty value is instead passed through to the GitHub API, where it causes authentication failures
requests.exceptions.ConnectionError or JSONDecodeError crashes the process when the GitHub API is down or returns a non-JSON response
KeyError or TypeError when the API response structure changes, breaking the version comparison logic
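A minimal sketch of how the first two failures could be turned into clear errors, assuming the four variable names this page lists for the latest-wrangler action. The helper names read_required_env, parse_json_list, and MissingEnvError are hypothetical and do not come from dbt-core.

```python
import json
import os
from typing import Dict, Mapping, Sequence

# Variable names as listed on this page for the latest-wrangler action.
REQUIRED_VARS = (
    "INPUT_PACKAGE_NAME",
    "INPUT_NEW_VERSION",
    "INPUT_GITHUB_TOKEN",
    "GITHUB_OUTPUT",
)


class MissingEnvError(RuntimeError):
    """Hypothetical error type: required variables are absent or empty."""


def read_required_env(
    env: Mapping[str, str] = os.environ,
    names: Sequence[str] = REQUIRED_VARS,
) -> Dict[str, str]:
    # Treat empty or whitespace-only values the same as missing ones.
    missing = [name for name in names if not env.get(name, "").strip()]
    if missing:
        raise MissingEnvError(
            "missing or empty environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}


def parse_json_list(body: str) -> list:
    # Convert a non-JSON body into one descriptive error instead of a raw traceback.
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response was not valid JSON: {exc}") from exc
    # Assumption: the caller expects a top-level JSON list.
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON list, got {type(data).__name__}")
    return data
```

Checking everything up front reports all missing variables in one message, rather than failing on the first lookup.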
Show everything (10 more)
published_versions list is never empty when checking max(published_versions) and max(published_patches)
If this fails: ValueError: max() arg is an empty sequence crashes when no published versions exist for a new package
.github/actions/latest-wrangler/main.py:_new_version_tags
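One way to avoid the empty-sequence crash is the default argument of the built-in max(). The sketch below assumes versions are comparable tuples; latest_published and is_latest are hypothetical names, not the functions in main.py.

```python
from typing import Optional, Sequence, Tuple

Version = Tuple[int, int, int]


def latest_published(published_versions: Sequence[Version]) -> Optional[Version]:
    # max() raises ValueError on an empty sequence unless a default is given.
    return max(published_versions, default=None)


def is_latest(new_version: Version, published_versions: Sequence[Version]) -> bool:
    current = latest_published(published_versions)
    # With nothing published yet, the new version is trivially the latest.
    return current is None or new_version >= current
```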
GITHUB_OUTPUT file path exists and is writable, and process has permissions to append to it
If this fails: FileNotFoundError or PermissionError when GitHub Actions runner environment doesn't provide writable output file
.github/actions/latest-wrangler/main.py:_register_tags
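A hedged sketch of a guarded write to the output file. It assumes the name=value line format that GitHub Actions reads from GITHUB_OUTPUT. The function append_output and its output_path parameter are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def append_output(name: str, value: str, output_path: Optional[str] = None) -> None:
    # Fall back to the runner-provided path when no explicit path is given.
    target = output_path or os.environ.get("GITHUB_OUTPUT")
    if not target:
        raise RuntimeError("GITHUB_OUTPUT is not set; cannot register outputs")
    path = Path(target)
    try:
        # Append mode creates the file if needed and preserves earlier outputs.
        with path.open("a", encoding="utf-8") as handle:
            handle.write(f"{name}={value}\n")
    except OSError as exc:
        # FileNotFoundError and PermissionError are both subclasses of OSError.
        raise RuntimeError(f"cannot append to {path}: {exc}") from exc
```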
.env file format is valid and usecwd=True finds the correct working directory from user context
If this fails: load_dotenv silently fails or loads wrong .env file when called from subprocess with different working directory
core/dbt/cli/main.py:load_dotenv
kwargs keys match valid Click parameter names for the target command, and values are correct types
If this fails: Click parameter validation errors or silent parameter ignoring when kwargs contain invalid parameter names or wrong types
core/dbt/cli/main.py:dbtRunner.invoke
callbacks list contains callable objects that accept EventMsg parameter and don't raise exceptions
If this fails: TypeError or unhandled exceptions when callback functions have wrong signature or raise during event processing
core/dbt/cli/main.py:dbtRunner
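A caller can defend against this on its own side by wrapping callbacks before handing them over. The sketch below is standalone and does not import dbt. guard_callbacks is a hypothetical helper, and the event is treated as an opaque object.

```python
import logging
from typing import Any, Callable, Iterable, List

logger = logging.getLogger(__name__)


def _guard(callback: Callable[[Any], Any]) -> Callable[[Any], None]:
    def wrapper(event: Any) -> None:
        try:
            callback(event)
        except Exception:
            # Log and continue so one failing callback does not stop the rest.
            logger.exception("event callback %r failed", callback)

    return wrapper


def guard_callbacks(callbacks: Iterable[Any]) -> List[Callable[[Any], None]]:
    guarded = []
    for callback in callbacks:
        # Reject non-callables early, before any event is emitted.
        if not callable(callback):
            raise TypeError(f"callback {callback!r} is not callable")
        guarded.append(_guard(callback))
    return guarded
```

Whether to swallow callback exceptions or re-raise them is a design choice; this version logs them and continues.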
path parameter is a valid file system path string suitable for use as a checksum
If this fails: Using the path as the checksum creates false cache hits when files with different contents share the same path string, breaking incremental parsing
core/dbt/artifacts/resources/base.py:FileHash.path
FileHash comparison is only used between FileHash instances with same hash algorithm (name field)
If this fails: False equality when comparing FileHashes with different algorithms but same checksum value, causing incorrect cache invalidation
core/dbt/artifacts/resources/base.py:FileHash.__eq__
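To illustrate the kind of check this assumption concerns, here is a minimal sketch of an equality method that compares the algorithm name as well as the checksum. HashSketch is a hypothetical stand-in with the two fields named above; it is not dbt-core's FileHash.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class HashSketch:
    name: str       # algorithm label, e.g. "sha256" or "path"
    checksum: str   # digest, or the path string when name is "path"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, HashSketch):
            return NotImplemented
        # Equal checksums only count when produced by the same algorithm.
        return self.name == other.name and self.checksum == other.checksum

    def __hash__(self) -> int:
        # Keep hashing consistent with the custom equality.
        return hash((self.name, self.checksum))
```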
Invalid doc_blocks configuration is acceptable to silently convert to empty list rather than failing fast
If this fails: User configuration errors silently ignored, documentation blocks disappear without warning, making debugging difficult
core/dbt/artifacts/resources/v1/components.py:_backcompat_doc_blocks
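A sketch of a louder alternative that still falls back to an empty list but warns first. It assumes a valid doc_blocks value is a list of strings, which this page does not confirm. normalize_doc_blocks is a hypothetical name.

```python
import warnings
from typing import Any, List


def normalize_doc_blocks(value: Any) -> List[str]:
    # An absent value is not an error; treat it as "no doc blocks".
    if value is None:
        return []
    # Assumption: a valid value is a list whose items are all strings.
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    # Surface the problem instead of silently discarding the user's config.
    warnings.warn(
        f"ignoring invalid doc_blocks value {value!r}; expected a list of strings",
        UserWarning,
        stacklevel=2,
    )
    return []
```

Raising an exception instead of warning would fail faster, at the cost of breaking configurations that currently load.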
All directories in sys.path with matching package name contain valid dbt submodules and won't conflict
If this fails: Module import errors or wrong module loading when conflicting packages exist in Python path with same dbt namespace
core/dbt/__init__.py:extend_path
System has sufficient memory to deep copy args list and create new Click context for each invocation
If this fails: MemoryError when args list is very large or system memory is constrained during programmatic usage
core/dbt/cli/main.py:dbtRunner.invoke
See the full structural analysis of dbt-core: the pipeline, data models, and system behavior that put these assumptions in context.
Frequently Asked Questions
What does dbt-core assume that could break in production?
The one most likely to cause trouble: the environment variables INPUT_PACKAGE_NAME, INPUT_NEW_VERSION, INPUT_GITHUB_TOKEN, and GITHUB_OUTPUT are always present and non-empty strings. If this fails, a KeyError crashes the process when any required variable is missing, or an empty string is passed to the GitHub API and causes authentication failures.
How many hidden assumptions does dbt-core have?
CodeSea found 13 assumptions dbt-core relies on but never validates, 6 of them critical, spanning Environment, Shape, Domain, Resource, Contract, Temporal. Most are routine; the analysis flags the 3 most likely to actually bite.
What is a hidden assumption?
Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails, often without warning.