Hidden Assumptions in dbt-core

13 assumptions this code never checks · 6 critical · spanning Environment, Shape, Domain, Resource, Contract, Temporal

Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at dbt-labs/dbt-core and picked out the 3 most likely to cause trouble here. Those come first; the other 10 are minor and are listed under "Show everything".

Worth your attention first

A KeyError crashes the script when any required environment variable is missing, and an empty string is passed through to the GitHub API, causing authentication failures
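
A minimal sketch of checking these variables up front, assuming the four names given in the FAQ on this page. The helper `require_env` and the `REQUIRED` list are hypothetical and not part of dbt-core; the point is only to report every missing or empty variable in one clear error instead of a bare KeyError.

```python
import os

# Names taken from the FAQ entry on this page.
REQUIRED = ["INPUT_PACKAGE_NAME", "INPUT_NEW_VERSION", "INPUT_GITHUB_TOKEN", "GITHUB_OUTPUT"]


def require_env(names, environ=None):
    """Return {name: value} for every name, or raise one error listing all problems.

    Hypothetical helper: a variable counts as a problem if it is absent,
    empty, or whitespace-only.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name, "").strip()]
    if missing:
        raise RuntimeError("missing or empty environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

Calling `require_env(REQUIRED)` at the top of a script would fail before any network request is made.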

requests.exceptions.ConnectionError or JSONDecodeError crashes the process when the GitHub API is down or returns a non-JSON response
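
One way to contain both failures is to convert them into a single, descriptive error. The sketch below assumes a hypothetical `fetch` callable standing in for the real HTTP call, so it runs without a network; `RegistryUnavailable` and `fetch_json` are illustrative names, not dbt-core code.

```python
import json


class RegistryUnavailable(Exception):
    """Raised when the remote API cannot be reached or returns unusable data."""


def fetch_json(fetch, url):
    """Call fetch(url) -> str and decode the body as JSON.

    `fetch` is a hypothetical stand-in for the HTTP request. The built-in
    ConnectionError is an OSError subclass, and the exceptions in
    requests.exceptions derive from IOError (an alias of OSError), so one
    except clause covers both.
    """
    try:
        body = fetch(url)
    except OSError as exc:
        raise RegistryUnavailable(f"could not reach {url}: {exc}") from exc
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise RegistryUnavailable(f"{url} returned a non-JSON body") from exc
```

A caller can then catch `RegistryUnavailable` once and decide whether to retry or exit with a readable message.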

KeyError or TypeError when the API response structure changes, or silently wrong results in the version comparison logic
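
A sketch of validating the response shape before using it. The page does not say what the response looks like, so the shape here (a list of dicts, each with a string "name" field) is purely an assumption for illustration, and `extract_version_names` is a hypothetical helper.

```python
def extract_version_names(payload):
    """Return the version strings from a decoded API payload.

    Assumed shape (hypothetical): a list of {"name": str, ...} dicts.
    Anything else raises ValueError with the offending entry, instead of a
    KeyError or TypeError deep inside comparison code.
    """
    if not isinstance(payload, list):
        raise ValueError(f"expected a list of versions, got {type(payload).__name__}")
    names = []
    for index, item in enumerate(payload):
        if not isinstance(item, dict) or not isinstance(item.get("name"), str):
            raise ValueError(f"entry {index} has no string 'name' field: {item!r}")
        names.append(item["name"])
    return names
```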

Show everything (10 more)

Domain

published_versions list is never empty when checking max(published_versions) and max(published_patches)

If this fails: max() raises "ValueError: max() arg is an empty sequence" and crashes when no published versions exist for a new package

.github/actions/latest-wrangler/main.py:_new_version_tags
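
The empty-list case can be handled with the `default` argument of `max()`. This is a minimal sketch, assuming versions are comparable values such as tuples; `is_new_latest` is a hypothetical helper, not the logic in `_new_version_tags`.

```python
def is_new_latest(new_version, published_versions):
    """Return True if new_version should be tagged as latest.

    With nothing published yet, max() returns the default (None) instead of
    raising ValueError, and the first release counts as latest.
    """
    latest = max(published_versions, default=None)
    return latest is None or new_version >= latest
```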

Resource

GITHUB_OUTPUT file path exists and is writable, and process has permissions to append to it

If this fails: FileNotFoundError or PermissionError when the GitHub Actions runner environment doesn't provide a writable output file

.github/actions/latest-wrangler/main.py:_register_tags
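
A sketch of writing a step output defensively. It relies on the GitHub Actions convention of appending `key=value` lines to the file named by GITHUB_OUTPUT; the function name `append_output` and its error messages are hypothetical.

```python
from pathlib import Path


def append_output(path, key, value):
    """Append key=value to the output file, with a clear error if it is unusable."""
    if not path:
        raise RuntimeError("GITHUB_OUTPUT is not set")
    target = Path(path)
    if not target.parent.is_dir():
        raise RuntimeError(f"output directory does not exist: {target.parent}")
    try:
        # Append mode creates the file if it is missing and keeps earlier outputs.
        with target.open("a", encoding="utf-8") as handle:
            handle.write(f"{key}={value}\n")
    except OSError as exc:
        # Covers PermissionError and other filesystem failures.
        raise RuntimeError(f"cannot write to {target}: {exc}") from exc
```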

Environment

.env file format is valid and usecwd=True finds the correct working directory from user context

If this fails: load_dotenv silently fails or loads wrong .env file when called from subprocess with different working directory

core/dbt/cli/main.py:load_dotenv

Shape

kwargs keys match valid Click parameter names for the target command, and values are correct types

If this fails: Click parameter validation errors, or silently ignored parameters, when kwargs contain invalid parameter names or wrong types

core/dbt/cli/main.py:dbtRunner.invoke

Contract

callbacks list contains callable objects that accept EventMsg parameter and don't raise exceptions

If this fails: TypeError or unhandled exceptions when callback functions have wrong signature or raise during event processing

core/dbt/cli/main.py:dbtRunner
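
One way to keep a misbehaving callback from interrupting event processing is to guard each call. The sketch below is generic: `dispatch` is a hypothetical function, and `event` stands in for an EventMsg. It does not describe how dbtRunner actually delivers events.

```python
import logging

logger = logging.getLogger(__name__)


def dispatch(event, callbacks):
    """Call each callback with the event and return how many completed.

    Non-callable entries are skipped with a warning; exceptions raised by a
    callback are logged with a traceback and do not stop the remaining ones.
    """
    delivered = 0
    for callback in callbacks:
        if not callable(callback):
            logger.warning("skipping non-callable callback: %r", callback)
            continue
        try:
            callback(event)
            delivered += 1
        except Exception:
            logger.exception("callback %r raised while handling %r", callback, event)
    return delivered
```

Whether swallowing callback errors is the right policy depends on the caller; failing fast with a clearer message is an equally reasonable choice.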

Domain

path parameter is a valid file system path string suitable for use as a checksum

If this fails: Using path as checksum creates false cache hits when different files have same path string, breaking incremental parsing

core/dbt/artifacts/resources/base.py:FileHash.path

Shape

FileHash comparison is only used between FileHash instances with same hash algorithm (name field)

If this fails: False equality when comparing FileHashes with different algorithms but same checksum value, causing incorrect cache invalidation

core/dbt/artifacts/resources/base.py:FileHash.__eq__
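
The comparison concern can be illustrated with a stand-in class. `Digest` below is hypothetical and is not dbt's FileHash; it only shows an equality check that treats the algorithm name as part of the identity, so equal checksum strings produced by different schemes never compare equal.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Digest:
    """Hypothetical stand-in for a (hash algorithm, checksum) pair."""

    name: str      # algorithm label, e.g. "sha256" or "path"
    checksum: str  # hex digest, or the path string for the "path" scheme

    def __eq__(self, other):
        if not isinstance(other, Digest):
            return NotImplemented
        # Both the algorithm and the value must match.
        return self.name == other.name and self.checksum == other.checksum
```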

Temporal

It is acceptable to silently convert an invalid doc_blocks configuration to an empty list rather than fail fast

If this fails: User configuration errors silently ignored, documentation blocks disappear without warning, making debugging difficult

core/dbt/artifacts/resources/v1/components.py:_backcompat_doc_blocks
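
A middle ground between failing fast and dropping the value silently is to keep the fallback but emit a warning. The sketch assumes, hypothetically, that a valid value is a list of strings; `normalize_doc_blocks` is an illustrative name and not the body of `_backcompat_doc_blocks`.

```python
import warnings


def normalize_doc_blocks(value):
    """Return a list of strings, warning when an invalid value is discarded.

    Assumption for this sketch: valid input is None or a list of str.
    """
    if value is None:
        return []
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    warnings.warn(f"ignoring invalid doc_blocks value: {value!r}", stacklevel=2)
    return []
```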

Contract

All directories in sys.path with matching package name contain valid dbt submodules and won't conflict

If this fails: Module import errors or wrong module loading when conflicting packages exist in Python path with same dbt namespace

core/dbt/__init__.py:extend_path

Resource

System has sufficient memory to deep copy args list and create new Click context for each invocation

If this fails: MemoryError when args list is very large or system memory is constrained during programmatic usage

core/dbt/cli/main.py:dbtRunner.invoke

Frequently Asked Questions

What does dbt-core assume that could break in production?

The one most likely to cause trouble: the environment variables INPUT_PACKAGE_NAME, INPUT_NEW_VERSION, INPUT_GITHUB_TOKEN, and GITHUB_OUTPUT are assumed to be always present and non-empty strings. If this fails, a KeyError crashes the script when any required variable is missing, and an empty string is passed through to the GitHub API, causing authentication failures.

How many hidden assumptions does dbt-core have?

CodeSea found 13 assumptions dbt-core relies on but never validates, 6 of them critical, spanning Environment, Shape, Domain, Resource, Contract, and Temporal. Most are routine; the analysis flags the 3 most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.