Hidden Assumptions in airbyte

13 assumptions this code never checks · 3 critical · spanning Resource, Shape, Contract, Temporal, Environment, Ordering, Domain, Scale

Every codebase relies on things it never checks. Most of them are routine. CodeSea looked at airbytehq/airbyte and picked out the few most likely to cause trouble. The full list is just below.

These 3 are the ones most likely to cause trouble here. The rest are minor and are listed under "Show everything".

Worth your attention first

metadata.yaml resource exists and is readable at startup time in classpath root

If this fails: ResourceNotFoundException causing immediate connector failure with no graceful degradation
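
This failure arises when the metadata.yaml resource is not on the classpath at startup. A minimal sketch of checking for it up front, assuming the file is loaded from the classpath by name. The helper readClasspathResourceOrNull is hypothetical and is not part of the airbyte CDK; it only shows how a missing file could become a value the caller handles rather than an exception.

```kotlin
// Hypothetical helper, not part of the airbyte CDK.
// Returns the resource text, or null when the resource is absent from the classpath.
fun readClasspathResourceOrNull(
    name: String,
    loader: ClassLoader = ClassLoader.getSystemClassLoader(),
): String? =
    loader.getResourceAsStream(name)
        ?.bufferedReader()
        ?.use { reader -> reader.readText() }
```

A caller could then pick a default or raise a clearer error, for example: readClasspathResourceOrNull("metadata.yaml") ?: error("metadata.yaml is missing from the classpath").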

ConfigErrorException with misleading error message when fallback serialization fails or schema mismatch causes validation errors

Silent null values passed to downstream components or incorrect operation selection when CLI parsing is inconsistent

Show everything (10 more)
Temporal

operationProvider.get() is idempotent and safe to call multiple times during error recovery

If this fails: Resource leaks or initialization side effects if operation creation is retried after partial failure

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/AirbyteConnectorRunnable.kt:run
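
One way to make repeated get() calls safe is to memoize the created value. The sketch below assumes the provider can be wrapped; MemoizedProvider is a hypothetical name and is not the CDK's operationProvider.

```kotlin
// Hypothetical wrapper, not the CDK's operationProvider.
// After the factory succeeds once, later get() calls reuse the same result.
// A factory that throws is run again on the next get(), so any side effects
// of a failed attempt are still the factory's responsibility to clean up.
class MemoizedProvider<T>(factory: () -> T) {
    private val value: T by lazy(factory)

    fun get(): T = value
}
```

This covers retries after a successful creation. It does not cover a factory that fails halfway through, which is the partial-failure case named above.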
Environment

System.getenv() and Instant.now() are available and functional when creating offset clock in test environment

If this fails: Clock factory fails to initialize in containerized or restricted environments where system time access is limited

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/ClockFactory.kt:offset
Shape

AirbyteStateMessage objects with type=null are always safe to filter out and represent empty/initial state

If this fails: Loss of valid state data if null type actually represents a legitimate state format from older protocol versions

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/command/InputStateFactory.kt:make
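
A minimal sketch of making the filter visible instead of silent. StateMessage is a hypothetical stand-in with only two fields; the real AirbyteStateMessage is richer, and how InputStateFactory treats untyped messages is not shown here.

```kotlin
// Hypothetical stand-in for a state message; the real AirbyteStateMessage has more fields.
data class StateMessage(val type: String?, val payload: String)

// Splits messages into (typed, untyped) rather than dropping the untyped ones,
// so the caller can log, count, or reject them.
fun splitByType(
    messages: List<StateMessage>,
): Pair<List<StateMessage>, List<StateMessage>> =
    messages.partition { it.type != null }
```

With the untyped messages returned separately, a caller can decide whether a non-empty second list is an error or a legacy format to convert.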
Contract

Operation.execute() handles its own error reporting and never throws exceptions that should be propagated to users differently than ConfigErrorException

If this fails: Important connector-specific errors get wrapped in generic failure messages, losing debugging context

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/AirbyteConnectorRunnable.kt:run
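
A sketch of wrapping a failure without losing its context, assuming the caller is free to choose the wrapper type. ConnectorFailure and runWithContext are hypothetical names for illustration, not CDK classes or functions.

```kotlin
// Hypothetical exception type for illustration; not a CDK class.
class ConnectorFailure(message: String, cause: Throwable) : RuntimeException(message, cause)

// Runs the block and, on failure, wraps the error while keeping the original
// exception as the cause and its type and message in the new message.
fun <T> runWithContext(operationName: String, block: () -> T): T =
    try {
        block()
    } catch (e: Exception) {
        throw ConnectorFailure(
            "$operationName failed: ${e.javaClass.simpleName}: ${e.message}",
            e,
        )
    }
```

Keeping the original exception as the cause means the full stack trace is still available to whoever logs the wrapper.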
Ordering

CLI operations are mutually exclusive and exactly one operation will be specified per invocation

If this fails: Undefined behavior if multiple operations somehow get through validation, potentially executing wrong operation

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/command/ConnectorCommandLinePropertySource.kt:resolveValues
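
The exactly-one rule can be checked explicitly. The sketch below assumes operations arrive as double-dash flags; the operation names and the resolveOperation function are illustrative and may not match how ConnectorCommandLinePropertySource parses arguments.

```kotlin
// Illustrative operation names; the real CLI flags and their parsing may differ.
val knownOperations = setOf("spec", "check", "discover", "read", "write")

// Returns the single requested operation, or throws when zero or several are present.
fun resolveOperation(args: List<String>): String {
    val requested = args
        .filter { it.startsWith("--") }
        .map { it.removePrefix("--") }
        .filter { it in knownOperations }
        .distinct()
    require(requested.size == 1) {
        "Expected exactly one operation, got ${requested.size}: $requested"
    }
    return requested.single()
}
```

Failing loudly on zero or multiple operations replaces undefined behavior with an error message that names what was received.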
Domain

Epoch timestamp 3133641600 (year 2069) is always safe as a future test timestamp that won't conflict with real data

If this fails: Test failures or incorrect behavior in systems that validate timestamps against reasonable ranges or perform date arithmetic

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/ClockFactory.kt:fakeNow
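
To see why range checks and date arithmetic are the risk, it helps to work the number through. The sketch below uses only java.time; the constant and function names are made up for this example and do not come from ClockFactory.

```kotlin
import java.time.Instant

// The constant from the assumption above, held as a Long.
const val FAKE_NOW_EPOCH_SECONDS: Long = 3133641600L

// True when the value exceeds the largest signed 32-bit integer (2147483647).
fun overflowsSignedInt32(epochSeconds: Long): Boolean = epochSeconds > Int.MAX_VALUE

// ISO-8601 rendering of the instant in UTC.
fun asUtcString(epochSeconds: Long): String = Instant.ofEpochSecond(epochSeconds).toString()
```

asUtcString(FAKE_NOW_EPOCH_SECONDS) gives 2069-04-20T00:00:00Z, and overflowsSignedInt32(FAKE_NOW_EPOCH_SECONDS) is true. Any consumer that stores epoch seconds in a signed 32-bit integer, or rejects dates past 2038, would mishandle this value.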
Environment

window.analytics Segment SDK is injected by Cloudflare and available when vote tracking executes

If this fails: Silent tracking failure with no user feedback when analytics is unavailable, potentially losing valuable user feedback data

docusaurus/src/theme/TOCItems/index.js:onVote
Scale

Stream validation is lightweight enough to run synchronously during catalog parsing for any number of streams

If this fails: Connector startup delays or timeouts when catalogs contain hundreds or thousands of streams

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/command/ConfiguredCatalogFactory.kt:validateConfiguredStream
Resource

DOM event listeners can be safely attached/detached on every render without memory leaks

If this fails: Memory leaks in single-page applications where components mount/unmount frequently

docusaurus/src/theme/NavbarItem/DropdownNavbarItem/Desktop/index.js:useEffect
Contract

YamlPropertySourceLoader.read() always returns Map<String, Any?> with string keys that start with 'data.' prefix

If this fails: ClassCastException or incorrect property mapping if YAML structure differs from expected format

airbyte-cdk/bulk/core/base/src/main/kotlin/io/airbyte/cdk/command/MetadataYamlPropertySource.kt:loadFromResource
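
A minimal sketch of checking the map's shape instead of casting it, assuming the loader's result can be treated as an untyped Map. stringKeyedWithPrefix is a hypothetical helper and does not call YamlPropertySourceLoader.

```kotlin
// Hypothetical helper: keeps only entries whose key is a String starting with the
// prefix, instead of casting the whole map to Map<String, Any?> unchecked.
fun stringKeyedWithPrefix(raw: Map<*, *>, prefix: String = "data."): Map<String, Any?> {
    val result = LinkedHashMap<String, Any?>()
    for ((key, value) in raw) {
        if (key is String && key.startsWith(prefix)) {
            result[key] = value
        }
    }
    return result
}
```

Entries with non-string keys or an unexpected prefix are skipped here. A stricter variant could collect them and report them, which would surface a changed YAML structure early.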

See the full structural analysis of airbyte: the pipeline, data models, and system behavior that put these assumptions in context.

Frequently Asked Questions

What does airbyte assume that could break in production?

The one most likely to cause trouble: the metadata.yaml resource exists and is readable in the classpath root at startup time. If this fails, a ResourceNotFoundException causes immediate connector failure with no graceful degradation.

How many hidden assumptions does airbyte have?

CodeSea found 13 assumptions airbyte relies on but never validates, 3 of them critical, spanning Resource, Shape, Contract, Temporal, Environment, Ordering, Domain, Scale. Most are routine; the analysis flags the 3 most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails, often silently.