Hidden Assumptions in keras

13 assumptions this code never checks · 3 critical · spanning Environment, Ordering, Scale, Temporal, Contract, Domain, Resource, Shape

Every codebase relies on things it never checks. Most of them are routine. CodeSea looked at keras-team/keras and picked out the few most likely to cause trouble. The full list is just below.

These 3 are the ones most likely to cause trouble here. The rest are minor and are listed under "Show everything".

Worth your attention first

If the disk is full or permissions are restricted, API generation fails silently and leaves the build directory in an inconsistent state, so subsequent Keras imports break with cryptic module-not-found errors
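One common way to make this kind of failure loud is to copy into a staging directory and move it into place only after the copy succeeds. The sketch below illustrates that pattern with `shutil.copytree`; the helper name `stage_and_publish` and its parameters are hypothetical and are not taken from api_gen.py.

```python
import os
import shutil
import tempfile


def stage_and_publish(src: str, dest: str) -> None:
    """Copy src to dest via a staging directory (hypothetical helper).

    The copy lands in a uniquely named sibling of dest first, so a failure
    part-way through (full disk, restricted permissions) does not leave a
    half-written dest behind.
    """
    parent = os.path.dirname(os.path.abspath(dest))
    staging = tempfile.mkdtemp(prefix=".staging_", dir=parent)
    try:
        # dirs_exist_ok (Python 3.8+) lets copytree fill the directory
        # that mkdtemp already created.
        shutil.copytree(src, staging, dirs_exist_ok=True)
    except OSError as exc:  # shutil.Error is a subclass of OSError
        shutil.rmtree(staging, ignore_errors=True)
        raise RuntimeError(
            f"copy of {src!r} failed; {dest!r} was left untouched"
        ) from exc
    # Replace the old tree only once the new one is complete. There is
    # still a brief window between these two calls where dest is absent.
    if os.path.exists(dest):
        shutil.rmtree(dest)
    os.replace(staging, dest)
```

The caller gets an exception that names both paths instead of discovering the problem later as an import error.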

If a username in assigneesList is deleted, renamed, or loses access, GitHub API calls fail silently and issues remain unassigned, breaking the automatic triage workflow

Large num_samples values (like 100000 with batch_size 1000) cause out-of-memory crashes during benchmark initialization, with no graceful degradation or memory usage estimation
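A rough size estimate before allocating would turn the crash into an early, readable error. This is a minimal sketch under stated assumptions: the function names, the 4-byte element size, and the sample shape in the usage note are illustrative and do not come from the benchmark code.

```python
import math


def estimate_input_bytes(num_samples, sample_shape, bytes_per_element=4):
    """Bytes needed to hold num_samples dense samples of sample_shape."""
    if num_samples <= 0 or any(d <= 0 for d in sample_shape):
        raise ValueError("num_samples and every dimension must be positive")
    return num_samples * math.prod(sample_shape) * bytes_per_element


def check_fits(num_samples, sample_shape, budget_bytes, bytes_per_element=4):
    """Raise before allocating if the estimate exceeds budget_bytes."""
    needed = estimate_input_bytes(num_samples, sample_shape, bytes_per_element)
    if needed > budget_bytes:
        raise MemoryError(
            f"input data needs ~{needed / 2**30:.1f} GiB, "
            f"budget is {budget_bytes / 2**30:.1f} GiB"
        )
    return needed
```

For example, 100000 float32 samples of an assumed shape (256, 256) come to 26,214,400,000 bytes, about 24.4 GiB, so `check_fits(100000, (256, 256), 8 * 2**30)` raises before any array is created. The estimate covers only the input data, not activations or framework overhead.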

Show everything (10 more)
Temporal

Assumes start_batch and stop_batch form a valid range where start_batch <= stop_batch and both are positive integers within the actual batch count

If this fails: When start_batch > stop_batch, or stop_batch exceeds the actual number of batches, timing measurements become invalid or the callback never triggers, producing misleading benchmark results

benchmarks/layer_benchmark/base_benchmark.py:BenchmarkMetricsCallback.__init__
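The range condition described in this entry can be checked up front. The helper below is a hypothetical sketch, not part of BenchmarkMetricsCallback; it takes "positive" literally as >= 1, which is an assumption that would need loosening if batches are counted from 0.

```python
def validate_batch_range(start_batch, stop_batch, total_batches=None):
    """Return (start_batch, stop_batch) if they form a usable range.

    Hypothetical helper. total_batches is optional because the batch
    count may not be known when the callback is constructed.
    """
    for name, value in (("start_batch", start_batch), ("stop_batch", stop_batch)):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")
        if value < 1:
            raise ValueError(f"{name} must be >= 1, got {value}")
    if start_batch > stop_batch:
        raise ValueError(
            f"start_batch ({start_batch}) must be <= stop_batch ({stop_batch})"
        )
    if total_batches is not None and stop_batch > total_batches:
        raise ValueError(
            f"stop_batch ({stop_batch}) exceeds total batches ({total_batches})"
        )
    return start_batch, stop_batch
```

Calling this at construction time means a bad range fails immediately instead of surfacing as a timing measurement that never starts or never stops.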
Contract

Assumes test files are consistently named with '_test.py' suffix across the entire codebase and that no non-test files accidentally use this naming pattern

If this fails: Test files that follow other naming conventions (test_*.py, tests.py) get included in the public API build, potentially exposing internal test utilities to end users

api_gen.py:ignore_files
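If the other naming conventions should be excluded as well, the standard library's `shutil.ignore_patterns` can express all three in one place. This is only a sketch: the constant name `TEST_FILE_IGNORE` is hypothetical, and whether to broaden the filter is a project decision rather than something api_gen.py is known to do.

```python
import shutil

# Hypothetical constant: one ignore callback covering the three naming
# conventions mentioned above. It can be passed to shutil.copytree via
# its `ignore` argument.
TEST_FILE_IGNORE = shutil.ignore_patterns("*_test.py", "test_*.py", "tests.py")


def non_test_files(directory, names):
    """Return the names in `names` that the ignore callback would keep."""
    ignored = TEST_FILE_IGNORE(directory, names)
    return [name for name in names if name not in ignored]
```

For a directory listing of `dense.py`, `dense_test.py`, `test_dense.py`, and `tests.py`, `non_test_files` keeps only `dense.py`.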
Domain

Assumes 'mixed_float16' is supported by the current backend and hardware without checking for tensor core availability or backend-specific precision support

If this fails: On hardware without tensor cores or backends that don't support mixed precision, training either falls back to float32 silently (losing performance benefits) or crashes with backend-specific precision errors

benchmarks/model_benchmark/bert_benchmark.py:mixed_precision_policy
Resource

Assumes XLA compilation is available and compatible with the current backend and model operations without checking backend capabilities or operation support

If this fails: When jit_compile=True but XLA is unavailable or incompatible with specific layer operations, compilation fails with cryptic XLA errors that don't clearly indicate the JIT flag as the issue

benchmarks/layer_benchmark/base_benchmark.py:FLAGS.jit_compile
Shape

Assumes issue_title and issue_description are strings that support toLowerCase() method without null/undefined checks

If this fails: When a GitHub webhook delivers an issue with a null title or description, the script crashes with a TypeError such as "Cannot read property 'toLowerCase' of null", and the labeler workflow run fails without labeling the issue

.github/workflows/scripts/labeler.js:issue_title.toLowerCase()
Environment

Assumes the package directory structure follows the expected keras/src/ layout and that _tf_keras/ directory creation won't conflict with existing files

If this fails: When the source directory structure changes, or _tf_keras already exists as a file instead of a directory, the legacy directory creation fails, breaking backward-compatibility features

api_gen.py:create_legacy_directory
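The file-versus-directory conflict can be detected before anything is created. The helper below is a hypothetical sketch (the name `ensure_directory` is not from api_gen.py) showing the check with `os.path` and `os.makedirs`.

```python
import os


def ensure_directory(path):
    """Create path as a directory, or fail clearly if a file is in the way.

    Hypothetical helper. os.makedirs(exist_ok=True) alone would also raise
    in this case, but with a less specific FileExistsError.
    """
    if os.path.exists(path) and not os.path.isdir(path):
        raise NotADirectoryError(f"{path!r} exists but is not a directory")
    os.makedirs(path, exist_ok=True)
    return path
```

The call is idempotent for an existing directory, so it is safe to run on every build.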
Contract

Assumes model names like 'bert_tiny_en_uncased' exist in keras_nlp registry and are accessible without version or availability checks

If this fails: When a keras_nlp release renames or removes models, the benchmark crashes with model-not-found errors, making performance regression testing unreliable

benchmarks/model_benchmark/bert_benchmark.py:MODEL_SIZE_MAP
Ordering

Assumes issues are processed sequentially for rotation logic to work correctly, but GitHub webhooks may deliver events out of order or in parallel

If this fails: Concurrent issue creation can break round-robin assignment logic, causing uneven distribution of issues to maintainers or assigning the same person multiple consecutive issues

.github/workflows/scripts/auto-assignment.js:assigneesList rotation
Scale

Assumes batch_size is appropriate for the selected backend and available memory without considering backend-specific batch size limitations or optimal sizes

If this fails: Very large batch sizes may exceed backend memory limits or very small ones may underutilize hardware, leading to misleading benchmark results that don't reflect real-world performance

benchmarks/layer_benchmark/base_benchmark.py:FLAGS.batch_size
Temporal

Assumes 'tmp_build_dir' name won't conflict with other processes or concurrent API generation runs in the same directory

If this fails: Multiple parallel API generation processes overwrite each other's build directories, corrupting the generated API and leaving an inconsistent public interface

api_gen.py:BUILD_DIR_NAME constant
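A fixed directory name can be replaced by a per-run unique one using `tempfile.mkdtemp`, which creates a directory no other caller will be handed. The context manager below is a minimal sketch; `unique_build_dir` and its `prefix` default are hypothetical names chosen for illustration.

```python
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def unique_build_dir(root, prefix="tmp_build_dir_"):
    """Yield a build directory unique to this run, removed on exit.

    Hypothetical helper: mkdtemp appends a random suffix to `prefix`, so
    two concurrent runs under the same root get different directories.
    """
    path = tempfile.mkdtemp(prefix=prefix, dir=root)
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)
```

Usage would look like `with unique_build_dir(".") as build_dir: ...`, with all intermediate files written under `build_dir`.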

See the full structural analysis of keras: the pipeline, data models, and system behavior that put these assumptions in context.

Frequently Asked Questions

What does keras assume that could break in production?

The one most likely to cause trouble: the code assumes the file system supports creating deeply nested directories and that shutil.copytree won't fail due to permissions, disk space, or file handle limits. If the disk is full or permissions are restricted, API generation fails silently and leaves the build directory in an inconsistent state, so subsequent Keras imports break with cryptic module-not-found errors.

How many hidden assumptions does keras have?

CodeSea found 13 assumptions keras relies on but never validates, 3 of them critical, spanning Environment, Ordering, Scale, Temporal, Contract, Domain, Resource, Shape. Most are routine; the analysis flags the two or three most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.