keras-team/keras
Deep Learning for humans
Compiles neural network models across TensorFlow, JAX, and PyTorch backends with a unified Python API
The system operates in two main flows: API generation where source code is scanned for @keras_export decorators to build the public interface, and runtime execution where user models compiled through the unified API are translated to backend-specific operations and executed on the chosen engine (TensorFlow/JAX/PyTorch).
Under the hood, the system uses 2 feedback loops, 3 data pools, and 4 control points to manage its runtime behavior.
A 6-component library. 965 files analyzed. Data flows through 5 distinct pipeline stages.
How Data Flows Through the System
- Scan source for exports — api_gen.py uses namex library to find @keras_export decorators in keras/src/ files and extract the public API surface definitions [source code with decorators → export registry]
- Generate public API — Creates modules in keras/api/ with proper imports and structure, mapping public names to internal implementations while maintaining backward compatibility [export registry → public API modules]
- Select backend at runtime — The KERAS_BACKEND environment variable or explicit configuration determines which backend (tensorflow/jax/torch/openvino) handles tensor operations [environment config → backend selection]
- Compile model operations — Layer operations defined in keras/src/ are translated through backend adapters to native operations for the selected execution engine [Model → backend-specific computation graph]
- Execute forward/backward passes — Training loop runs tensor computations on selected backend, with automatic differentiation and optimization handled by backend-specific implementations [Tensor → updated model weights]
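The backend-selection step above can be sketched in plain Python. This is a simplified, hypothetical model of the resolution logic (the real implementation lives inside Keras itself): read KERAS_BACKEND, fall back to the default, and reject unknown values early rather than failing later at import time.

```python
import os

# Backend names accepted by KERAS_BACKEND in Keras 3.
_ALLOWED_BACKENDS = {"tensorflow", "jax", "torch", "openvino"}


def resolve_backend(env=None):
    """Resolve the backend name from the environment, defaulting to
    "tensorflow" and rejecting values outside the supported set."""
    env = os.environ if env is None else env
    name = env.get("KERAS_BACKEND", "tensorflow").lower()
    if name not in _ALLOWED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of "
            f"{sorted(_ALLOWED_BACKENDS)}"
        )
    return name
```

Because the variable is read at import time, it must be set before `import keras` for the choice to take effect.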
Data Models
The data structures that flow between stages — the contracts that hold the system together.
keras/src/layers/layer.py — base class with call() method accepting tensors, state dict with weights/biases, and config dict with hyperparameters
Created with config parameters, weights initialized on first call, forward/backward passes during training, state saved/restored for checkpoints
keras/src/backend — backend-specific tensor (tf.Tensor, jax.Array, torch.Tensor) with shape metadata and dtype, wrapped in a unified interface
Created from numpy arrays or other tensors, flows through model layers with shape transformations, converted between backend formats as needed
keras/src/models — computational graph with layers list, input/output specs, compiled state with optimizer/loss/metrics
Built from layer stack, compiled with optimizer and loss function, trained on datasets, saved as weights or full model
benchmarks/layer_benchmark/base_benchmark.py — dict with layer_name: str, init_args: dict, input_shape: list, batch_size: int, jit_compile: bool
Configured via command-line flags, used to instantiate test layers, drives timing measurements across backends
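The benchmark configuration contract above can be made explicit as a typed structure. This is a hypothetical modeling — the real script passes these values around as absl command-line flags, not as a dataclass:

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkConfig:
    """Typed view of the benchmark config dict described above."""

    layer_name: str
    init_args: dict = field(default_factory=dict)
    input_shape: list = field(default_factory=list)
    batch_size: int = 32
    jit_compile: bool = False


cfg = BenchmarkConfig("Dense", {"units": 64}, [None, 128], batch_size=256)
```

Making the contract explicit like this documents which fields are required (only `layer_name`) and which have sensible defaults.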
Hidden Assumptions
Things this code relies on but never validates. These are the things that cause silent failures when the system changes.
Assumes the file system supports creating deeply nested directories and that shutil.copytree won't fail due to permissions, disk space, or file handle limits
If this fails: If disk is full or permissions are restricted, the API generation silently fails leaving the build directory in an inconsistent state, causing subsequent Keras imports to break with cryptic module not found errors
api_gen.py:copy_source_to_build_directory
Assumes assigneesList array members are valid GitHub usernames that exist and have repository access, processing them in array order for rotation without validation
If this fails: If a username in assigneesList is deleted, renamed, or loses access, GitHub API calls fail silently and issues remain unassigned, breaking the automatic triage workflow
.github/workflows/scripts/auto-assignment.js:module.exports
Assumes num_samples * batch_size tensor data fits in available memory without checking system memory limits or backend memory constraints
If this fails: Large num_samples values (like 100000 with batch_size 1000) cause out-of-memory crashes during benchmark initialization, with no graceful degradation or memory usage estimation
benchmarks/layer_benchmark/base_benchmark.py:FLAGS.num_samples
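A cheap pre-flight estimate would let the benchmark fail with a clear message instead of an out-of-memory crash. This guard is a hypothetical addition, not part of the benchmark script:

```python
def estimate_tensor_bytes(num_samples, input_shape, bytes_per_element=4):
    """Rough upper bound on the memory needed for the benchmark's
    input tensor, assuming float32 (4 bytes/element) by default."""
    total = num_samples * bytes_per_element
    for dim in input_shape:
        total *= dim
    return total


# 100_000 samples of shape (256, 256) in float32 is roughly 24 GiB —
# far beyond most single machines, so a guard here would catch the
# problem before allocation.
size_gib = estimate_tensor_bytes(100_000, [256, 256]) / 1024**3
```

Comparing the estimate against available memory (e.g. via `psutil.virtual_memory()`, if that dependency is acceptable) turns the crash into a readable error.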
Assumes start_batch and stop_batch form a valid range where start_batch <= stop_batch and both are positive integers within the actual batch count
If this fails: If start_batch > stop_batch or stop_batch exceeds actual batches, timing measurements become invalid or the callback never triggers, producing misleading benchmark results
benchmarks/layer_benchmark/base_benchmark.py:BenchmarkMetricsCallback.__init__
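A fail-fast guard on the measurement window would make invalid ranges impossible. The real `BenchmarkMetricsCallback` performs no such checks; this validation is a hypothetical addition:

```python
def validate_batch_range(start_batch, stop_batch, total_batches):
    """Reject impossible measurement windows before timing begins."""
    if start_batch < 0 or stop_batch < 0:
        raise ValueError("batch indices must be non-negative")
    if start_batch > stop_batch:
        raise ValueError(
            f"start_batch ({start_batch}) > stop_batch ({stop_batch})"
        )
    if stop_batch >= total_batches:
        raise ValueError(
            f"stop_batch ({stop_batch}) exceeds the "
            f"{total_batches} available batches"
        )
```

Called once in `__init__`, this turns silently-wrong timing numbers into an immediate, explicit error.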
Assumes test files are consistently named with '_test.py' suffix across the entire codebase and that no non-test files accidentally use this naming pattern
If this fails: If test files use different naming conventions (test_*.py, tests.py) they get included in the public API build, potentially exposing internal test utilities to end users
api_gen.py:ignore_files
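A broader filter would catch the alternative naming conventions mentioned above. The patterns below are illustrative — the real `ignore_files` helper matches only the `*_test.py` suffix:

```python
import fnmatch

# Conventions to exclude from the public API build; only the first
# pattern is what the real helper checks for.
TEST_FILE_PATTERNS = ("*_test.py", "test_*.py", "tests.py")


def is_test_file(filename):
    """Return True if the filename matches any known test convention."""
    return any(fnmatch.fnmatch(filename, pat) for pat in TEST_FILE_PATTERNS)
```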
Assumes 'mixed_float16' is supported by the current backend and hardware without checking for tensor core availability or backend-specific precision support
If this fails: On hardware without tensor cores or backends that don't support mixed precision, training either falls back to float32 silently (losing performance benefits) or crashes with backend-specific precision errors
benchmarks/model_benchmark/bert_benchmark.py:mixed_precision_policy
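A capability check before enabling mixed precision would make the fallback explicit rather than silent. The function below is a hypothetical sketch — the benchmark sets `mixed_float16` unconditionally, and the two capability flags would have to be probed from the backend and hardware:

```python
def pick_precision_policy(backend_supports_mixed, has_fast_fp16_hardware):
    """Choose mixed_float16 only when both the backend and the
    hardware can actually benefit; otherwise fall back to float32
    and make that choice visible to the caller."""
    if backend_supports_mixed and has_fast_fp16_hardware:
        return "mixed_float16"
    return "float32"
```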
Assumes XLA compilation is available and compatible with the current backend and model operations without checking backend capabilities or operation support
If this fails: When jit_compile=True but XLA is unavailable or incompatible with specific layer operations, compilation fails with cryptic XLA errors that don't clearly indicate the JIT flag as the issue
benchmarks/layer_benchmark/base_benchmark.py:FLAGS.jit_compile
Assumes issue_title and issue_description are strings that support toLowerCase() method without null/undefined checks
If this fails: If GitHub webhook delivers issues with null titles or descriptions, the script crashes with 'cannot read property toLowerCase of null', causing the labeler workflow to fail silently
.github/workflows/scripts/labeler.js:issue_title.toLowerCase()
Assumes the package directory structure follows the expected keras/src/ layout and that _tf_keras/ directory creation won't conflict with existing files
If this fails: If source directory structure changes or _tf_keras already exists as a file instead of directory, the legacy directory creation fails causing backward compatibility features to break
api_gen.py:create_legacy_directory
Assumes model names like 'bert_tiny_en_uncased' exist in keras_nlp registry and are accessible without version or availability checks
If this fails: If keras_nlp version changes model names or removes models, benchmark crashes with model not found errors, making performance regression testing unreliable
benchmarks/model_benchmark/bert_benchmark.py:MODEL_SIZE_MAP
System Behavior
How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- Model weights registry — Maintains trainable parameters for all model layers, synchronized across backend transitions and checkpointing
- Backend operation cache — Caches compiled operations for each backend to avoid recompilation when switching between backends or reusing models
- Benchmark metrics — Accumulates timing measurements and throughput metrics during performance testing across different layer types and configurations
Feedback Loops
- Backend switching optimization (auto-scale, reinforcing) — Trigger: Performance benchmarks showing one backend significantly faster for given model architecture. Action: Framework automatically recommends or switches to optimal backend based on measured throughput. Exit: User explicitly locks backend choice or performance difference becomes negligible.
- API generation validation (self-correction, balancing) — Trigger: Changes to @keras_export decorators or source file structure. Action: api_gen.py regenerates public API and validates imports resolve correctly. Exit: All public APIs successfully import and pass validation checks.
Delays
- Backend initialization (warmup, ~first model compilation) — First model compilation on each backend takes extra time to initialize backend-specific components and compile operations
- API generation (compilation, ~seconds to minutes) — Public API must be regenerated when source exports change, blocking development workflow until completion
- JIT compilation (compilation, ~varies by model size) — When jit_compile=True, first execution includes compilation time affecting benchmark accuracy
Control Points
- Backend selection (env-var) — Controls: Which execution engine (tensorflow/jax/torch/openvino) handles tensor operations. Default: tensorflow, overridden via the KERAS_BACKEND environment variable
- Mixed precision policy (runtime-toggle) — Controls: Whether to use mixed_float16 for performance or float32 for numerical stability. Default: mixed_float16
- JIT compilation (architecture-switch) — Controls: Whether to enable XLA compilation for performance optimization during benchmarking. Default: jit_compile flag
- Batch size (hyperparameter) — Controls: Memory usage and computational efficiency during training and inference. Default: configurable via command line
Technology Stack
- namex — Scans Python source code for decorator patterns and generates API surface mappings during the build process
- TensorFlow — One of four supported backends for executing neural network operations, providing GPU acceleration and distributed training
- JAX — High-performance backend option for neural network execution, often the fastest for certain model architectures
- PyTorch — Backend providing eager execution and debugging capabilities for neural network operations
- absl — Provides command-line flag parsing and logging for benchmark scripts and development tools
- h5py — Handles model serialization and checkpoint saving/loading in HDF5 format
- Tree utilities — Provides tree manipulation utilities for handling nested tensor structures across backends
Key Components
- LayerBenchmark (orchestrator) — Coordinates performance testing of individual layers by creating instances, generating test data, and measuring execution times across training and inference
benchmarks/layer_benchmark/base_benchmark.py
- BenchmarkMetricsCallback (monitor) — Tracks throughput and timing during model training by hooking into batch begin/end events and calculating samples processed per second
benchmarks/layer_benchmark/base_benchmark.py
- api_gen (generator) — Generates the public API surface by scanning keras/src/ for @keras_export decorators and creating corresponding modules in keras/api/ with proper imports
api_gen.py
- copy_source_to_build_directory (processor) — Prepares the build environment by copying source files to a temporary directory, excluding test files, and setting up the package structure for API generation
api_gen.py
- auto-assignment (dispatcher) — Automatically assigns GitHub issues and PRs to maintainers on a rotating basis using predefined assignee lists for different types of contributions
.github/workflows/scripts/auto-assignment.js
- labeler (processor) — Analyzes issue and PR titles/descriptions for keywords like 'gemma' and automatically applies corresponding labels to categorize contributions
.github/workflows/scripts/labeler.js
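The throughput-monitoring idea behind BenchmarkMetricsCallback can be illustrated with a stdlib-only stand-in. This is a hypothetical sketch of the begin/end-hook pattern, not the real callback (which subclasses keras.callbacks.Callback):

```python
import time


class ThroughputTracker:
    """Time a window of batches via begin/end hooks and report
    samples processed per second."""

    def __init__(self, batch_size, start_batch=0, stop_batch=None):
        self.batch_size = batch_size
        self.start_batch = start_batch
        self.stop_batch = stop_batch
        self._start = None
        self._elapsed = 0.0
        self._batches = 0

    def on_batch_begin(self, batch):
        # Only time batches inside the configured measurement window.
        in_window = batch >= self.start_batch and (
            self.stop_batch is None or batch <= self.stop_batch
        )
        if in_window:
            self._start = time.perf_counter()

    def on_batch_end(self, batch):
        if self._start is not None:
            self._elapsed += time.perf_counter() - self._start
            self._batches += 1
            self._start = None

    def samples_per_second(self):
        if self._elapsed == 0:
            return 0.0
        return self._batches * self.batch_size / self._elapsed
```

Skipping the first batches (via `start_batch`) is what lets a tracker like this exclude one-time warmup costs such as JIT compilation from the measured throughput.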
Frequently Asked Questions
What is keras used for?
keras-team/keras compiles neural network models across TensorFlow, JAX, and PyTorch backends with a unified Python API. It is a 6-component library written in Python; data flows through 5 distinct pipeline stages, and the codebase contains 965 files.
How is keras architected?
keras is organized into 4 architecture layers: Public API Surface, Core Implementation, Backend Adapters, Benchmarking Suite. Data flows through 5 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.
How does data flow through keras?
Data moves through 5 stages: Scan source for exports → Generate public API → Select backend at runtime → Compile model operations → Execute forward/backward passes. The system operates in two main flows: API generation where source code is scanned for @keras_export decorators to build the public interface, and runtime execution where user models compiled through the unified API are translated to backend-specific operations and executed on the chosen engine (TensorFlow/JAX/PyTorch). This pipeline design reflects a complex multi-stage processing system.
What technologies does keras use?
The core stack includes namex (Scans Python source code for decorator patterns and generates API surface mappings during build process), TensorFlow (One of four supported backends for executing neural network operations, providing GPU acceleration and distributed training), JAX (High-performance backend option for neural network execution, often fastest for certain model architectures), PyTorch (Backend providing eager execution and debugging capabilities for neural network operations), absl (Provides command-line flag parsing and logging for benchmark scripts and development tools), h5py (Handles model serialization and checkpoint saving/loading in HDF5 format), and 1 more. A focused set of dependencies that keeps the build manageable.
What system dynamics does keras have?
keras exhibits 3 data pools (Model weights registry, Backend operation cache), 2 feedback loops, 4 control points, 3 delays. The feedback loops handle auto-scale and self-correction. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does keras use?
4 design patterns detected: Multi-backend abstraction, Decorator-based API export, Benchmark-driven optimization, GitHub automation.
Analyzed on April 20, 2026 by CodeSea. Written by Karolina Sarna.