keras-team/keras
Deep Learning for humans
Compiles neural network models across TensorFlow, JAX, and PyTorch backends with a unified Python API
The system operates in two main flows: API generation where source code is scanned for @keras_export decorators to build the public interface, and runtime execution where user models compiled through the unified API are translated to backend-specific operations and executed on the chosen engine (TensorFlow/JAX/PyTorch).
Under the hood, the system uses 2 feedback loops, 3 data pools, and 4 control points to manage its runtime behavior.
A 6-component library. 965 files analyzed. Data flows through 5 distinct pipeline stages.
How Data Flows Through the System
- Scan source for exports — api_gen.py uses namex library to find @keras_export decorators in keras/src/ files and extract the public API surface definitions [source code with decorators → export registry]
- Generate public API — Creates modules in keras/api/ with proper imports and structure, mapping public names to internal implementations while maintaining backward compatibility [export registry → public API modules]
- Select backend at runtime — The KERAS_BACKEND environment variable or explicit configuration determines which backend (tensorflow/jax/torch/openvino) handles tensor operations [environment config → backend selection]
- Compile model operations — Layer operations defined in keras/src/ are translated through backend adapters to native operations for the selected execution engine [Model → backend-specific computation graph]
- Execute forward/backward passes — Training loop runs tensor computations on selected backend, with automatic differentiation and optimization handled by backend-specific implementations [Tensor → updated model weights]
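The backend-selection step above can be sketched in plain Python. This is a simplified, hypothetical model of the resolution logic (the real implementation lives inside Keras itself): read KERAS_BACKEND, fall back to the default, and reject unknown values early rather than failing later at import time.

```python
import os

# Backend names accepted by KERAS_BACKEND in Keras 3.
_ALLOWED_BACKENDS = {"tensorflow", "jax", "torch", "openvino"}


def resolve_backend(env=None):
    """Resolve the backend name from the environment, defaulting to
    "tensorflow" and rejecting values outside the supported set."""
    env = os.environ if env is None else env
    name = env.get("KERAS_BACKEND", "tensorflow").lower()
    if name not in _ALLOWED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of "
            f"{sorted(_ALLOWED_BACKENDS)}"
        )
    return name
```

Because the variable is read at import time, it must be set before `import keras` for the choice to take effect.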
Data Models
The data structures that flow between stages — the contracts that hold the system together.
keras/src/layers/layer.py — base class with call() method accepting tensors, state dict with weights/biases, and config dict with hyperparameters
Created with config parameters, weights initialized on first call, forward/backward passes during training, state saved/restored for checkpoints
keras/src/backend — backend-specific tensor (tf.Tensor, jax.Array, torch.Tensor) with shape metadata and dtype, wrapped in a unified interface
Created from numpy arrays or other tensors, flows through model layers with shape transformations, converted between backend formats as needed
keras/src/models — computational graph with layers list, input/output specs, compiled state with optimizer/loss/metrics
Built from layer stack, compiled with optimizer and loss function, trained on datasets, saved as weights or full model
benchmarks/layer_benchmark/base_benchmark.py — dict with layer_name: str, init_args: dict, input_shape: list, batch_size: int, jit_compile: bool
Configured via command-line flags, used to instantiate test layers, drives timing measurements across backends
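The benchmark configuration contract above can be made explicit as a typed structure. This is a hypothetical modeling — the real script passes these values around as absl command-line flags, not as a dataclass:

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkConfig:
    """Typed view of the benchmark config dict described above."""

    layer_name: str
    init_args: dict = field(default_factory=dict)
    input_shape: list = field(default_factory=list)
    batch_size: int = 32
    jit_compile: bool = False


cfg = BenchmarkConfig("Dense", {"units": 64}, [None, 128], batch_size=256)
```

Making the contract explicit like this documents which fields are required (only `layer_name`) and which have sensible defaults.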
Hidden Assumptions
Things this code relies on but never validates. These are the things that cause silent failures when the system changes.
Assumes the file system supports creating deeply nested directories and that shutil.copytree won't fail due to permissions, disk space, or file handle limits
If this fails: If disk is full or permissions are restricted, the API generation silently fails leaving the build directory in an inconsistent state, causing subsequent Keras imports to break with cryptic module not found errors
api_gen.py:copy_source_to_build_directory
Assumes assigneesList array members are valid GitHub usernames that exist and have repository access, processing them in array order for rotation without validation
If this fails: If a username in assigneesList is deleted, renamed, or loses access, GitHub API calls fail silently and issues remain unassigned, breaking the automatic triage workflow
.github/workflows/scripts/auto-assignment.js:module.exports
Assumes num_samples * batch_size tensor data fits in available memory without checking system memory limits or backend memory constraints
If this fails: Large num_samples values (like 100000 with batch_size 1000) cause out-of-memory crashes during benchmark initialization, with no graceful degradation or memory usage estimation
benchmarks/layer_benchmark/base_benchmark.py:FLAGS.num_samples
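A cheap pre-flight estimate would let the benchmark fail with a clear message instead of an out-of-memory crash. This guard is a hypothetical addition, not part of the benchmark script:

```python
def estimate_tensor_bytes(num_samples, input_shape, bytes_per_element=4):
    """Rough upper bound on the memory needed for the benchmark's
    input tensor, assuming float32 (4 bytes/element) by default."""
    total = num_samples * bytes_per_element
    for dim in input_shape:
        total *= dim
    return total


# 100_000 samples of shape (256, 256) in float32 is roughly 24 GiB —
# far beyond most single machines, so a guard here would catch the
# problem before allocation.
size_gib = estimate_tensor_bytes(100_000, [256, 256]) / 1024**3
```

Comparing the estimate against available memory (e.g. via `psutil.virtual_memory()`, if that dependency is acceptable) turns the crash into a readable error.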
Assumes start_batch and stop_batch form a valid range where start_batch <= stop_batch and both are positive integers within the actual batch count
If this fails: If start_batch > stop_batch or stop_batch exceeds actual batches, timing measurements become invalid or the callback never triggers, producing misleading benchmark results
benchmarks/layer_benchmark/base_benchmark.py:BenchmarkMetricsCallback.__init__
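A fail-fast guard on the measurement window would make invalid ranges impossible. The real `BenchmarkMetricsCallback` performs no such checks; this validation is a hypothetical addition:

```python
def validate_batch_range(start_batch, stop_batch, total_batches):
    """Reject impossible measurement windows before timing begins."""
    if start_batch < 0 or stop_batch < 0:
        raise ValueError("batch indices must be non-negative")
    if start_batch > stop_batch:
        raise ValueError(
            f"start_batch ({start_batch}) > stop_batch ({stop_batch})"
        )
    if stop_batch >= total_batches:
        raise ValueError(
            f"stop_batch ({stop_batch}) exceeds the "
            f"{total_batches} available batches"
        )
```

Called once in `__init__`, this turns silently-wrong timing numbers into an immediate, explicit error.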
Assumes test files are consistently named with '_test.py' suffix across the entire codebase and that no non-test files accidentally use this naming pattern
If this fails: If test files use different naming conventions (test_*.py, tests.py) they get included in the public API build, potentially exposing internal test utilities to end users
api_gen.py:ignore_files
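A broader filter would catch the alternative naming conventions mentioned above. The patterns below are illustrative — the real `ignore_files` helper matches only the `*_test.py` suffix:

```python
import fnmatch

# Conventions to exclude from the public API build; only the first
# pattern is what the real helper checks for.
TEST_FILE_PATTERNS = ("*_test.py", "test_*.py", "tests.py")


def is_test_file(filename):
    """Return True if the filename matches any known test convention."""
    return any(fnmatch.fnmatch(filename, pat) for pat in TEST_FILE_PATTERNS)
```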
Assumes 'mixed_float16' is supported by the current backend and hardware without checking for tensor core availability or backend-specific precision support
If this fails: On hardware without tensor cores or backends that don't support mixed precision, training either falls back to float32 silently (losing performance benefits) or crashes with backend-specific precision errors
benchmarks/model_benchmark/bert_benchmark.py:mixed_precision_policy
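A capability check before enabling mixed precision would make the fallback explicit rather than silent. The function below is a hypothetical sketch — the benchmark sets `mixed_float16` unconditionally, and the two capability flags would have to be probed from the backend and hardware:

```python
def pick_precision_policy(backend_supports_mixed, has_fast_fp16_hardware):
    """Choose mixed_float16 only when both the backend and the
    hardware can actually benefit; otherwise fall back to float32
    and make that choice visible to the caller."""
    if backend_supports_mixed and has_fast_fp16_hardware:
        return "mixed_float16"
    return "float32"
```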
Assumes XLA compilation is available and compatible with the current backend and model operations without checking backend capabilities or operation support
If this fails: When jit_compile=True but XLA is unavailable or incompatible with specific layer operations, compilation fails with cryptic XLA errors that don't clearly indicate the JIT flag as the issue
benchmarks/layer_benchmark/base_benchmark.py:FLAGS.jit_compile
Assumes issue_title and issue_description are strings that support toLowerCase() method without null/undefined checks
If this fails: If GitHub webhook delivers issues with null titles or descriptions, the script crashes with 'cannot read property toLowerCase of null', causing the labeler workflow to fail silently
.github/workflows/scripts/labeler.js:issue_title.toLowerCase()
Assumes the package directory structure follows the expected keras/src/ layout and that _tf_keras/ directory creation won't conflict with existing files
If this fails: If source directory structure changes or _tf_keras already exists as a file instead of directory, the legacy directory creation fails causing backward compatibility features to break
api_gen.py:create_legacy_directory
Assumes model names like 'bert_tiny_en_uncased' exist in keras_nlp registry and are accessible without version or availability checks
If this fails: If keras_nlp version changes model names or removes models, benchmark crashes with model not found errors, making performance regression testing unreliable
benchmarks/model_benchmark/bert_benchmark.py:MODEL_SIZE_MAP
System Behavior
How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- Model weights registry — Maintains trainable parameters for all model layers, synchronized across backend transitions and checkpointing
- Backend operation cache — Caches compiled operations for each backend to avoid recompilation when switching between backends or reusing models
- Benchmark metrics — Accumulates timing measurements and throughput metrics during performance testing across different layer types and configurations
Feedback Loops
- Backend switching optimization (auto-scale, reinforcing) — Trigger: Performance benchmarks showing one backend significantly faster for given model architecture. Action: Framework automatically recommends or switches to optimal backend based on measured throughput. Exit: User explicitly locks backend choice or performance difference becomes negligible.
- API generation validation (self-correction, balancing) — Trigger: Changes to @keras_export decorators or source file structure. Action: api_gen.py regenerates public API and validates imports resolve correctly. Exit: All public APIs successfully import and pass validation checks.
Delays
- Backend initialization (warmup, ~first model compilation) — First model compilation on each backend takes extra time to initialize backend-specific components and compile operations
- API generation (compilation, ~seconds to minutes) — Public API must be regenerated when source exports change, blocking development workflow until completion
- JIT compilation (compilation, ~varies by model size) — When jit_compile=True, first execution includes compilation time affecting benchmark accuracy
Control Points
- Backend selection (env-var) — Controls: Which execution engine (tensorflow/jax/torch/openvino) handles tensor operations. Default: tensorflow, overridden via the KERAS_BACKEND environment variable
- Mixed precision policy (runtime-toggle) — Controls: Whether to use mixed_float16 for performance or float32 for numerical stability. Default: mixed_float16
- JIT compilation (architecture-switch) — Controls: Whether to enable XLA compilation for performance optimization during benchmarking. Default: jit_compile flag
- Batch size (hyperparameter) — Controls: Memory usage and computational efficiency during training and inference. Default: configurable via command line
Technology Stack
- namex — Scans Python source code for decorator patterns and generates API surface mappings during the build process
- TensorFlow — One of four supported backends for executing neural network operations, providing GPU acceleration and distributed training
- JAX — High-performance backend option for neural network execution, often the fastest for certain model architectures
- PyTorch — Backend providing eager execution and debugging capabilities for neural network operations
- absl — Provides command-line flag parsing and logging for benchmark scripts and development tools
- h5py — Handles model serialization and checkpoint saving/loading in HDF5 format
- Tree utilities — Provides tree manipulation utilities for handling nested tensor structures across backends
Key Components
- LayerBenchmark (orchestrator) — Coordinates performance testing of individual layers by creating instances, generating test data, and measuring execution times across training and inference
benchmarks/layer_benchmark/base_benchmark.py
- BenchmarkMetricsCallback (monitor) — Tracks throughput and timing during model training by hooking into batch begin/end events and calculating samples processed per second
benchmarks/layer_benchmark/base_benchmark.py
- api_gen (generator) — Generates the public API surface by scanning keras/src/ for @keras_export decorators and creating corresponding modules in keras/api/ with proper imports
api_gen.py
- copy_source_to_build_directory (processor) — Prepares the build environment by copying source files to a temporary directory, excluding test files, and setting up the package structure for API generation
api_gen.py
- auto-assignment (dispatcher) — Automatically assigns GitHub issues and PRs to maintainers on a rotating basis using predefined assignee lists for different types of contributions
.github/workflows/scripts/auto-assignment.js
- labeler (processor) — Analyzes issue and PR titles/descriptions for keywords like 'gemma' and automatically applies corresponding labels to categorize contributions
.github/workflows/scripts/labeler.js
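The throughput-monitoring idea behind BenchmarkMetricsCallback can be illustrated with a stdlib-only stand-in. This is a hypothetical sketch of the begin/end-hook pattern, not the real callback (which subclasses keras.callbacks.Callback):

```python
import time


class ThroughputTracker:
    """Time a window of batches via begin/end hooks and report
    samples processed per second."""

    def __init__(self, batch_size, start_batch=0, stop_batch=None):
        self.batch_size = batch_size
        self.start_batch = start_batch
        self.stop_batch = stop_batch
        self._start = None
        self._elapsed = 0.0
        self._batches = 0

    def on_batch_begin(self, batch):
        # Only time batches inside the configured measurement window.
        in_window = batch >= self.start_batch and (
            self.stop_batch is None or batch <= self.stop_batch
        )
        if in_window:
            self._start = time.perf_counter()

    def on_batch_end(self, batch):
        if self._start is not None:
            self._elapsed += time.perf_counter() - self._start
            self._batches += 1
            self._start = None

    def samples_per_second(self):
        if self._elapsed == 0:
            return 0.0
        return self._batches * self.batch_size / self._elapsed
```

Skipping the first batches (via `start_batch`) is what lets a tracker like this exclude one-time warmup costs such as JIT compilation from the measured throughput.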
Frequently Asked Questions
What is keras used for?
keras-team/keras compiles neural network models across TensorFlow, JAX, and PyTorch backends with a unified Python API. It is a 6-component library written in Python; data flows through 5 distinct pipeline stages, and the codebase contains 965 files.
How is keras architected?
keras is organized into 4 architecture layers: Public API Surface, Core Implementation, Backend Adapters, Benchmarking Suite. Data flows through 5 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.
How does data flow through keras?
Data moves through 5 stages: Scan source for exports → Generate public API → Select backend at runtime → Compile model operations → Execute forward/backward passes. The system operates in two main flows: API generation where source code is scanned for @keras_export decorators to build the public interface, and runtime execution where user models compiled through the unified API are translated to backend-specific operations and executed on the chosen engine (TensorFlow/JAX/PyTorch). This pipeline design reflects a complex multi-stage processing system.
What technologies does keras use?
The core stack includes namex (Scans Python source code for decorator patterns and generates API surface mappings during build process), TensorFlow (One of four supported backends for executing neural network operations, providing GPU acceleration and distributed training), JAX (High-performance backend option for neural network execution, often fastest for certain model architectures), PyTorch (Backend providing eager execution and debugging capabilities for neural network operations), absl (Provides command-line flag parsing and logging for benchmark scripts and development tools), h5py (Handles model serialization and checkpoint saving/loading in HDF5 format), and 1 more. A focused set of dependencies that keeps the build manageable.
What system dynamics does keras have?
keras exhibits 3 data pools (Model weights registry, Backend operation cache), 2 feedback loops, 4 control points, 3 delays. The feedback loops handle auto-scale and self-correction. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does keras use?
4 design patterns detected: Multi-backend abstraction, Decorator-based API export, Benchmark-driven optimization, GitHub automation.
Analyzed on April 20, 2026 by CodeSea. Written by Karolina Sarna.