google/flax

Flax is a neural network library for JAX that is designed for flexibility.

7,141 stars · Jupyter Notebook · 13 components · 15 connections

Neural network library for JAX with two APIs: NNX and Linen

Training data flows through model forward pass, loss computation, gradient calculation, and parameter updates using JAX transformations

Under the hood, the system uses 3 feedback loops, 4 data pools, and 4 control points to manage its runtime behavior.

Structural Verdict

A 13-component ML training system with 15 connections. 343 files analyzed. Well-connected, with clear data flow between components.

How Data Flows Through the System

  1. Data Loading — Load and preprocess training batches from datasets (config: per_device_batch_size, dataset_name)
  2. Forward Pass — Pass input through neural network layers to compute predictions
  3. Loss Calculation — Compute loss between predictions and ground truth labels
  4. Gradient Computation — Use JAX grad to compute gradients of loss with respect to parameters
  5. Parameter Update — Apply optimizer (AdamW, SGD) to update model parameters using gradients
  6. Checkpointing — Save model state and training progress to disk periodically
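
The six steps above can be sketched in plain JAX. This is an illustrative toy (a single linear layer, synthetic data, hand-rolled SGD), not Flax's actual training utilities; only the use of `jax.value_and_grad` and `jax.jit` reflects the real mechanism described here:

```python
import jax
import jax.numpy as jnp

def init_params(key):
    k1, _ = jax.random.split(key)
    return {
        "w": jax.random.normal(k1, (3, 1)) * 0.1,
        "b": jnp.zeros((1,)),
    }

def forward(params, x):
    # Forward pass: a single linear layer.
    return x @ params["w"] + params["b"]

def loss_fn(params, x, y):
    # Mean squared error between predictions and targets.
    preds = forward(params, x)
    return jnp.mean((preds - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Gradient computation via JAX autodiff, then a plain SGD update.
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss

key = jax.random.PRNGKey(0)
key, init_key, data_key = jax.random.split(key, 3)
params = init_params(init_key)
x = jax.random.normal(data_key, (32, 3))
y = x @ jnp.ones((3, 1))  # synthetic regression targets

for _ in range(100):  # the training loop: steps 2-5 repeated per batch
    params, loss = train_step(params, x, y)
```

In a real Flax example, the model would be an NNX or Linen module, the optimizer would come from Optax, and step 6 would periodically save `params` via Orbax.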

System Behavior

How the system actually operates at runtime: where data accumulates, what loops back, what waits, and what controls what.

Data Pools

Model Parameters (state-store)
Neural network weights and biases stored in Variable objects
Optimizer State (state-store)
Momentum, learning rate schedules, and other optimizer internal state
Checkpoints (file-store)
Serialized model and training state saved to disk
Batch Statistics (state-store)
Running means and variances for batch normalization layers
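
The Batch Statistics pool, for instance, is typically maintained as an exponential moving average of per-batch statistics. A minimal sketch of that scheme (the function name and momentum value are illustrative, not Flax's API):

```python
import jax.numpy as jnp

def update_batch_stats(mean, var, batch, momentum=0.99):
    # Exponential moving average of per-feature batch statistics,
    # the usual scheme behind BatchNorm running means and variances.
    batch_mean = batch.mean(axis=0)
    batch_var = batch.var(axis=0)
    new_mean = momentum * mean + (1 - momentum) * batch_mean
    new_var = momentum * var + (1 - momentum) * batch_var
    return new_mean, new_var

mean = jnp.zeros(4)
var = jnp.ones(4)
batch = jnp.ones((8, 4))  # a batch of constant inputs
mean, var = update_batch_stats(mean, var, batch)
```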

Technology Stack

JAX (framework)
Core computation framework for automatic differentiation and JIT compilation
NumPy (library)
Numerical computing foundation and array operations
Optax (library)
Gradient-based optimization algorithms like Adam and SGD
Orbax (library)
Checkpointing and model serialization
TensorStore (library)
Efficient storage and loading of large arrays
ml_collections (library)
Configuration management for hyperparameters
pytest (testing)
Unit testing framework
setuptools (build)
Package building and distribution

Key Components

Sub-Modules

NNX Neural Network API (independence: medium)
Object-oriented neural network library with mutable state and Python reference semantics
Linen Neural Network API (independence: medium)
Functional neural network library with immutable state and explicit parameter threading
Training Examples (independence: high)
Complete training scripts demonstrating Flax usage across different ML domains
Performance Benchmarks (independence: high)
Performance measurement and comparison tools for different training scenarios
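
The NNX/Linen split above comes down to how state is handled. A toy illustration of the two styles in plain Python and JAX (these are not actual Flax modules; real Linen code uses `nn.Module.apply`, and real NNX code uses `nnx.Module`):

```python
import jax.numpy as jnp

# Linen-style: immutable state, parameters threaded explicitly
# through a pure function.
def linear_apply(params, x):
    return x @ params["kernel"] + params["bias"]

# NNX-style: the object owns its state with Python reference
# semantics, so attributes can be read and mutated in place.
class Linear:
    def __init__(self, kernel, bias):
        self.kernel = kernel
        self.bias = bias

    def __call__(self, x):
        return x @ self.kernel + self.bias

params = {"kernel": jnp.eye(2), "bias": jnp.zeros(2)}
x = jnp.ones((1, 2))
y_functional = linear_apply(params, x)
y_oo = Linear(params["kernel"], params["bias"])(x)
```

Both compute the same result; the difference is whether parameters live outside the model (Linen) or inside it (NNX).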

Configuration

examples/gemma/configs/default.py (python-dataclass)

examples/gemma/configs/gemma3_4b.py (python-dataclass)

examples/gemma/configs/small.py (python-dataclass)

examples/gemma/configs/tiny.py (python-dataclass)
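
These config files follow a common pattern: a default config plus size variants that override a few fields. A hypothetical sketch using a plain dataclass (field names beyond `per_device_batch_size` and `dataset_name`, and all values, are invented for illustration; the real Gemma configs define their own fields):

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Hypothetical hyperparameters for illustration only.
    dataset_name: str = "lm1b"
    per_device_batch_size: int = 32
    learning_rate: float = 1e-3
    num_train_steps: int = 100_000

def small() -> TrainConfig:
    # A size variant overrides only what differs from the default.
    return TrainConfig(per_device_batch_size=8, num_train_steps=1_000)
```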

Science Pipeline

  1. Input Preprocessing — Tokenization and sequence padding for text, normalization for images [Variable batch sizes → (batch_size, sequence_length) or (batch_size, height, width, channels)] examples/*/input_pipeline.py
  2. Embedding Lookup — Convert token IDs to dense embeddings [(batch_size, sequence_length) → (batch_size, sequence_length, embed_dim)] examples/gemma/transformer.py
  3. Transformer Forward — Apply attention layers and feed-forward networks [(batch_size, sequence_length, embed_dim) → (batch_size, sequence_length, embed_dim)] examples/gemma/transformer.py
  4. Output Projection — Project hidden states to vocabulary logits [(batch_size, sequence_length, embed_dim) → (batch_size, sequence_length, vocab_size)] examples/gemma/transformer.py
  5. Loss Computation — Cross-entropy loss between logits and target tokens [(batch_size, sequence_length, vocab_size) → scalar loss] examples/*/train.py
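
The shape annotations above can be traced with a stripped-down sketch: random weights, no attention layers, purely to show how the tensor shapes evolve from token IDs to a scalar loss (all dimensions here are made up):

```python
import jax
import jax.numpy as jnp

batch_size, seq_len = 4, 16
vocab_size, embed_dim = 1000, 32

key = jax.random.PRNGKey(0)
k_emb, k_out, k_tok, k_tgt = jax.random.split(key, 4)

embedding = jax.random.normal(k_emb, (vocab_size, embed_dim)) * 0.02
out_proj = jax.random.normal(k_out, (embed_dim, vocab_size)) * 0.02
tokens = jax.random.randint(k_tok, (batch_size, seq_len), 0, vocab_size)
targets = jax.random.randint(k_tgt, (batch_size, seq_len), 0, vocab_size)

# Step 2. Embedding lookup: (batch, seq) -> (batch, seq, embed_dim)
h = embedding[tokens]
# Step 3. A transformer stack would map h -> h here, preserving the shape.
# Step 4. Output projection: (batch, seq, embed_dim) -> (batch, seq, vocab_size)
logits = h @ out_proj
# Step 5. Cross-entropy: (batch, seq, vocab_size) -> scalar
log_probs = jax.nn.log_softmax(logits, axis=-1)
loss = -jnp.mean(jnp.take_along_axis(log_probs, targets[..., None], axis=-1))
```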

Frequently Asked Questions

What is flax used for?

google/flax is a neural network library for JAX with two APIs, NNX and Linen. It is a 13-component ML training system written primarily in Jupyter Notebook, well-connected with clear data flow between components. The codebase contains 343 files.

How is flax architected?

flax is organized into 5 architecture layers: Core, NNX API, Linen API, Examples, and 1 more. It is well-connected, with clear data flow between components, and this layered structure enables tight integration between them.

How does data flow through flax?

Data moves through 6 stages: Data Loading → Forward Pass → Loss Calculation → Gradient Computation → Parameter Update → .... Training data flows through model forward pass, loss computation, gradient calculation, and parameter updates using JAX transformations. This pipeline design reflects a multi-stage processing system.

What technologies does flax use?

The core stack includes JAX (Core computation framework for automatic differentiation and JIT compilation), NumPy (Numerical computing foundation and array operations), Optax (Gradient-based optimization algorithms like Adam and SGD), Orbax (Checkpointing and model serialization), TensorStore (Efficient storage and loading of large arrays), ml_collections (Configuration management for hyperparameters), and 2 more. A focused set of dependencies that keeps the build manageable.

What system dynamics does flax have?

flax exhibits 4 data pools (Model Parameters, Optimizer State), 3 feedback loops, 4 control points, and 3 delays. The feedback loops handle convergence and the training loop. These runtime behaviors shape how the system responds to load, failures, and configuration changes.

What design patterns does flax use?

5 design patterns detected: Module System Duality, Graph Splitting, Transform Wrappers, Variable Collections, Config-Driven Examples.

Analyzed on March 31, 2026 by CodeSea.