bmaltais/kohya_ss
Gradio GUI for training Stable Diffusion LoRA models and fine-tuning
User configures training parameters through Gradio interface, settings are saved to TOML config files, then converted to command-line arguments that execute external training scripts from the sd-scripts library.
Under the hood, the system uses 2 feedback loops, 3 data pools, and 4 control points to manage its runtime behavior.
Structural Verdict
A 12-component fullstack application with 9 connections. 86 files analyzed. Well-connected — clear data flow between components.
How Data Flows Through the System
- Parameter Configuration — User sets training parameters through Gradio web interface (config: accelerate_launch.mixed_precision, accelerate_launch.num_processes)
- Config Management — Settings saved to and loaded from TOML configuration files
- Command Generation — GUI parameters converted to command-line arguments
- Training Execution — External sd-scripts training scripts executed with generated parameters
- Progress Monitoring — Optional sample image generation and logging during training
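The Command Generation stage above can be sketched in a few lines: a settings dict (as loaded from a TOML config) is flattened into the `--key value` arguments that sd-scripts expects. The function name `build_cli_args` is illustrative, not an actual kohya_ss API.

```python
# Hypothetical sketch of the "Command Generation" stage: GUI settings
# become command-line arguments for the external training script.
def build_cli_args(config: dict) -> list[str]:
    """Flatten a settings dict into --key value pairs; booleans become flags."""
    args = []
    for key, value in config.items():
        if isinstance(value, bool):
            if value:                 # True booleans emit a bare flag
                args.append(f"--{key}")
        else:
            args.extend([f"--{key}", str(value)])
    return args

settings = {"learning_rate": 1e-6, "mixed_precision": "fp16", "xformers": True}
print(build_cli_args(settings))
# ['--learning_rate', '1e-06', '--mixed_precision', 'fp16', '--xformers']
```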
System Behavior
How the system actually operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- Config Files — TOML configuration files storing user preferences and training parameters
- Training Logs — Training progress logs and TensorBoard data
- Sample Images — Generated sample images during training for progress monitoring
Feedback Loops
- Training Progress (polling, balancing) — Trigger: Training execution starts. Action: Monitor process status and update UI. Exit: Training completes or user stops.
- Config Persistence (cache-invalidation, balancing) — Trigger: User changes parameters. Action: Save settings to TOML config. Exit: UI session ends.
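The Training Progress loop above can be sketched as a polling wrapper around the training process: start it, check its status at a fixed interval, and return when it exits. `monitor()` is a hypothetical name; the real GUI wires this into Gradio callbacks and a stop button.

```python
# Minimal sketch of a polling feedback loop around a training process.
import subprocess
import sys
import time

def monitor(cmd: list[str], interval: float = 0.1) -> int:
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:   # trigger: process is still running
        time.sleep(interval)     # placeholder for a UI status update
    return proc.returncode       # exit: training completed or was stopped

# Stand-in for a long-running sd-scripts invocation:
rc = monitor([sys.executable, "-c", "print('training step')"])
```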
Delays & Async Processing
- Model Training (async-processing, ~hours to days) — Long-running training process with periodic sample generation
- Sample Generation (scheduled-job, ~per training step interval) — Periodic image generation for progress monitoring
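The per-step-interval scheduling described above usually reduces to a modulo check on the current step; this tiny helper is a sketch of that idea, not kohya_ss code.

```python
# Decide whether to generate a sample image at this training step.
def should_sample(step: int, every_n_steps: int) -> bool:
    return every_n_steps > 0 and step % every_n_steps == 0

print(should_sample(100, 50))  # True: step 100 is on the 50-step interval
```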
Control Points
- Mixed Precision (env-var) — Controls: Training precision and memory usage. Default: fp16
- Batch Size (runtime-toggle) — Controls: Training batch size and memory consumption
- Learning Rate (runtime-toggle) — Controls: Model learning rate and convergence speed. Default: 1e-6
- SDXL Cache (feature-flag) — Controls: Text encoder output caching for memory optimization. Default: False
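Taken together, these control points surface as arguments to the launcher and the training script. The sketch below assembles an `accelerate launch` command; the `--mixed_precision` and `--num_processes` flags follow the Accelerate CLI, while the assembly function itself is illustrative.

```python
# Hedged sketch: control points becoming an `accelerate launch` invocation.
def accelerate_command(mixed_precision="fp16", num_processes=1,
                       script="train_network.py", **train_args):
    cmd = ["accelerate", "launch",
           "--mixed_precision", mixed_precision,
           "--num_processes", str(num_processes), script]
    for key, value in train_args.items():   # script-level control points
        cmd += [f"--{key}", str(value)]
    return cmd

print(accelerate_command(train_batch_size=2, learning_rate=1e-6))
```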
Technology Stack
- Gradio — Web UI framework for machine learning applications
- PyTorch — Deep learning framework for model training
- Transformers — Hugging Face library for pretrained models
- Accelerate — PyTorch distributed training library
- Diffusers — Diffusion model training and inference
- TOML — Configuration file format
- Safetensors — Safe tensor serialization format
- Docker — Containerization for deployment
Key Components
- BasicTraining (class) — Configures core training parameters like learning rate, scheduler, and epochs (kohya_gui/class_basic_training.py)
- AdvancedTraining (class) — Handles advanced training options like gradient accumulation and token padding (kohya_gui/class_advanced_training.py)
- CommandExecutor (class) — Executes training commands and manages process lifecycle with start/stop controls (kohya_gui/class_command_executor.py)
- KohyaSSGUIConfig (class) — Loads and manages TOML configuration files for GUI settings (kohya_gui/class_gui_config.py)
- LoRATools (class) — Provides various LoRA manipulation tools like merging, extracting, and resizing (kohya_gui/class_lora_tab.py)
- SDXLParameters (class) — Manages SDXL-specific training parameters and caching options (kohya_gui/class_sdxl_parameters.py)
- flux1Training (class) — Handles Flux.1 model training parameters and VAE path configuration (kohya_gui/class_flux1.py)
- caption_images (function) — Generates automatic captions for training images using the BLIP model (kohya_gui/blip_caption_gui.py)
- load_model (function) — Initializes the BLIP2 processor and model for advanced image captioning (kohya_gui/blip2_caption_gui.py)
- Folders (class) — Manages directory paths for training data, outputs, and logging (kohya_gui/class_folders.py)
- SampleImages (class) — Creates sample images during training for progress monitoring (kohya_gui/class_sample_images.py)
- gradioApp (function) — Provides JavaScript utilities for Gradio UI manipulation and callbacks (assets/js/script.js)
Configuration
docker-compose.yaml (yaml)
- services.kohya-ss-gui.container_name (string, unknown) — default: kohya-ss-gui
- services.kohya-ss-gui.image (string, unknown) — default: ghcr.io/bmaltais/kohya-ss-gui:latest
- services.kohya-ss-gui.user (string, unknown) — default: 1000:0
- services.kohya-ss-gui.build.context (string, unknown) — default: .
- services.kohya-ss-gui.build.args (array, unknown) — default: UID=1000
- services.kohya-ss-gui.build.cache_from (array, unknown) — default: ghcr.io/bmaltais/kohya-ss-gui:cache
- services.kohya-ss-gui.build.cache_to (array, unknown) — default: type=inline
- services.kohya-ss-gui.ports (array, unknown) — default: 7860:7860
- +11 more parameters
Science Pipeline
- Load Configuration — Read TOML files and populate GUI defaults (kohya_gui/class_gui_config.py)
- Parameter Setup — User configures training parameters through the Gradio interface (kohya_gui/class_basic_training.py)
- Image Captioning — BLIP/BLIP2 models generate captions for training images [(batch, 3, H, W) → text strings] (kohya_gui/blip2_caption_gui.py)
- Command Generation — Convert GUI parameters to sd-scripts command-line arguments (kohya_gui/class_command_executor.py)
- Model Training — Execute external sd-scripts with generated parameters [(N, C, H, W) images + text captions → trained model weights] (kohya_gui/class_command_executor.py)
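The pipeline stages above can be sketched as composable steps passing a shared state dict. Everything here is a stand-in: the captioning stage mocks BLIP rather than calling it, and the values are illustrative.

```python
# Toy pipeline mirroring the stage order: config -> captions -> command.
def load_configuration(state):
    state["config"] = {"learning_rate": 1e-6}          # read from TOML in reality
    return state

def caption_images(state):
    # Stand-in for BLIP/BLIP2 captioning of training images.
    state["captions"] = [f"a photo of {name}" for name in state["images"]]
    return state

def generate_command(state):
    state["cmd"] = ["accelerate", "launch", "train_network.py",
                    "--learning_rate", str(state["config"]["learning_rate"])]
    return state

state = {"images": ["img.png"]}
for stage in (load_configuration, caption_images, generate_command):
    state = stage(state)
print(state["cmd"])
```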
Assumptions & Constraints
- [warning] Assumes CUDA availability for GPU training, falling back to CPU/MPS without explicit tensor device checks (device)
- [critical] The SDXL VAE is assumed not to produce NaNs, but an override flag is provided due to known issues (dependency)
- [info] Learning rate defaults to 1e-6 but no validation ensures it stays within reasonable bounds (value-range)
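The device fallback described in the first assumption reduces to a priority check: CUDA if available, then MPS, then CPU. The sketch below takes the capability checks as flags so the logic is shown without importing torch; `pick_device` is a hypothetical helper.

```python
# Hedged sketch of the CUDA -> MPS -> CPU fallback order.
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    if cuda_available:
        return "cuda"   # preferred: NVIDIA GPU
    if mps_available:
        return "mps"    # Apple Silicon fallback
    return "cpu"        # last resort

print(pick_device(False, True))  # 'mps'
```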
Frequently Asked Questions
What is kohya_ss used for?
kohya_ss is a Gradio GUI for training Stable Diffusion LoRA models and fine-tuning. It is a 12-component fullstack application written in Python, well-connected with clear data flow between components. The codebase contains 86 files.
How is kohya_ss architected?
kohya_ss is organized into 4 architecture layers: GUI Classes, Training Executors, Utilities, Web Assets. Well-connected — clear data flow between components. This layered structure enables tight integration between components.
How does data flow through kohya_ss?
Data moves through 5 stages: Parameter Configuration → Config Management → Command Generation → Training Execution → Progress Monitoring. User configures training parameters through Gradio interface, settings are saved to TOML config files, then converted to command-line arguments that execute external training scripts from the sd-scripts library. This pipeline design reflects a complex multi-stage processing system.
What technologies does kohya_ss use?
The core stack includes Gradio (Web UI framework for machine learning applications), PyTorch (Deep learning framework for model training), Transformers (Hugging Face library for pretrained models), Accelerate (PyTorch distributed training library), Diffusers (Diffusion model training and inference), TOML (Configuration file format), and 2 more. A focused set of dependencies that keeps the build manageable.
What system dynamics does kohya_ss have?
kohya_ss exhibits 3 data pools (config files, training logs, sample images), 2 feedback loops, 4 control points, and 2 delays. The feedback loops handle polling and cache invalidation. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does kohya_ss use?
4 design patterns detected: Class-per-Feature, Configuration Injection, Command Generation, Accordion UI.
Analyzed on March 31, 2026 by CodeSea. Written by Karolina Sarna.