sharkdp/fd

A simple, fast and user-friendly alternative to 'find'

42,613 stars | Rust | 9 components

Searches filesystem paths using regex patterns with parallel traversal

User arguments flow through clap parsing into a Config struct containing compiled regex patterns and filter settings. The walk() function spawns parallel threads that traverse directories using the ignore crate, respecting gitignore rules and sending discovered paths as WorkerResult enums through crossbeam channels. Each path flows through a filter pipeline checking regex patterns, file types, size, time, and ownership constraints. Surviving paths are either formatted with colors and templates for stdout output, or used to execute user commands with placeholder substitution.

Under the hood, the system uses 3 data pools and 5 control points to manage its runtime behavior.

A 9-component CLI tool. 23 files analyzed. Data flows through 5 distinct pipeline stages.

How Data Flows Through the System

  1. Parse CLI arguments into configuration: Opts::parse() uses clap to validate command-line arguments, then the run() function builds the Config struct by compiling regex patterns with RegexBuilder, creating file type filters from --type options, and parsing size/time constraints
  2. Walk filesystem in parallel threads: the walk() function creates ignore::WalkParallel with the configured thread count, applies gitignore and fdignore rules, and spawns worker threads that traverse directory trees and send WorkerResult::Entry or WorkerResult::Error through crossbeam channels [Config → WorkerResult] (config: threads, ignore_hidden, read_vcsignore)
  3. Filter paths through constraint pipeline: each WorkerResult flows through pattern.is_match() for regex matching, file_types.should_ignore() for type filtering, size_limits.filter() for size constraints, time_constraints.apply() for modification time bounds, and owner_filter.matches() for Unix ownership validation [WorkerResult → DirEntry] (config: pattern, file_types, size_limits, and 1 more)
  4. Execute commands on matched paths: if --exec or --exec-batch is specified, CommandSet.execute() substitutes path placeholders in command templates using FormatTemplate.generate(), spawns processes with std::process::Command, captures output, and merges exit codes [DirEntry → Command output] (config: command, path_separator, null_separator)
  5. Format output with colors and templates: print_entry() applies LsColors styling based on file types, substitutes FormatTemplate placeholders with actual path components using basename/dirname functions, generates terminal hyperlinks if enabled, and writes with null or newline separation [DirEntry → Format output] (config: format, ls_colors, hyperlink, and 1 more)
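The walk and filter stages above can be sketched with the standard library alone. This is a minimal illustration, not fd's implementation: it uses std::sync::mpsc instead of crossbeam channels, feeds hard-coded path lists instead of walking with the ignore crate, and the WorkerResult enum, matches() helper, and run_pipeline() function are simplified stand-ins invented for this example.

```rust
use std::path::{Path, PathBuf};
use std::sync::mpsc;
use std::thread;

// Simplified stand-in for fd's WorkerResult; the real Error variant wraps ignore::Error.
#[derive(Debug)]
enum WorkerResult {
    Entry(PathBuf),
    Error(String),
}

// Hypothetical filter: keep paths whose file name contains `needle`.
fn matches(path: &Path, needle: &str) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .is_some_and(|name| name.contains(needle))
}

// One producer thread per batch of paths; filtering happens on the receiving side.
fn run_pipeline(batches: Vec<Vec<&'static str>>, needle: &str) -> Vec<PathBuf> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for batch in batches {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            for p in batch {
                let result = if p.is_empty() {
                    WorkerResult::Error("empty path".to_string())
                } else {
                    WorkerResult::Entry(PathBuf::from(p))
                };
                // A send error only means the receiver is gone; stop quietly.
                if tx.send(result).is_err() {
                    break;
                }
            }
        }));
    }
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);

    let mut found = Vec::new();
    for result in rx {
        match result {
            WorkerResult::Entry(path) if matches(&path, needle) => found.push(path),
            WorkerResult::Entry(_) => {}
            WorkerResult::Error(msg) => eprintln!("walk error: {msg}"),
        }
    }
    for handle in handles {
        handle.join().unwrap();
    }
    // Arrival order depends on thread scheduling, so sort for stable output.
    found.sort();
    found
}

fn main() {
    let batches = vec![vec!["src/main.rs", "src/walk.rs"], vec!["README.md", ""]];
    for path in run_pipeline(batches, ".rs") {
        println!("{}", path.display());
    }
}
```

Sending errors through the same channel as entries lets one consumer decide how to report them, which is the role the WorkerResult enum plays in the description above.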

Data Models

The data structures that flow between stages — the contracts that hold the system together.

Config src/config.rs
struct with case_sensitive: bool, full_path_base: Option<PathBuf>, ignore_hidden: bool, read_fdignore: bool, pattern: Arc<regex::bytes::Regex>, ls_colors: Option<LsColors>, format: Option<FormatTemplate>, threads: usize, null_separator: bool, and many more search configuration fields
Built from CLI arguments in main(), passed immutably to all processing components, contains compiled regex patterns and filter configurations
DirEntry src/dir_entry.rs
struct wrapping ignore::DirEntry or PathBuf for broken symlinks, with cached metadata: OnceCell<Option<Metadata>> and style: OnceCell<Option<Style>>
Created during directory traversal, passed through filter chain, consumed by output formatter or command executor
FormatTemplate src/fmt/mod.rs
enum with Tokens(Vec<Token>) containing placeholders like Basename, Parent, NoExt, or Text(String) for fixed output
Parsed from user format string, applied to each DirEntry to generate final output paths with substitutions
CommandSet src/exec/mod.rs
struct with mode: ExecutionMode (OneByOne or Batch) and commands: Vec<CommandTemplate> containing parsed command templates with placeholders
Built from --exec or --exec-batch arguments, validates placeholder usage, executes system commands with path substitution
WorkerResult src/walk.rs
enum with Entry(DirEntry) for successful path discovery or Error(ignore::Error) for filesystem access failures
Sent from walker threads through crossbeam channels, processed by filter pipeline or error handlers
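As a rough illustration of how a token-based template such as FormatTemplate can be applied to a path, the sketch below substitutes a few placeholder tokens using only std::path. The Token enum and generate() function here are assumptions made for the example: the variant names are borrowed from the description above, and treating NoExt as "the full path without its extension" is a guess, not a confirmed detail of fd.

```rust
use std::path::Path;

// Illustrative token set; names follow the placeholders listed above.
enum Token {
    Basename,
    Parent,
    NoExt,
    Text(String),
}

// Build the output string by replacing each token with a component of `path`.
fn generate(tokens: &[Token], path: &Path) -> String {
    let mut out = String::new();
    for token in tokens {
        match token {
            Token::Basename => {
                if let Some(name) = path.file_name() {
                    out.push_str(&name.to_string_lossy());
                }
            }
            Token::Parent => {
                if let Some(parent) = path.parent() {
                    out.push_str(&parent.to_string_lossy());
                }
            }
            // Assumed meaning: the whole path with its extension removed.
            Token::NoExt => out.push_str(&path.with_extension("").to_string_lossy()),
            Token::Text(text) => out.push_str(text),
        }
    }
    out
}

fn main() {
    let tokens = [
        Token::Text("cp ".to_string()),
        Token::Basename,
        Token::Text(" ".to_string()),
        Token::NoExt,
        Token::Text(".bak".to_string()),
    ];
    println!("{}", generate(&tokens, Path::new("docs/notes.txt")));
}
```

The sketch uses to_string_lossy() for brevity, which replaces invalid UTF-8 with a substitute character; a tool that must pass paths through unchanged would need to work on OsStr values instead.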

Hidden Assumptions

Things this code relies on but never validates. These are the things that cause silent failures when the system changes.

critical Contract unguarded

Command template parsing assumes arguments are valid UTF-8 when converted with as_ref() and never validates the encoding; it treats all filesystem paths and command strings as valid Unicode

If this fails: If a filesystem path contains invalid UTF-8 bytes (which Unix filesystems permit), command template parsing silently produces malformed strings or panics during command substitution

src/exec/mod.rs:CommandTemplate::new
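To make the UTF-8 concern concrete, the following Unix-only sketch shows how the standard library treats a file name containing a byte that is not valid UTF-8. The describe() helper is hypothetical and is not taken from fd; it only contrasts the strict and lossy conversions available on OsStr.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

// Returns the strict conversion (None if not UTF-8) and a lossy fallback.
fn describe(raw: &[u8]) -> (Option<String>, String) {
    let os = OsStr::from_bytes(raw);
    let strict = os.to_str().map(str::to_owned);
    let lossy = os.to_string_lossy().into_owned();
    (strict, lossy)
}

fn main() {
    // The byte 0xFF can never appear in valid UTF-8.
    let (strict, lossy) = describe(b"report_\xFF.txt");
    println!("strict: {strict:?}, lossy: {lossy}");
}
```

The strict conversion forces the caller to handle the failure, while the lossy one silently substitutes U+FFFD, so a command built from the lossy string would refer to a file name that does not exist on disk.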
warning Contract weakly guarded

Batch commands assume the first argument (args[0]) is always a valid executable path but only check has_tokens(); they never validate that the executable exists or is executable

If this fails: Batch mode will spawn processes that immediately fail with 'command not found' errors, because validation happens at execution time rather than at argument parsing

src/exec/mod.rs:CommandSet::new_batch
warning Shape unguarded

Format string parsing assumes '{' and '}' characters have equal UTF-8 byte lengths (BRACE_LEN constant) but this is only true for ASCII braces

If this fails: If format strings somehow contain Unicode lookalike brace characters, string slicing will panic with 'byte index not on char boundary' errors

src/fmt/mod.rs:FormatTemplate::parse
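The byte-length point can be checked directly. The sketch below compares the UTF-8 length of an ASCII brace with a full-width look-alike (U+FF5B) and shows a non-panicking way to slice by byte offset. The slice_checked() helper is an illustration invented for this example, not code from fd.

```rust
// Slice by byte offsets without panicking when an offset is not on a char boundary.
fn slice_checked(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // ASCII '{' is one byte; the full-width look-alike U+FF5B is three.
    println!("{} {}", '{'.len_utf8(), '\u{FF5B}'.len_utf8());

    // A template written with full-width braces around "x".
    let template = "\u{FF5B}x\u{FF5D}";
    // Offset 1 falls inside the first character, so get() yields None
    // where direct indexing with the same range would panic.
    println!("{:?}", slice_checked(template, 1, template.len()));
}
```

Using str::get trades a panic for an Option, which lets a parser report a malformed template as an ordinary error.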
warning Environment unguarded

Test environment assumes CARGO_BIN_EXE_fd environment variable points to a valid executable file but never validates the file exists or is executable

If this fails: Integration tests fail with obscure 'No such file or directory' errors if the environment variable points to a non-existent binary or build artifacts are corrupted

src/main.rs:find_fd_exe (tests)
critical Resource unguarded

Jemalloc allocator selection uses compile-time feature flags and platform detection but assumes the target system has sufficient virtual memory for jemalloc's memory mapping strategy

If this fails: On memory-constrained systems or containers with strict memory limits, jemalloc may fail to allocate large virtual memory regions, causing fd to crash with out-of-memory errors where the system allocator would succeed

src/main.rs:jemalloc configuration
critical Contract weakly guarded

Config assumes compiled regex patterns in Arc<Regex> remain valid for the entire program lifetime but never validates that regex compilation succeeded or handles regex engine limits

If this fails: If regex patterns exceed internal complexity limits or contain constructs that cause compilation to fail after Config creation, shared regex access across threads produces undefined behavior or panics

src/config.rs:Config struct
warning Domain weakly guarded

Owner filter parsing assumes Unix user/group name resolution will always succeed for valid names but user/group databases can be unavailable or inconsistent

If this fails: When /etc/passwd or LDAP is unavailable, or when running in containers with different user namespaces, owner filters fail with unclear error messages instead of gracefully degrading

src/filter/owner.rs:OwnerFilter::from_string
info Scale unguarded

Size filter uses hardcoded multiplier constants (TERA = 1000^4) that assume file sizes fit in u64, but on systems with 128-bit filesystems or future storage, this creates an artificial limit of 16 EiB (about 18.4 EB)

If this fails: Files larger than u64::MAX bytes (about 18.4 EB) cause size filter arithmetic to overflow silently, producing incorrect size comparisons for very large files

src/filter/size.rs:SizeFilter constants
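One way to avoid silent overflow when applying unit multipliers is checked arithmetic. The sketch below is an assumption-laden illustration: the parse_size() function and its accepted syntax (digits followed by an optional b, k, m, g, or t suffix) are invented for this example and do not reproduce fd's actual size syntax.

```rust
const KILO: u64 = 1000;
const MEGA: u64 = KILO * 1000;
const GIGA: u64 = MEGA * 1000;
const TERA: u64 = GIGA * 1000;

// Parse strings such as "10k" or "2t" into bytes; None on bad input or overflow.
fn parse_size(input: &str) -> Option<u64> {
    let split = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let (digits, unit) = input.split_at(split);
    let count: u64 = digits.parse().ok()?;
    let multiplier = match unit.to_ascii_lowercase().as_str() {
        "" | "b" => 1,
        "k" => KILO,
        "m" => MEGA,
        "g" => GIGA,
        "t" => TERA,
        _ => return None,
    };
    // checked_mul returns None instead of wrapping when the product exceeds u64::MAX.
    count.checked_mul(multiplier)
}

fn main() {
    println!("{:?}", parse_size("2t"));
    println!("{:?}", parse_size("20000000t"));
}
```

With checked_mul, a request that exceeds u64::MAX becomes an explicit parse failure that the caller can report.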
warning Temporal weakly guarded

Time parsing assumes system timezone database is available and consistent but never handles timezone data corruption or missing zoneinfo files

If this fails: On systems with corrupted tzdata or in containers without timezone information, time filter parsing panics or produces incorrect timestamp comparisons, making time-based searches unreliable

src/filter/time.rs:TimeFilter::from_str
warning Environment weakly guarded

Terminal hyperlink generation assumes stdout is connected to a terminal that supports OSC 8 escape sequences but never validates terminal capabilities

If this fails: When output is redirected to files or piped to programs that don't handle escape sequences, hyperlink codes appear as literal garbage text in the output stream

src/output.rs:print_entry
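A common mitigation for this kind of assumption is to emit escape sequences only when stdout is a terminal. The sketch below uses std::io::IsTerminal for that check and the OSC 8 sequence for the link. The hyperlink() helper and the example URL are hypothetical, and the check only establishes that stdout is a TTY, not that the terminal understands OSC 8.

```rust
use std::io::{self, IsTerminal, Write};

// Wrap `text` in an OSC 8 hyperlink only when `enabled` is true.
fn hyperlink(text: &str, url: &str, enabled: bool) -> String {
    if enabled {
        format!("\x1b]8;;{url}\x1b\\{text}\x1b]8;;\x1b\\")
    } else {
        text.to_string()
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // True only when stdout is attached to a terminal, not a file or pipe.
    let enabled = stdout.is_terminal();
    let mut out = stdout.lock();
    writeln!(
        out,
        "{}",
        hyperlink("src/main.rs", "file:///tmp/src/main.rs", enabled)
    )
}
```

Keeping the decision in a boolean parameter makes the formatting function easy to test and leaves room for an explicit user override.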

System Behavior

How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.

Data Pools

Compiled regex patterns (in-memory)
Arc-wrapped compiled Regex objects shared across worker threads to avoid recompilation overhead during path matching
Worker result channels (queue)
Crossbeam channels that buffer discovered filesystem paths from parallel walker threads before filter processing
Cached file metadata (cache)
OnceCell containers in DirEntry that lazily cache filesystem metadata and color styling to avoid redundant system calls
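The lazy metadata cache described above follows a general pattern that can be sketched with std::cell::OnceCell. The Entry type, its fields, and expensive_lookup() are stand-ins invented for this example; a real entry would call something like std::fs::metadata instead of deriving a number from the name.

```rust
use std::cell::{Cell, OnceCell};

// Hypothetical entry type: `size` stands in for the cached Metadata.
struct Entry {
    name: String,
    size: OnceCell<Option<u64>>,
    lookups: Cell<u32>, // counts how often the expensive path runs
}

impl Entry {
    fn new(name: &str) -> Self {
        Entry {
            name: name.to_string(),
            size: OnceCell::new(),
            lookups: Cell::new(0),
        }
    }

    // Stand-in for a filesystem call such as std::fs::metadata.
    fn expensive_lookup(&self) -> Option<u64> {
        self.lookups.set(self.lookups.get() + 1);
        Some(self.name.len() as u64)
    }

    // The first call runs the lookup; later calls return the cached value.
    fn size(&self) -> Option<u64> {
        *self.size.get_or_init(|| self.expensive_lookup())
    }
}

fn main() {
    let entry = Entry::new("notes.txt");
    let first = entry.size();
    let second = entry.size();
    println!("{first:?} {second:?} lookups={}", entry.lookups.get());
}
```

Caching Option<u64> rather than u64 means a failed lookup is remembered too, so a path whose metadata cannot be read is not queried again by every filter.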

Technology Stack

ignore (library)
Provides parallel directory traversal with built-in gitignore and VCS ignore file parsing, handling all the complexity of filesystem walking and ignore rule application
regex (library)
Compiles user search patterns into optimized finite automata for fast path matching during filesystem traversal
clap (library)
Parses command-line arguments with automatic help generation, argument validation, and shell completion support using derive macros
crossbeam-channel (library)
Provides high-performance multi-producer multi-consumer channels, used here to pass discovered filesystem paths from walker threads to filter processing
lscolors (library)
Interprets LS_COLORS environment variable and applies appropriate ANSI color codes based on file types and extensions
tikv-jemallocator (runtime)
Replaces system allocator with jemalloc on supported platforms for better memory allocation performance during intensive filesystem operations
aho-corasick (library)
Efficiently matches multiple placeholder patterns simultaneously in format string parsing without backtracking
jiff (library)
Parses time expressions and performs date arithmetic for time-based filtering constraints like 'modified within last week'

Frequently Asked Questions

What is fd used for?

fd searches filesystem paths using regex patterns with parallel traversal. sharkdp/fd is a 9-component CLI tool written in Rust. Data flows through 5 distinct pipeline stages. The codebase contains 23 files.

How is fd architected?

fd is organized into 5 architecture layers: CLI Interface, Configuration & Validation, Parallel Walker, Filtering Pipeline, and 1 more. Data flows through 5 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.

How does data flow through fd?

Data moves through 5 stages: Parse CLI arguments into configuration → Walk filesystem in parallel threads → Filter paths through constraint pipeline → Execute commands on matched paths → Format output with colors and templates. clap parses user arguments into a Config struct, parallel walker threads send discovered paths as WorkerResult values through crossbeam channels, each path passes through the filter pipeline, and surviving paths are either printed to stdout or passed to user commands.

What technologies does fd use?

The core stack includes ignore (parallel directory traversal with gitignore and VCS ignore file parsing), regex (compiled search patterns for path matching), clap (command-line argument parsing with help generation and shell completions), crossbeam-channel (channels between walker threads and filter processing), lscolors (ANSI coloring based on the LS_COLORS environment variable), tikv-jemallocator (jemalloc as the allocator on supported platforms), aho-corasick (placeholder matching in format strings), and jiff (time expression parsing for time-based filters).

What system dynamics does fd have?

fd exhibits 3 data pools (Compiled regex patterns, Worker result channels, Cached file metadata), 5 control points, and 3 delays. These runtime behaviors shape how the system responds to load, failures, and configuration changes.

What design patterns does fd use?

4 design patterns detected: Parallel Pipeline, Lazy Caching, Template Substitution, Builder Configuration.

How does fd compare to alternatives?

CodeSea has side-by-side architecture comparisons of fd with bat. These comparisons show tech stack differences, pipeline design, system behavior, and code patterns. See the comparison pages above for detailed analysis.

Analyzed on April 20, 2026 by CodeSea.