Hidden Assumptions in fd

13 assumptions this code never checks · 3 critical · spanning Contract, Shape, Environment, Resource, Domain, Scale, Temporal, Ordering

Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at sharkdp/fd and picked out the 3 most likely to cause trouble here. The rest are minor; they're under "Show everything".

Worth your attention first

If a filesystem path contains bytes that are not valid UTF-8 (which Unix filesystems permit), command template parsing silently produces malformed strings or panics during command substitution
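
The difference between strict and lossy path conversion can be sketched as follows. This is an illustration of the failure mode, not fd's actual substitution code; the helper names substitute_strict, substitute_lossy, and invalid_utf8_path are hypothetical, and the example path is Unix-only.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: put a path into a "{}" placeholder.
/// Returns None instead of guessing when the path is not valid UTF-8.
fn substitute_strict(template: &str, path: &Path) -> Option<String> {
    let s = path.to_str()?; // None if the path is not valid UTF-8
    Some(template.replace("{}", s))
}

/// Lossy variant: invalid bytes become U+FFFD, so the resulting
/// string may no longer name the real file on disk.
fn substitute_lossy(template: &str, path: &Path) -> String {
    template.replace("{}", &path.to_string_lossy())
}

/// Builds a path containing the byte 0xFF, which never occurs in valid UTF-8.
#[cfg(unix)]
fn invalid_utf8_path() -> PathBuf {
    use std::os::unix::ffi::OsStrExt;
    Path::new(OsStr::from_bytes(b"bad\xFFname.txt")).to_path_buf()
}
```

With the strict helper the caller can report or skip the entry; the lossy helper always returns a string, but a command built from it would point at a file name that does not exist. Passing the path to the child process as an OsStr avoids the conversion entirely.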

On memory-constrained systems or containers with strict memory limits, jemalloc may fail to allocate large virtual memory regions, causing fd to crash with out-of-memory errors where the system allocator would succeed

If regex patterns exceed internal complexity limits or contain constructs that cause compilation to fail after Config creation, shared regex access across threads produces undefined behavior or panics

Show everything (10 more)
Contract

Batch commands assume the first argument (args[0]) is always a valid executable path. The code only checks has_tokens(); it never validates that the executable exists or is executable

If this fails: Batch mode will spawn processes that immediately fail with 'command not found' errors, because validation happens at execution time rather than at argument parsing

src/exec/mod.rs:CommandSet::new_batch
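
An early check of this kind could look like the sketch below. It is not taken from fd; find_in_path and is_executable are hypothetical helpers, and the executable-bit test applies only on Unix.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// True if `path` is a regular file with at least one execute bit set.
#[cfg(unix)]
fn is_executable(path: &Path) -> bool {
    use std::os::unix::fs::PermissionsExt;
    path.metadata()
        .map(|m| m.is_file() && (m.permissions().mode() & 0o111) != 0)
        .unwrap_or(false)
}

/// Fallback for other platforms: only checks that the file exists.
#[cfg(not(unix))]
fn is_executable(path: &Path) -> bool {
    path.is_file()
}

/// Hypothetical helper: search each directory of a PATH-style value
/// for `program` and return the first executable match.
fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| is_executable(candidate))
}
```

A check like this only improves the error message. The file can still disappear between the check and the spawn, so the spawn error has to be handled either way.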
Shape

Format string parsing assumes '{' and '}' characters have equal UTF-8 byte lengths (BRACE_LEN constant) but this is only true for ASCII braces

If this fails: If format strings somehow contain Unicode lookalike brace characters, string slicing will panic with 'byte index not on char boundary' errors

src/fmt/mod.rs:FormatTemplate::parse
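
The boundary problem can be shown in a few lines. The sketch assumes BRACE_LEN is 1, as the text implies; inner_of_first_placeholder and ascii_offset_is_safe are hypothetical helpers, not functions from fd.

```rust
/// Byte length of an ASCII brace (assumed value of the BRACE_LEN constant).
const BRACE_LEN: usize = 1;

/// True if skipping BRACE_LEN bytes from the start of `s` lands on a
/// character boundary. False when `s` starts with a multi-byte character.
fn ascii_offset_is_safe(s: &str) -> bool {
    s.is_char_boundary(BRACE_LEN)
}

/// Hypothetical helper: return the text inside the first `{...}` pair.
/// It uses the brace's own UTF-8 length and `str::get`, which returns
/// None instead of panicking on an invalid range.
fn inner_of_first_placeholder(s: &str) -> Option<&str> {
    let open = s.find('{')?;
    let start = open + '{'.len_utf8();
    let close = s[start..].find('}')? + start;
    s.get(start..close)
}
```

Because the search looks for the ASCII characters only, a fullwidth lookalike such as U+FF5B is never treated as a brace, and the fixed one-byte offset is only ever applied after a real ASCII match.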
Environment

Test environment assumes CARGO_BIN_EXE_fd environment variable points to a valid executable file but never validates the file exists or is executable

If this fails: Integration tests fail with obscure 'No such file or directory' errors if the environment variable points to a non-existent binary or build artifacts are corrupted

src/main.rs:find_fd_exe (tests)
Domain

Owner filter parsing assumes Unix user/group name resolution will always succeed for valid names but user/group databases can be unavailable or inconsistent

If this fails: When /etc/passwd or LDAP is unavailable, or when running in containers with different user namespaces, owner filters fail with unclear error messages instead of gracefully degrading

src/filter/owner.rs:OwnerFilter::from_string
Scale

Size filter uses hardcoded multiplier constants (TERA = 1000^4) that assume file sizes fit in u64, but on systems with 128-bit filesystems or future storage, this creates an artificial 16 EiB limit

If this fails: Files larger than u64::MAX bytes (16 EiB, about 18.4 EB) cause size filter arithmetic to overflow silently, producing incorrect size comparisons for very large files

src/filter/size.rs:SizeFilter constants
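
Overflow in count-times-multiplier arithmetic can be made explicit with checked multiplication. The sketch below is illustrative only; to_bytes is a hypothetical helper, and the constants are defined here to match TERA = 1000^4 from the text.

```rust
const KILO: u64 = 1000;
/// 1000^4, matching the TERA constant described in the text.
const TERA: u64 = KILO * KILO * KILO * KILO;

/// Hypothetical helper: convert a count and a unit multiplier to bytes.
/// Returns None when the product does not fit in u64, instead of
/// wrapping (release builds) or panicking (debug builds).
fn to_bytes(count: u64, multiplier: u64) -> Option<u64> {
    count.checked_mul(multiplier)
}
```

For example, to_bytes(2, TERA) yields Some(2_000_000_000_000), while to_bytes(20_000_000, TERA) yields None because 2 × 10^19 exceeds u64::MAX. In practice, a very large number typed on the command line is a more likely trigger than a file of that size.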
Temporal

Time parsing assumes system timezone database is available and consistent but never handles timezone data corruption or missing zoneinfo files

If this fails: On systems with corrupted tzdata or in containers without timezone information, time filter parsing panics or produces incorrect timestamp comparisons, making time-based searches unreliable

src/filter/time.rs:TimeFilter::from_str
Environment

Terminal hyperlink generation assumes stdout is connected to a terminal that supports OSC 8 escape sequences but never validates terminal capabilities

If this fails: When output is redirected to files or piped to programs that don't handle escape sequences, hyperlink codes appear as literal garbage text in the output stream

src/output.rs:print_entry
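
One way to guard against this is to emit escape sequences only when stdout is a terminal. The following is a minimal sketch using the standard library's IsTerminal trait; format_entry and stdout_is_terminal are hypothetical names, not fd's API. The check cannot tell whether a given terminal actually renders OSC 8 links.

```rust
use std::io::IsTerminal;

/// Hypothetical helper: wrap `text` in an OSC 8 hyperlink when `enabled`,
/// otherwise return the plain text unchanged.
fn format_entry(text: &str, url: &str, enabled: bool) -> String {
    if enabled {
        // OSC 8 form: ESC ] 8 ; ; URL ESC \  text  ESC ] 8 ; ; ESC \
        format!("\x1b]8;;{url}\x1b\\{text}\x1b]8;;\x1b\\")
    } else {
        text.to_string()
    }
}

/// True when stdout is attached to a terminal rather than a file or pipe.
fn stdout_is_terminal() -> bool {
    std::io::stdout().is_terminal()
}
```

A caller would pass stdout_is_terminal() as the enabled flag, so redirected or piped output stays free of escape codes.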
Contract

Default color scheme assumes all terminal color codes are valid ANSI sequences but the string is hardcoded without validation of color code syntax or terminal compatibility

If this fails: If the hardcoded color string contains malformed ANSI codes, terminals display garbled colors or ignore styling entirely, making output unreadable on some terminal types

src/main.rs:DEFAULT_LS_COLORS string
Ordering

Parallel filesystem walking sends results over crossbeam channels as if some ordering relationship between discovered files were preserved, but the channels give no ordering guarantees for results from different worker threads

If this fails: When users expect deterministic output ordering (like lexicographic sort), results appear in random thread-dependent order, making fd unsuitable for scripts that depend on consistent file ordering

src/walk.rs:WorkerResult channels
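
The ordering behaviour can be reproduced with a small sketch. It uses std::sync::mpsc in place of crossbeam so that it stays self-contained; collect_sorted is a hypothetical function, not part of fd. Items from one sender arrive in the order sent, but items from different workers interleave arbitrarily, so the receiver sorts before returning.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: each chunk is sent by its own worker thread.
/// Arrival order across workers is unspecified, so the collected
/// results are sorted to get deterministic output.
fn collect_sorted(chunks: Vec<Vec<String>>) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            let tx = tx.clone();
            thread::spawn(move || {
                for path in chunk {
                    tx.send(path).expect("receiver is alive");
                }
            })
        })
        .collect();
    drop(tx); // the channel closes once every worker has finished

    let mut results: Vec<String> = rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    results.sort();
    results
}
```

Sorting requires buffering every result before printing, which gives up streaming output. Scripts that need a stable order can get the same effect by piping the output through sort.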
Resource

Cached file metadata in OnceCell assumes filesystem state remains stable during program execution; the cache is never invalidated when files are modified concurrently

If this fails: If files are modified, deleted, or their permissions changed while fd is running, cached metadata becomes stale, leading to incorrect size/time/ownership filtering decisions

src/dir_entry.rs:OnceCell cached metadata
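
How a once-initialized cache goes stale can be shown with the standard library's OnceCell. The Entry type below is a hypothetical stand-in, not fd's directory entry type, and it caches only the file length.

```rust
use std::cell::OnceCell;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical entry that remembers the file length after the first lookup.
struct Entry {
    path: PathBuf,
    cached_len: OnceCell<Option<u64>>,
}

impl Entry {
    fn new(path: PathBuf) -> Self {
        Entry { path, cached_len: OnceCell::new() }
    }

    /// The first call reads the filesystem; later calls return the stored
    /// value even if the file has changed since.
    fn len_cached(&self) -> Option<u64> {
        *self
            .cached_len
            .get_or_init(|| fs::metadata(&self.path).ok().map(|m| m.len()))
    }

    /// Always asks the filesystem again.
    fn len_fresh(&self) -> io::Result<u64> {
        Ok(fs::metadata(&self.path)?.len())
    }
}
```

If a file grows between two calls, len_cached keeps reporting the old size while len_fresh reports the new one. For a short-lived search this trade-off is often reasonable, since each entry is stat'ed at most once.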

See the full structural analysis of fd: the pipeline, data models, and system behavior that put these assumptions in context.

Frequently Asked Questions

What does fd assume that could break in production?

The one most likely to cause trouble: command template arguments are converted with as_ref() and treated as valid UTF-8 strings, with no check on their encoding. The code assumes all filesystem paths and command strings are valid Unicode. If this fails: when a filesystem path contains bytes that are not valid UTF-8 (which Unix filesystems permit), command template parsing silently produces malformed strings or panics during command substitution.

How many hidden assumptions does fd have?

CodeSea found 13 assumptions fd relies on but never validates, 3 of them critical, spanning Contract, Shape, Environment, Resource, Domain, Scale, Temporal, Ordering. Most are routine; the analysis flags the 3 most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.