Hidden Assumptions in iris

13 assumptions this code never checks · 5 critical · spanning Contract, Temporal, Resource, Domain, Environment, Ordering, Scale

Every codebase relies on things it never checks, and most of them are routine. CodeSea analyzed scitools/iris and picked out the 3 assumptions most likely to cause trouble; they are listed first. The remaining 10 are minor and appear under "Show everything".

Worth your attention first

Data generation functions will fail with import errors or missing dependencies at runtime, corrupting benchmark datasets or causing benchmark failures
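A preflight check can surface this before any data is generated, by asking the interpreter named by DATA_GEN_PYTHON whether it can locate each required module. The following is a minimal sketch, not code from the repository: the helper names `missing_modules` and `preflight`, the fallback to the current interpreter, and the module list are all assumptions made for illustration.

```python
import os
import subprocess
import sys

# Runs inside the target interpreter and prints each module it cannot find.
# find_spec is given top-level names only, so the modules are not imported.
_PROBE = (
    "import importlib.util, sys\n"
    "for name in sys.argv[1:]:\n"
    "    if importlib.util.find_spec(name) is None:\n"
    "        print(name)\n"
)


def missing_modules(python_exe, modules):
    """Return the top-level modules that python_exe cannot locate."""
    result = subprocess.run(
        [python_exe, "-c", _PROBE, *modules],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


def preflight(modules=("iris", "mule")):
    """Hypothetical check; the module list is an assumption, not the real one."""
    # Falling back to the current interpreter is an assumption of this sketch.
    python_exe = os.environ.get("DATA_GEN_PYTHON", sys.executable)
    missing = missing_modules(python_exe, list(modules))
    if missing:
        raise RuntimeError(f"{python_exe} cannot find: {', '.join(missing)}")
```

Calling `preflight()` once before the first data generation step would turn a late import error into an immediate, named failure.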

Mid-benchmark failures when the external environment becomes unavailable, requiring full benchmark restart and invalidating timing comparisons across commits

Data generation fails with AttributeError when checking out older Iris commits that don't have expected stock functions, breaking benchmark runs across commit history
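A guard that checks for the expected names up front can replace a bare AttributeError with one message listing everything that is missing. This is a sketch under stated assumptions: `require_attrs` is a hypothetical helper, and it is demonstrated on the standard-library `math` module so that it runs without Iris. In practice it might be pointed at a module such as iris.tests.stock.mesh and a name such as sample_mesh_cube.

```python
import math


def require_attrs(module, names):
    """Raise one clear error listing every name the module lacks."""
    missing = [name for name in names if not hasattr(module, name)]
    if missing:
        raise RuntimeError(
            f"{module.__name__} lacks {', '.join(missing)}; "
            "this commit may predate the functions data generation needs"
        )


# Demonstrated on a standard-library module so the sketch runs anywhere.
require_attrs(math, ["sqrt", "floor"])  # passes silently
```

Whether to skip the commit or abort the whole run when the guard fires is a policy choice the sketch leaves open.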

Show everything (10 more)

Resource

The system has enough memory to load UM files holding float32 data of shape (1920, 2560), about 19.7 MB per cube, plus coordinate arrays, without memory pressure

If this fails: Silent memory swapping causes benchmark timing to include disk I/O, making results unreliable, or OOM kills benchmark process

benchmarks/benchmarks/cperf/__init__.py:_UM_DIMS_YX
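The per-cube figure follows directly from the shape and the data type. A short worked calculation, assuming the (1920, 2560) shape quoted above and 4 bytes per float32 value:

```python
import math

UM_DIMS_YX = (1920, 2560)  # shape quoted in the assumption above
FLOAT32_BYTES = 4          # size of one float32 value

n_values = math.prod(UM_DIMS_YX)
n_bytes = n_values * FLOAT32_BYTES

print(n_values)         # 4915200
print(n_bytes / 1e6)    # 19.6608 (MB)
print(n_bytes / 2**20)  # 18.75 (MiB)
```

This counts only the data payload of one cube; coordinate arrays and any temporary copies made during loading come on top.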

Domain

UM files always load longitude/latitude as DimCoords (which are always realized) while LFRic files load them as MeshCoords (which are lazy by default)

If this fails: Benchmark assertions fail if file format behavior changes, and timing comparisons become invalid if coordinate realization strategy differs between formats

benchmarks/benchmarks/cperf/load.py:time_load

Ordering

Coordinate dimensions returned by c.cube_dims(source_cube) remain stable throughout the lifetime of the benchmark setup and match the source_cube's dimensional structure

If this fails: Cube construction fails with dimension mismatch errors if source cube's coordinate mapping changes between setup and benchmark execution

benchmarks/benchmarks/cube.py:setup

Temporal

Previously generated benchmark data files remain valid and compatible with current Iris version when REUSE_DATA is enabled

If this fails: Benchmarks use stale data that doesn't match current Iris behavior, producing misleading performance measurements or silent failures due to format incompatibilities

benchmarks/benchmarks/generate_data/__init__.py:REUSE_DATA

Scale

The cubesphere size calculation int(np.sqrt(np.prod(_UM_DIMS_YX) / 6)) produces a valid cubesphere dimension that can be handled by LFRic mesh generation

If this fails: Mesh generation fails when calculated cubesphere size exceeds implementation limits or produces invalid mesh topology, causing benchmark crashes

benchmarks/benchmarks/cperf/__init__.py:_N_CUBESPHERE_UM_EQUIVALENT
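The expression can be evaluated by hand to see what size it yields. The worked example below assumes _UM_DIMS_YX is the (1920, 2560) shape quoted earlier in this list, and uses the standard-library `math` module in place of NumPy so that it runs without dependencies:

```python
import math

UM_DIMS_YX = (1920, 2560)  # assumed value of _UM_DIMS_YX

n_cells_um = math.prod(UM_DIMS_YX)
# A cubesphere of size n has 6 faces of n * n cells each.
n_cubesphere = int(math.sqrt(n_cells_um / 6))
n_cells_cubesphere = 6 * n_cubesphere**2

print(n_cubesphere)                     # 905
print(n_cells_cubesphere)               # 4914150
print(n_cells_um - n_cells_cubesphere)  # 1050
```

Because of the truncation to an integer, the cubesphere has slightly fewer cells than the UM grid, so the two are comparable in size rather than identical.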

Resource

Objects created in setup() persist between ASV repeat runs in a consistent way: once the first benchmark run modifies them, they stay modified for later runs

If this fails: Subsequent benchmark runs operate on already-modified objects, producing invalid timing measurements that don't reflect real-world performance

benchmarks/benchmarks/aggregate_collapse.py:disable_repeat_between_setup
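A decorator of this kind typically works by setting ASV timing attributes on the benchmark so that setup() runs again before each timed call. The sketch below is an illustration under assumptions, not the repository's implementation: `run_once_per_setup` and the `Collapse` class are hypothetical, and only ASV's documented `number` and `warmup_time` attributes are used.

```python
def run_once_per_setup(benchmark):
    """Limit a benchmark to one timed call per setup() (sketch)."""
    # One call per sample: ASV runs setup() once per sample, not per call.
    benchmark.number = 1
    # No warmup calls, which would otherwise mutate the object first.
    benchmark.warmup_time = 0.0
    return benchmark


class Collapse:
    def setup(self):
        self.values = [3, 1, 2]

    @run_once_per_setup
    def time_sort_in_place(self):
        # Mutates the object built in setup(); a second call on the same
        # object would time an already-sorted list instead.
        self.values.sort()
```

The trade-off is fewer timed calls per sample, so each measurement is noisier unless the number of repeats is raised to compensate.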

Environment

The checked-out commit of Iris contains parseable setup.py/pyproject.toml with standard Python packaging metadata for dependency extraction

If this fails: Environment preparation fails when checking out commits with non-standard build configurations, breaking benchmark runs for historical commits

benchmarks/asv_delegated.py:_prep_env_override

Contract

The current working directory is writable and has sufficient disk space for temporary 'tmp.nc' files during benchmark execution

If this fails: Save benchmarks fail with permission denied or disk full errors, and multiple concurrent benchmarks may overwrite each other's tmp.nc files

benchmarks/benchmarks/cperf/save.py:_save_data
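Writing into a private temporary directory removes both the dependence on the working directory and the risk of two benchmarks sharing one tmp.nc. This is a minimal sketch using only the standard library; `save_in_private_dir` is a hypothetical helper, and the lambda stands in for a real save call so that the example runs without Iris.

```python
import tempfile
from pathlib import Path


def save_in_private_dir(save_func):
    """Call save_func(path) with a 'tmp.nc' path in a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / "tmp.nc"
        save_func(target)
        # Read the size before the directory is cleaned up on exit.
        return target.stat().st_size


# Stand-in for a real save call so the sketch runs without Iris.
size = save_in_private_dir(lambda path: path.write_bytes(b"\x00" * 16))
print(size)  # 16
```

Disk-full errors can still occur, but they would then hit the system's temporary location rather than wherever the benchmark happened to be launched.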

Domain

The sample_mesh and sample_mesh_cube functions from iris.tests.stock.mesh can generate valid UGRID meshes for arbitrary n_cube sizes without topology constraints

If this fails: Mesh generation produces invalid or degenerate mesh topology for certain cube sizes, leading to benchmark failures or unrealistic performance measurements

benchmarks/benchmarks/generate_data/ugrid.py:generate_cube_like_2d_cubesphere

Temporal

Calling compute() on lazy coordinate points/bounds during benchmark execution doesn't permanently modify the coordinate objects for subsequent ASV reruns

If this fails: First benchmark run realizes coordinates, subsequent runs measure performance of already-realized coordinates, producing inconsistent and misleading timing results

benchmarks/benchmarks/cperf/load.py:time_load_w_realised_coords

See the full structural analysis of iris: the pipeline, data models, and system behavior that put these assumptions in context.


Frequently Asked Questions

What does iris assume that could break in production?

The one most likely to cause trouble: the DATA_GEN_PYTHON environment variable points to a Python executable that has all required dependencies (this repo, Mule, test modules) installed in its environment. If this fails, data generation functions fail with import errors or missing dependencies at runtime, corrupting benchmark datasets or causing benchmark failures.

How many hidden assumptions does iris have?

CodeSea found 13 assumptions iris relies on but never validates, 5 of them critical, spanning Contract, Temporal, Resource, Domain, Environment, Ordering, Scale. Most are routine; the analysis flags the 3 most likely to cause trouble.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails, often silently.