Hidden Assumptions in detectron2
11 assumptions this code never checks · 4 critical · spanning Shape, Domain, Contract, Scale, Resource, Temporal, Ordering, Environment
Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at facebookresearch/detectron2 and picked out the 3 most likely to cause trouble here; they come first.
The rest are minor; they're under "Show everything".
If the backbone produces feature maps with unexpected spatial ratios, FPN's lateral connections and top-down pathway will misalign features, causing detection heads to process spatially inconsistent features and produce wrong bounding box coordinates
If the input contains HDR images, 16-bit medical images, or images pre-normalized to the [0,1] range, the normalization will push pixel values far outside the expected range, causing backbone features to saturate and models to fail silently
If the RPN generates out-of-bounds proposals due to anchor misconfiguration or image resizing bugs, ROIAlign will sample outside the valid feature map, causing degenerate pooled features or corrupted gradients during training
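The shape and bounds checks these items call for can be sketched in plain Python. This is a minimal sketch, not detectron2 code: `check_pyramid_shapes` and `count_out_of_bounds` are hypothetical helpers, shapes are passed as `(height, width)` tuples rather than tensors, the level names in the usage note are illustrative, and it assumes the input image has already been padded so that every stride divides it evenly.

```python
from typing import Dict, Iterable, List, Tuple

Shape = Tuple[int, int]                    # (height, width)
Box = Tuple[float, float, float, float]    # (x0, y0, x1, y1)


def check_pyramid_shapes(
    image_hw: Shape,
    feature_hw: Dict[str, Shape],
    strides: Dict[str, int],
) -> List[str]:
    """Return human-readable problems; an empty list means the shapes line up."""
    problems = []
    img_h, img_w = image_hw
    for name, stride in strides.items():
        # Assumption: the image was padded to a multiple of every stride.
        if img_h % stride or img_w % stride:
            problems.append(f"image {image_hw} is not divisible by stride {stride} ({name})")
            continue
        expected = (img_h // stride, img_w // stride)
        actual = feature_hw.get(name)
        if actual != expected:
            problems.append(f"{name}: expected {expected} at stride {stride}, got {actual}")
    return problems


def count_out_of_bounds(boxes: Iterable[Box], image_hw: Shape) -> int:
    """Count XYXY boxes that leave the image or have non-positive width or height."""
    img_h, img_w = image_hw
    bad = 0
    for x0, y0, x1, y1 in boxes:
        if x0 < 0 or y0 < 0 or x1 > img_w or y1 > img_h or x1 <= x0 or y1 <= y0:
            bad += 1
    return bad
```

With strides `{"p2": 4, "p3": 8, "p4": 16, "p5": 32, "p6": 64}` and a 512x640 image, a level reported at the wrong resolution produces exactly one entry in the returned list, so the check can run once per batch before the features reach the detection heads.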
Show everything (8 more)
Training assumes balanced positive/negative anchor sampling with hardcoded ratios (positive_fraction=0.5, batch_size_per_image=256) but never checks if the dataset actually contains sufficient positive anchors
If this fails: On datasets with very small objects or sparse annotations, most images may have fewer than 128 positive anchors (256 x 0.5), so the sampler fills the remainder of the batch with negatives or falls back to fewer samples, leading to unstable gradients and poor convergence
detectron2/modeling/proposal_generator/rpn.py:RPN.forward_training
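The arithmetic behind the 128-anchor figure is simple enough to check per image before training. The sketch below is illustrative only: `positive_shortfall` and `starved_fraction` are hypothetical helpers, and the per-image positive-anchor counts are assumed to come from your own matching step. It makes no claim about what the sampler does with the shortfall.

```python
from typing import Iterable


def positive_shortfall(
    num_positive_anchors: int,
    batch_size_per_image: int = 256,
    positive_fraction: float = 0.5,
) -> int:
    """How many positives an image is missing relative to the sampling target."""
    target = int(batch_size_per_image * positive_fraction)  # 128 with the defaults
    return max(0, target - num_positive_anchors)


def starved_fraction(
    positive_counts: Iterable[int],
    batch_size_per_image: int = 256,
    positive_fraction: float = 0.5,
) -> float:
    """Fraction of images that cannot supply the target number of positives."""
    counts = list(positive_counts)
    if not counts:
        return 0.0
    starved = sum(
        1
        for n in counts
        if positive_shortfall(n, batch_size_per_image, positive_fraction) > 0
    )
    return starved / len(counts)
```

A high starved fraction over a sample of the training set suggests the default ratios do not fit the dataset, which is the situation this assumption leaves unchecked.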
DataLoader assumes sufficient GPU memory to hold batch_size * max_image_size * num_workers worth of preprocessed images, but never estimates or validates memory requirements
If this fails: Large images (>2000px) or high batch sizes can cause CUDA out-of-memory errors that manifest as cryptic RuntimeError messages mid-training, losing hours of training progress without clear memory usage guidance
detectron2/engine/defaults.py:DefaultTrainer.build_train_loader
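A rough lower bound on batch memory can be computed before the first iteration. This is a back-of-the-envelope sketch with hypothetical helper names; it counts only the float32 input tensor, while activations and gradients usually take far more, so a batch that fails this estimate will certainly not fit, but one that passes may still run out of memory.

```python
def batch_input_bytes(
    batch_size: int,
    height: int,
    width: int,
    channels: int = 3,
    bytes_per_value: int = 4,  # float32
) -> int:
    """Bytes needed to hold one padded input batch, and nothing else."""
    return batch_size * channels * height * width * bytes_per_value


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes for logging."""
    return num_bytes / (1024 ** 2)


def input_fits(batch_size: int, height: int, width: int, budget_bytes: int) -> bool:
    """True if the input batch alone stays within the given budget."""
    return batch_input_bytes(batch_size, height, width) <= budget_bytes
```

For example, 16 images padded to 2000x2000 need 768,000,000 bytes (about 732 MiB) for the inputs alone.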
Checkpoint loading assumes model architecture hasn't changed between save and load - specifically that all parameter names and shapes match exactly
If this fails: If the config changes backbone depth (ResNet50->ResNet101) or adds new heads between training runs, parameter names or shapes no longer match, and the resulting errors or warnings don't clearly indicate which architectural change caused the incompatibility
detectron2/checkpoint/checkpoint.py:Checkpointer.load
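One way to make such a mismatch legible is to diff parameter names and shapes before loading. The sketch below is a hypothetical helper, not part of detectron2; it works on plain dictionaries mapping parameter name to shape. With PyTorch those could be built from a module's `state_dict()`, for example `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. The parameter names in the usage note are made up for illustration.

```python
from typing import Dict, Tuple

ShapeMap = Dict[str, Tuple[int, ...]]


def diff_state_shapes(model_shapes: ShapeMap, checkpoint_shapes: ShapeMap) -> dict:
    """Group incompatibilities so the cause of a failed load is easier to spot."""
    model_keys = set(model_shapes)
    ckpt_keys = set(checkpoint_shapes)
    return {
        # Parameters the model expects but the checkpoint lacks (e.g. a new head).
        "missing": sorted(model_keys - ckpt_keys),
        # Parameters only the checkpoint has (e.g. layers of a deeper backbone).
        "unexpected": sorted(ckpt_keys - model_keys),
        # Same name, different shape (e.g. a changed number of classes).
        "mismatched": {
            key: (model_shapes[key], checkpoint_shapes[key])
            for key in sorted(model_keys & ckpt_keys)
            if model_shapes[key] != checkpoint_shapes[key]
        },
    }
```

Printing the three groups by name prefix (everything under a hypothetical `backbone.` or `roi_heads.` prefix, say) usually points straight at the part of the config that changed.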
DataLoader iteration assumes dataset records can be accessed in any order via __getitem__(index), but some dataset implementations may expect sequential access or have stateful transforms
If this fails: Multi-worker data loading with random sampling can break datasets that maintain internal state or cache, causing inconsistent augmentations or corrupted batches that lead to training instability
detectron2/data/build.py:build_detection_train_loader
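Whether a dataset tolerates arbitrary access order can be probed by reading the same indices in two different orders and comparing the results. This is a minimal sketch with hypothetical names (`is_order_independent`, `CountingDataset`); it only makes sense for the deterministic part of a pipeline, since records that pass through random augmentation will differ on every read unless the random seed is fixed.

```python
from typing import Any, Callable, Sequence


def is_order_independent(get_item: Callable[[int], Any], indices: Sequence[int]) -> bool:
    """Read the indices forwards, then backwards, and compare record by record."""
    forward = {i: get_item(i) for i in indices}
    backward = {i: get_item(i) for i in reversed(indices)}
    return forward == backward


class CountingDataset:
    """Toy stateful dataset: every read bumps a counter that leaks into the record."""

    def __init__(self) -> None:
        self.reads = 0

    def __getitem__(self, index: int):
        self.reads += 1
        return (index, self.reads)
```

Passing `CountingDataset().__getitem__` to the check returns `False`, because each record depends on how many reads came before it, which is the kind of hidden state that random sampling across workers exposes.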
All configs hardcode pixel normalization constants (pixel_mean=[103.530, 116.280, 123.675]) assuming BGR channel order and ImageNet statistics, but never validate actual dataset statistics
If this fails: If dataset uses RGB order, different camera sensors, or domain-specific images (medical, satellite), the hardcoded normalization will shift the data distribution, causing pretrained features to activate incorrectly and reducing model accuracy
configs/common/models/mask_rcnn_fpn.py:model.pixel_mean
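A cheap sanity check is to compare a sample of raw pixels against the configured mean before training. The sketch below is an assumption-laden illustration, not detectron2 code: `diagnose_pixels` is a hypothetical helper, pixels are plain 3-tuples in whatever channel order the loader produces, and the `tolerance` of 30 intensity levels is an arbitrary threshold chosen for the example.

```python
from typing import List, Sequence, Tuple

PIXEL_MEAN_BGR = (103.530, 116.280, 123.675)  # value quoted above


def channel_means(pixels: Sequence[Tuple[float, float, float]]) -> Tuple[float, ...]:
    """Per-channel mean over a sample of pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def diagnose_pixels(
    pixels: Sequence[Tuple[float, float, float]],
    pixel_mean: Tuple[float, float, float] = PIXEL_MEAN_BGR,
    tolerance: float = 30.0,  # hypothetical threshold, tune per dataset
) -> List[str]:
    """Return warnings about value range, channel order, and distance from pixel_mean."""
    if not pixels:
        raise ValueError("need at least one pixel")
    peak = max(max(p) for p in pixels)
    if peak <= 1.0:
        # Subtracting a mean near 100 from values in [0, 1] is almost certainly wrong.
        return ["values look pre-normalized to [0, 1]"]
    notes = []
    if peak > 255:
        notes.append("values exceed the 8-bit range")
    means = channel_means(pixels)
    as_is = sum(abs(m - e) for m, e in zip(means, pixel_mean))
    flipped = sum(abs(m - e) for m, e in zip(reversed(means), pixel_mean))
    if flipped < as_is:
        notes.append("channel order may be reversed relative to pixel_mean")
    if min(as_is, flipped) / 3 > tolerance:
        notes.append("channel means are far from pixel_mean")
    return notes
```

The channel-order test is only a heuristic: it fires when the sample means match the configured mean better after reversing the channels, which natural images in RGB order tend to do but domain-specific imagery may not.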
COCO evaluation assumes detection scores are well-calibrated probabilities in [0,1] range and uses fixed IoU thresholds (0.5:0.95) without checking score distribution
If this fails: Models that output uncalibrated confidence scores or use different output ranges may appear to perform poorly in evaluation even if spatial predictions are accurate, masking model quality issues
detectron2/evaluation/coco_evaluation.py:COCOEvaluator._eval_predictions
Image batching assumes all input tensors have the same number of channels (3 for RGB/BGR) and will pad spatial dimensions to match the largest image in the batch
If this fails: If batch contains grayscale (1-channel) or RGBA (4-channel) images mixed with RGB, tensor concatenation will fail with shape mismatch errors that don't clearly indicate the channel dimension issue
detectron2/structures/image_list.py:ImageList.from_tensors
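The batching rule described here (same channel count, spatial dimensions padded to the largest image) can be modeled on shapes alone, which also shows where a clearer error message could be raised. This is a simplified sketch with a hypothetical helper name; it works on `(channels, height, width)` tuples and does not model any extra padding to a size divisibility.

```python
from typing import Sequence, Tuple

ImageShape = Tuple[int, int, int]  # (channels, height, width)


def padded_batch_shape(shapes: Sequence[ImageShape]) -> Tuple[int, int, int, int]:
    """Return (N, C, H, W) for the padded batch, or fail with a channel-specific error."""
    if not shapes:
        raise ValueError("cannot batch an empty list of images")
    channels = {shape[0] for shape in shapes}
    if len(channels) != 1:
        # Name the channel dimension explicitly instead of a generic shape mismatch.
        raise ValueError(f"mixed channel counts in batch: {sorted(channels)}")
    max_h = max(shape[1] for shape in shapes)
    max_w = max(shape[2] for shape in shapes)
    return (len(shapes), channels.pop(), max_h, max_w)
```

Running this on the shapes of a batch before stacking turns a grayscale or RGBA image mixed into an RGB batch into an error that says so directly.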
Anchor generation assumes object scales follow COCO distribution with default sizes=[32, 64, 128, 256, 512] and aspect ratios=[0.5, 1.0, 2.0], but never adapts to actual dataset object statistics
If this fails: On datasets with very different object scales (e.g., microscopy with tiny objects or aerial imagery with large structures), the fixed anchor sizes will have poor recall, causing the detector to miss objects systematically
detectron2/modeling/anchor_generator.py:DefaultAnchorGenerator.forward
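How well fixed anchor sizes fit a dataset can be estimated from the ground-truth boxes alone. The sketch below uses a best-case simplification that is an assumption of this example, not detectron2's matcher: an object and an anchor are treated as concentric boxes with the same aspect ratio, so their IoU is the squared ratio of the smaller side to the larger. `best_case_iou` and `coverage` are hypothetical helpers, and the 0.5 threshold is an illustrative default.

```python
import math
from typing import Iterable, Sequence, Tuple

DEFAULT_SIZES = (32, 64, 128, 256, 512)  # sizes quoted above
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def best_case_iou(object_size: float, anchor_sizes: Sequence[float] = DEFAULT_SIZES) -> float:
    """Upper bound on IoU between an object of this size and any anchor size.

    For concentric boxes of sides s and a with equal aspect ratio, the
    intersection is min(s, a)^2 and the union is max(s, a)^2.
    """
    return max((min(object_size, a) / max(object_size, a)) ** 2 for a in anchor_sizes)


def coverage(
    boxes: Iterable[Box],
    anchor_sizes: Sequence[float] = DEFAULT_SIZES,
    iou_threshold: float = 0.5,  # hypothetical threshold for this sketch
) -> float:
    """Fraction of boxes that could reach the IoU threshold even in the best case."""
    boxes = list(boxes)
    if not boxes:
        return 0.0
    covered = 0
    for x0, y0, x1, y1 in boxes:
        area = (x1 - x0) * (y1 - y0)
        if area <= 0:
            continue  # degenerate boxes count as uncovered
        if best_case_iou(math.sqrt(area), anchor_sizes) >= iou_threshold:
            covered += 1
    return covered / len(boxes)
```

An 8x8 object scores 0.0625 against the smallest default anchor, so a dataset dominated by objects that small would show coverage near zero. Because this is an upper bound, low coverage is a strong signal while high coverage is not a guarantee.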
See the full structural analysis of detectron2: the pipeline, data models, and system behavior that put these assumptions in context.
Full analysis of facebookresearch/detectron2 →
Frequently Asked Questions
What does detectron2 assume that could break in production?
The one most likely to cause trouble: all input feature maps from the backbone are assumed to have spatial dimensions consistent with the expected strides (4, 8, 16, 32, 64), but FPN never validates that each level's height and width are exactly half those of the level before it. If this fails because the backbone produces feature maps with unexpected spatial ratios, FPN's lateral connections and top-down pathway will misalign features, causing detection heads to process spatially inconsistent features and produce wrong bounding box coordinates.
How many hidden assumptions does detectron2 have?
CodeSea found 11 assumptions detectron2 relies on but never validates, 4 of them critical, spanning Shape, Domain, Contract, Scale, Resource, Temporal, Ordering, Environment. Most are routine; the analysis flags the two or three most likely to actually bite.
What is a hidden assumption?
Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.