Hidden Assumptions in matomo

13 assumptions this code never checks · 5 critical · spanning Contract, Temporal, Scale, Shape, Resource, Ordering, Domain, Environment

Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at matomo-org/matomo and picked out the 3 most likely to cause trouble here. The rest are minor; they're under "Show everything".

Worth your attention first

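The JavaScript tracking code is assumed to always send requests with the required parameters (idsite, rec=1), but the Tracker doesn't validate their presence before processing. A missing idsite parameter would cause database insertion failures or tracking data attributed to the wrong sites, while a missing rec parameter might skip tracking entirely without a clear error message.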
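As a rough illustration of checking for idsite and rec before any processing happens, here is a minimal Python sketch. It is not Matomo's code (Matomo is written in PHP), and the function name `validate_tracking_query` and the error strings are made up for this example.

```python
from urllib.parse import parse_qs


def validate_tracking_query(query_string: str) -> list[str]:
    """Return a list of problems found in a tracking request's query string.

    Hypothetical helper: the name and messages are assumptions for this sketch.
    """
    params = parse_qs(query_string, keep_blank_values=True)
    problems = []

    # idsite must be present and parse as a positive integer.
    idsite = params.get("idsite", [""])[0]
    if not (idsite.isascii() and idsite.isdigit() and int(idsite) > 0):
        problems.append("idsite missing or not a positive integer")

    # rec must be exactly "1" for the request to count as a tracking request.
    if params.get("rec", [""])[0] != "1":
        problems.append("rec missing or not equal to 1")

    return problems
```

An empty list means the request carries both parameters; anything else can be logged or rejected with an explicit error instead of failing later at insertion time.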

Multiple CronArchive instances could process the same data simultaneously, leading to duplicate calculations, race conditions in archive table updates, or incomplete/corrupted aggregated reports
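One common way to keep two archiving runs from overlapping is an exclusive lock file. The Python sketch below shows the idea only; it is not how Matomo's CronArchive works, `single_instance` and the lock path are hypothetical, and it does not handle stale locks left behind by a crashed process.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Yield True if this process acquired the lock, False if another holds it."""
    try:
        # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        fd = None

    if fd is None:
        # Someone else owns the lock; do not touch their file.
        yield False
        return

    try:
        os.write(fd, str(os.getpid()).encode())
        yield True
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A caller would wrap the archiving run in `with single_instance(path) as acquired:` and skip the run when `acquired` is False.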

Sites with millions of daily visits would cause archiving queries to exceed PHP memory limits or hit MySQL query timeouts, resulting in incomplete archives and missing report data
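A general technique for keeping memory bounded on very large inputs is to aggregate page by page instead of loading everything at once. The following Python sketch assumes a hypothetical `fetch_page(after_id, limit)` callable that returns rows with an `id` greater than `after_id`, ordered by `id`; nothing here reflects Matomo's actual archiving queries.

```python
from collections import Counter


def aggregate_in_pages(fetch_page, page_size=1000):
    """Count rows per label while holding only one page in memory at a time.

    fetch_page is a hypothetical callable (after_id, limit) -> list of dicts,
    each with "id" and "label" keys, ordered by ascending id.
    """
    totals = Counter()
    last_id = 0
    while True:
        page = fetch_page(last_id, page_size)
        if not page:
            return totals
        for row in page:
            totals[row["label"]] += 1
        # Keyset pagination: continue after the last id seen.
        last_id = page[-1]["id"]
```

Memory use then depends on the page size and the number of distinct labels, not on the total number of visits.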

Show everything (10 more)
Shape

Archive data retrieved from archive_blob tables contains properly serialized DataTable objects with expected column structure (label, nb_visits, nb_actions, etc.)

If this fails: Corrupted or differently-structured archived data would cause report generation to fail silently or display wrong metrics, especially after plugin updates that change report schemas

plugins/*/Reports/*.php:DataTable construction
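A shape check on deserialized rows could look roughly like this Python sketch. The column names come from the assumption above; the function name and the plain-dict row representation are assumptions made for illustration and are not Matomo's DataTable API.

```python
REQUIRED_COLUMNS = {"label", "nb_visits", "nb_actions"}


def find_malformed_rows(rows):
    """Return (index, missing_columns) for each row lacking a required column."""
    malformed = []
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            # A non-dict row is missing every required column.
            malformed.append((index, sorted(REQUIRED_COLUMNS)))
            continue
        missing = REQUIRED_COLUMNS - row.keys()
        if missing:
            malformed.append((index, sorted(missing)))
    return malformed
```

Reporting the offending row index and columns turns a silent wrong metric into a visible error.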
Resource

MySQL database remains available and responsive throughout request processing, with no connection pooling or retry logic for temporary network issues

If this fails: Database connection drops during long-running archiving jobs would cause partial data loss and require manual recovery, while connection issues during tracking would result in lost visitor data

core/Db/Adapter.php:MySQL connection handling
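Retry logic for transient connection failures is usually a small wrapper with exponential backoff. This Python sketch is illustrative only: `with_retries` is a hypothetical helper, and `ConnectionError` stands in for whatever exception a real database driver would raise.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts: let the caller see the failure.
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the wrapper can be tested without waiting.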
Ordering

Archive invalidation events are processed in the order they're created, ensuring dependent archives are recalculated after their dependencies

If this fails: Out-of-order invalidation could cause child archives to be recalculated with stale parent data, leading to inconsistent report hierarchies and incorrect drill-down analytics

core/Archive/ArchiveInvalidator.php:invalidation processing
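If processing order cannot be guaranteed by creation time, one option is to derive it from the dependencies themselves. The Python sketch below uses the standard library's `graphlib.TopologicalSorter`; the archive names and the dependency map are hypothetical and do not describe ArchiveInvalidator's real data.

```python
from graphlib import TopologicalSorter


def recalculation_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return archive names so each one comes after everything it depends on.

    depends_on maps an archive to the set of archives it is built from.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

For example, with `{"child": {"parent"}, "grandchild": {"child"}, "parent": set()}` the parent is recalculated first, then the child, then the grandchild.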
Domain

All visitor IP addresses are valid IPv4/IPv6 addresses that exist in the GeoIP2 database, without handling for private networks, VPNs, or proxy servers

If this fails: Corporate users behind NAT or VPN would be geolocated to incorrect countries, while IPv6 addresses might fail lookup entirely, skewing geographic reports

plugins/GeoIp2/:IP geolocation
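Private and malformed addresses can be separated out before any geolocation lookup. This Python sketch uses the standard `ipaddress` module; the function name and the three category labels are assumptions for this example, and it says nothing about VPN or proxy detection, which needs external data.

```python
import ipaddress


def classify_ip(raw: str) -> str:
    """Classify an address as "invalid", "non-routable", or "public"."""
    try:
        ip = ipaddress.ip_address(raw)
    except ValueError:
        return "invalid"
    # Private, loopback, and link-local addresses have no meaningful location.
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "non-routable"
    return "public"
```

Only addresses classified as public would be passed on to a location lookup; the others could be reported as "unknown" rather than guessed.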
Contract

Plugin classes contain methods matching the API request format (PluginName.methodName) and return DataTable objects or primitive values

If this fails: Plugin methods that return unexpected types or throw exceptions would cause API responses to fail without helpful error messages, breaking dashboard widgets and report displays

core/API/Proxy.php:plugin method routing
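A return-type contract can be enforced at the point where the method is dispatched. In this Python sketch, `DataTable` is an empty stand-in class and `call_checked` is a hypothetical wrapper; neither mirrors the real code in core/API/Proxy.php.

```python
class DataTable:
    """Hypothetical stand-in for a report table type."""


ALLOWED_RETURN_TYPES = (DataTable, str, int, float, bool)


def call_checked(method, *args):
    """Call method(*args) and fail loudly if the result breaks the contract."""
    result = method(*args)
    if not isinstance(result, ALLOWED_RETURN_TYPES):
        name = getattr(method, "__name__", repr(method))
        raise TypeError(
            f"{name} returned {type(result).__name__}; "
            "expected a DataTable or a primitive value"
        )
    return result
```

The error message names the offending method and type, which is the information the failure mode above says is missing.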
Environment

All plugin directories contain valid plugin.json files with required metadata (name, version, php version) and PHP files are syntactically correct

If this fails: Malformed plugin files would cause the entire plugin system to fail loading, potentially breaking the entire Matomo installation during startup

core/Plugin/Manager.php:plugin loading
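Validating metadata per plugin, and skipping only the broken one, is a way to stop a single malformed file from taking everything down. The Python sketch below checks only the `name` and `version` keys; the exact key that carries the PHP version requirement is not given in the text above, so it is deliberately left out. The function name is hypothetical.

```python
import json

REQUIRED_KEYS = ("name", "version")


def check_plugin_metadata(text: str) -> list[str]:
    """Return a list of problems found in the text of a plugin.json file."""
    try:
        meta = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(meta, dict):
        return ["top-level value is not an object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
```

A loader could call this for each plugin directory, log the problems, and continue with the remaining plugins.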
Temporal

User's browser clock is reasonably accurate for timestamp generation, and tracking requests are sent within a reasonable time window of the actual page view

If this fails: Users with significantly wrong system clocks would generate tracking data with incorrect timestamps, skewing time-based reports and making visitor session reconstruction unreliable

matomo.js:JavaScript tracking
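One defensive option is to trust a client timestamp only when it falls within a window around the server's own clock. This Python sketch shows that rule; the function name and the one-day default window are assumptions chosen for illustration, not values taken from Matomo.

```python
def effective_timestamp(client_ts: float, server_ts: float,
                        max_skew: float = 86400.0) -> float:
    """Return client_ts if it is within max_skew seconds of server_ts.

    Otherwise fall back to server_ts, treating the client clock as unreliable.
    """
    if abs(client_ts - server_ts) > max_skew:
        return server_ts
    return client_ts
```

The window trades accuracy for robustness: a tighter window rejects more skewed clocks but also more legitimately delayed requests.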
Scale

DataTable objects contain reasonable numbers of rows that can be processed in memory for sorting, filtering, and formatting operations

If this fails: Reports with hundreds of thousands of rows (like detailed page URLs or search keywords) would consume excessive memory during filtering, potentially causing PHP fatal errors

core/DataTable/DataTable.php:filtering operations
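When a report only needs its top rows, a bounded selection avoids sorting the whole table in memory. This Python sketch uses the standard library's `heapq.nlargest`; the function name and the dict-based rows are assumptions, and it is not a description of how DataTable filters work.

```python
import heapq


def top_rows(rows, limit, column="nb_visits"):
    """Return the `limit` rows with the largest value in `column`.

    rows may be any iterable, including a generator, so the full table never
    has to be materialized; only about `limit` rows are held at once.
    """
    return heapq.nlargest(limit, rows, key=lambda row: row[column])
```

Because the input can be a generator, rows can be streamed from storage and discarded as soon as they fall out of the top set.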
Shape

Vue components receive props matching the expected VisualizationData structure from API responses, with specific report metadata properties

If this fails: API schema changes or missing metadata properties would cause Vue components to render incorrectly or crash, breaking dashboard visualizations without clear debugging information

plugins/*/vue/src/index.ts:Vue component exports
Domain

User-Agent strings follow standard browser/bot patterns that can be reliably classified using DeviceDetector, without considering new bot types or user-agent spoofing

If this fails: New bots or sophisticated scrapers with human-like user-agents would be counted as legitimate visitors, inflating visitor metrics and skewing analytics data

core/Tracker/:bot detection

See the full structural analysis of matomo: the pipeline, data models, and system behavior that put these assumptions in context.


Frequently Asked Questions

What does matomo assume that could break in production?

The one most likely to cause trouble: the JavaScript tracking code is assumed to always send HTTP requests with the required parameters (idsite, rec=1) and proper encoding, but the Tracker doesn't validate parameter presence before processing. If this fails, a missing idsite parameter would cause database insertion failures or tracking data attributed to the wrong sites, while a missing rec parameter might skip tracking entirely without a clear error message.

How many hidden assumptions does matomo have?

CodeSea found 13 assumptions matomo relies on but never validates, 5 of them critical, spanning Contract, Temporal, Scale, Shape, Resource, Ordering, Domain, and Environment. Most are routine; the analysis flags the 3 most likely to cause trouble.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.