Hidden Assumptions in openstatus
13 assumptions this code never checks · 4 critical · spanning Environment, Temporal, Resource, Scale, Contract, Ordering, Domain
Every codebase relies on things it never checks, and most of them are routine. CodeSea analyzed openstatushq/openstatus and picked out the 3 assumptions most likely to cause trouble. They are listed first; the remaining 10 are minor and appear under "Show everything".
If the API key is invalid, expired, or lacks permissions, all monitor checks fail silently without alerting operators: the checker appears to run but produces no results
Screenshot capture fails silently during high incident volume when multiple Chromium instances exhaust container memory, leaving incidents without visual evidence
When Railway adds new regions or changes region identifiers, requests are routed to an undefined targetUrl, causing a panic that takes down the entire proxy service
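The region failure comes down to a lookup whose result is never checked. A minimal sketch of a guarded lookup follows. The region names, URLs, and the decideRoute function are hypothetical, invented for illustration; the report does not show the proxy's actual region table or implementation language.

```typescript
// Hypothetical region table; the real identifiers and URLs are not shown in the report.
const REGION_TARGETS = new Map<string, string>([
  ["us-west", "https://us-west.example.internal"],
  ["eu-west", "https://eu-west.example.internal"],
]);

interface ProxyDecision {
  status: number;
  targetUrl: string | null;
  message: string;
}

// Look up the target for a region and return an explicit error decision
// instead of letting an undefined URL reach the request path.
function decideRoute(
  region: string,
  targets: Map<string, string> = REGION_TARGETS,
): ProxyDecision {
  const targetUrl = targets.get(region);
  if (targetUrl === undefined) {
    return { status: 502, targetUrl: null, message: `unknown region: ${region}` };
  }
  return { status: 200, targetUrl, message: "ok" };
}
```

With a guard like this, an unrecognized region produces one failed request with a clear message rather than a crash of the whole service.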
Show everything (10 more)
Monitor configuration updates can wait up to 10 minutes to be picked up by checker agents
If this fails: Critical monitors added during outages won't be checked for up to 10 minutes, and disabled monitors continue running unnecessary checks, wasting resources and potentially triggering false alerts
apps/checker/cmd/private/main.go:configRefreshInterval
MonitorResult payloads from regional checkers always include cronTimestamp as Unix milliseconds in the same timezone
If this fails: If checkers send timestamps in different formats or timezones, incident timing becomes corrupted, causing false recovery notifications and incorrect SLA calculations
apps/workflows/src/checker/index.ts:payloadSchema
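Unix timestamps are UTC-based by definition, so in practice the risk is a unit or format mismatch, such as seconds instead of milliseconds or an ISO string. One way to catch that is a plausibility check before the value is used. This is a sketch only: parseCronTimestamp and the 24-hour tolerance are assumptions, not part of the openstatus payloadSchema.

```typescript
// Assumed tolerance: reject timestamps more than 24 hours away from "now".
const MAX_SKEW_MS = 24 * 60 * 60 * 1000;

// Return the timestamp if it looks like Unix milliseconds close to nowMs,
// otherwise null. A value in seconds is off by a factor of 1000 and
// therefore falls far outside the allowed skew.
function parseCronTimestamp(
  value: unknown,
  nowMs: number,
  maxSkewMs: number = MAX_SKEW_MS,
): number | null {
  if (typeof value !== "number" || !Number.isInteger(value)) {
    return null;
  }
  if (Math.abs(value - nowMs) > maxSkewMs) {
    return null;
  }
  return value;
}
```

A caller would reject or quarantine the payload when the result is null instead of computing incident durations from it.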
Only one incident can be open per monitor at any given time
If this fails: If multiple incident creation requests race during rapid status changes, duplicate incidents are created but only one gets resolved, leaving phantom open incidents that block future incident creation
apps/workflows/src/checker/index.ts:findOpenIncident
Custom domain hostnames always have at least 3 segments separated by dots, with the subdomain as the first segment
If this fails: Status pages hosted on unusual domains (e.g., single-level domains, IPv6 addresses, or domains with multiple subdomain levels) are misrouted, showing wrong status pages or 404 errors to customers
apps/status-page/src/lib/resolve-route.ts:resolveRoute
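A defensive version of the segment check might look like the sketch below. extractSubdomain is a hypothetical helper, not the actual resolveRoute logic; it returns null for hostnames that do not fit the 3-segment shape so the caller can fall back explicitly instead of misrouting.

```typescript
// Return the first label of a hostname with at least 3 labels, or null
// when the hostname does not fit that shape. Expects a bare hostname
// without a port; anything containing ":" is treated as an IPv6 literal.
function extractSubdomain(hostname: string): string | null {
  const host = hostname.trim().toLowerCase();
  if (host.includes(":") || host.startsWith("[")) {
    return null; // IPv6 literal or host:port, not a subdomain-style name
  }
  const labels = host.split(".");
  if (labels.some((label) => label.length === 0)) {
    return null; // empty label, e.g. "a..b" or a trailing dot
  }
  if (labels.every((label) => /^\d+$/.test(label))) {
    return null; // all-numeric labels, i.e. an IPv4 address
  }
  if (labels.length < 3) {
    return null; // apex domain or single-label host
  }
  return labels[0] ?? null;
}
```

Hostnames with more than one subdomain level still return only the first label here, so a real resolver would need an extra rule for those.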
OAuth profile objects from Google and GitHub providers always contain expected fields (given_name, family_name, picture, avatar_url)
If this fails: Authentication succeeds but user profile updates fail silently when OAuth providers change their response schema, leaving users with incomplete profiles and broken avatars
apps/dashboard/src/lib/auth/index.ts:signIn
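Missing profile fields can be handled with explicit fallbacks instead of being trusted. The sketch below uses only the field names listed above; ProfileUpdate, readString, and toProfileUpdate are hypothetical names, and the real signIn callback may be structured differently.

```typescript
interface ProfileUpdate {
  firstName: string | null;
  lastName: string | null;
  photoUrl: string | null;
}

// Read a non-empty string field, or null if it is absent or has another type.
function readString(source: Record<string, unknown>, key: string): string | null {
  const value = source[key];
  return typeof value === "string" && value.length > 0 ? value : null;
}

// Build a profile update from an untrusted provider profile.
// "picture" is preferred, with "avatar_url" as the fallback.
function toProfileUpdate(profile: Record<string, unknown>): ProfileUpdate {
  return {
    firstName: readString(profile, "given_name"),
    lastName: readString(profile, "family_name"),
    photoUrl: readString(profile, "picture") ?? readString(profile, "avatar_url"),
  };
}
```

Null fields make the gap visible to the caller, which can then skip the update or log the schema change instead of failing silently.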
Screenshot filenames using Date.now() are globally unique across all incident captures
If this fails: Simultaneous incident screenshots for the same incident ID overwrite each other in R2 storage, leaving only the last screenshot and losing evidence of the incident progression
apps/screenshot-service/src/index.ts:Date.now
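Two captures in the same millisecond get the same Date.now() value, so a timestamp alone cannot guarantee a unique key. A small sketch of one mitigation is a per-process sequence number appended to the timestamp. The key layout and createKeyGenerator are hypothetical, and this only removes collisions within a single process; captures from several instances would additionally need a random or instance-specific suffix.

```typescript
// Create a key generator that appends an ever-increasing sequence number,
// so two calls in the same millisecond still produce different keys.
// The clock is injectable to keep the function testable.
function createKeyGenerator(now: () => number = Date.now): (incidentId: string) => string {
  let sequence = 0;
  return (incidentId: string): string => {
    const key = `${incidentId}/${now()}-${sequence}.png`;
    sequence += 1;
    return key;
  };
}
```

For example, with the clock frozen at 1000, two calls for incident "42" yield "42/1000-0.png" and "42/1000-1.png".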
AXIOM_TOKEN environment variable provides unlimited log ingestion quota
If this fails: When Axiom quota is exceeded, all application logging silently stops without fallback, making debugging production issues impossible during high-traffic periods
apps/server/src/index.ts:configure
All cleanup operations complete within the hardcoded 5-second shutdown timeout
If this fails: Long-running monitor checks or database connections are forcibly terminated during deployment, potentially corrupting incident state or losing check results
apps/private-location/cmd/server/main.go:gracefulShutdown
CRON_SECRET environment variable is kept secret and never logged or exposed in error messages
If this fails: If the cron secret leaks, attackers can trigger expensive checker tasks and email campaigns, overwhelming external APIs and inflating cloud costs
apps/workflows/src/cron/index.ts:env().CRON_SECRET
20% sampling rate for successful requests is sufficient for observability without overwhelming log storage
If this fails: During incidents affecting successful requests, the low sampling rate may miss critical debugging information, while high-traffic services generate massive log volumes that exceed storage budgets
apps/checker/cmd/server/main.go:shouldSample
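The sampling policy can be written as a small decision function. The referenced code is Go; this TypeScript sketch only illustrates the policy. The 20% rate comes from the description above, while always keeping error responses and the 400 threshold are assumptions that the report does not confirm.

```typescript
// Rate for successful requests, taken from the description above.
const SUCCESS_SAMPLE_RATE = 0.2;

// Decide whether to log a request. Errors are always kept (an assumption);
// successes are kept with probability successRate. The random source is
// injectable so the decision can be tested deterministically.
function shouldSample(
  statusCode: number,
  random: () => number = Math.random,
  successRate: number = SUCCESS_SAMPLE_RATE,
): boolean {
  if (statusCode >= 400) {
    return true;
  }
  return random() < successRate;
}
```

Making successRate a parameter would let operators raise it during an incident without a code change, which addresses the failure mode described above.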
See the full structural analysis of openstatus: the pipeline, data models, and system behavior that put these assumptions in context.
Full analysis of openstatushq/openstatus →
Frequently Asked Questions
What does openstatus assume that could break in production?
The one most likely to cause trouble: the OPENSTATUS_KEY environment variable is assumed to contain a valid API key that never expires and has sufficient permissions. If the key is invalid, expired, or lacks permissions, all monitor checks fail silently without alerting operators: the checker appears to run but produces no results.
How many hidden assumptions does openstatus have?
CodeSea found 13 assumptions openstatus relies on but never validates, 4 of them critical, spanning Environment, Temporal, Resource, Scale, Contract, Ordering, Domain. Most are routine; the analysis flags the 3 most likely to cause trouble.
What is a hidden assumption?
Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.