Hidden Assumptions in openstatus

13 assumptions this code never checks · 4 critical · spanning Environment, Temporal, Resource, Scale, Contract, Ordering, Domain

Every codebase relies on things it never checks, and most of them are routine. CodeSea analyzed openstatushq/openstatus and picked out the 3 assumptions most likely to cause trouble. The other 10 are minor and listed under "Show everything".

Worth your attention first

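The code assumes the OPENSTATUS_KEY environment variable contains a valid API key that never expires and has sufficient permissions. If the key is invalid, expired, or lacks permissions, all monitor checks fail silently without alerting operators: the checker appears to run but produces no results.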
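
One way to make a credential failure visible is to classify the response to each result submission instead of treating every non-success the same way. The sketch below is illustrative only: the classifyIngestStatus helper and its outcome names are hypothetical, and it assumes the API reports key problems with the standard HTTP 401 and 403 status codes, which this analysis does not confirm.

```typescript
type IngestOutcome = "ok" | "credential_problem" | "retryable" | "rejected";

// Map an HTTP status code to an action category. A credential problem should
// raise an operator alert, because retrying with the same key cannot succeed.
function classifyIngestStatus(status: number): IngestOutcome {
  if (status >= 200 && status < 300) return "ok";
  if (status === 401 || status === 403) return "credential_problem";
  if (status === 429 || status >= 500) return "retryable";
  return "rejected";
}
```

A checker could run the same classification once at startup against any authenticated endpoint and refuse to start on "credential_problem".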

Screenshot capture fails silently during high incident volume when multiple Chromium instances exhaust container memory, leaving incidents without visual evidence

When Railway adds new regions or changes region identifiers, requests route to undefined targetUrl causing panic, taking down the entire proxy service
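
An unknown region can be treated as an ordinary, recoverable case rather than something that reaches the request path as an undefined URL. The following is a minimal sketch in TypeScript, not the proxy's actual code: the region names, URLs, and the resolveTarget helper are all hypothetical.

```typescript
// Hypothetical region table; real identifiers and URLs would come from config.
const REGION_TARGETS = new Map<string, string>([
  ["us-west", "https://us-west.example.internal"],
  ["eu-west", "https://eu-west.example.internal"],
]);

type RouteResult =
  | { ok: true; targetUrl: string }
  | { ok: false; reason: string };

// Look up the target for a region and report unknown regions explicitly
// instead of passing an undefined URL further down the request path.
function resolveTarget(region: string): RouteResult {
  const targetUrl = REGION_TARGETS.get(region);
  if (targetUrl === undefined) {
    return { ok: false, reason: `unknown region: ${region}` };
  }
  return { ok: true, targetUrl };
}
```

A caller can turn the failure branch into an error response for that one request, so a new or renamed region degrades a single route instead of the whole service.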

Show everything (10 more)

Temporal

Monitor configuration updates can wait up to 10 minutes to be picked up by checker agents

If this fails: Critical monitors added during outages won't be checked for up to 10 minutes, and disabled monitors continue running unnecessary checks, wasting resources and potentially triggering false alerts

apps/checker/cmd/private/main.go:configRefreshInterval

Contract

MonitorResult payloads from regional checkers always include cronTimestamp as Unix epoch milliseconds, which are UTC-based, rather than in another format or as a local-time value

If this fails: If checkers send timestamps in different formats or timezones, incident timing becomes corrupted, causing false recovery notifications and incorrect SLA calculations

apps/workflows/src/checker/index.ts:payloadSchema
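
A plausibility check on the timestamp can catch the most common format mix-up, seconds sent where milliseconds are expected. This is a sketch under stated assumptions: the bounds and the isPlausibleUnixMillis helper are hypothetical and not part of the payloadSchema referenced above.

```typescript
// Assumed lower bound: 1e12 ms is 2001-09-09 UTC. Present-day timestamps
// expressed in seconds are around 1.7e9, so they fall far below this bound.
const MIN_PLAUSIBLE_MS = 1e12;
// Hypothetical tolerance for clock drift between checker and server.
const MAX_FUTURE_SKEW_MS = 5 * 60 * 1000;

// Accept only integer millisecond values that are neither implausibly old
// nor further in the future than the allowed skew.
function isPlausibleUnixMillis(value: unknown, nowMs: number): boolean {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value >= MIN_PLAUSIBLE_MS &&
    value <= nowMs + MAX_FUTURE_SKEW_MS
  );
}
```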

Ordering

Only one incident can be open per monitor at any given time

If this fails: If multiple incident creation requests race during rapid status changes, duplicate incidents are created but only one gets resolved, leaving phantom open incidents that block future incident creation

apps/workflows/src/checker/index.ts:findOpenIncident
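
The one-open-incident rule holds under races only if creation is a single atomic "open or get" step rather than a find followed by a separate insert. The sketch below shows that shape with an in-memory map. It is an illustration, not the project's code: the OpenIncidents class and its methods are hypothetical, and in a real database the same guarantee would more likely come from a unique constraint or a transaction.

```typescript
interface Incident {
  id: number;
  monitorId: string;
  startedAt: number;
}

// In-memory stand-in for a store that enforces one open incident per monitor.
class OpenIncidents {
  private byMonitor = new Map<string, Incident>();
  private nextId = 1;

  // Returns the existing open incident if there is one, so repeated calls
  // for the same monitor never create a duplicate.
  openOrGet(monitorId: string, startedAt: number): Incident {
    const existing = this.byMonitor.get(monitorId);
    if (existing !== undefined) return existing;
    const incident: Incident = { id: this.nextId++, monitorId, startedAt };
    this.byMonitor.set(monitorId, incident);
    return incident;
  }

  // Resolving removes the open incident; returns false if none was open.
  resolve(monitorId: string): boolean {
    return this.byMonitor.delete(monitorId);
  }
}
```

Because the lookup and the insert happen in one synchronous call, two status changes arriving back to back get the same incident instead of two.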

Domain

Custom domain hostnames always have at least 3 segments separated by dots, with the subdomain as the first segment

If this fails: Status pages hosted on unusual domains (e.g., single-level domains, IPv6 addresses, or domains with multiple subdomain levels) are misrouted, showing wrong status pages or 404 errors to customers

apps/status-page/src/lib/resolve-route.ts:resolveRoute
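
Instead of counting dot-separated segments, a resolver can match the hostname against a known base domain and reject anything it does not recognize. The sketch below illustrates that alternative. It is not the logic in resolveRoute: the classifyHost helper, the baseDomain parameter, and the result shape are hypothetical.

```typescript
type HostKind =
  | { kind: "subdomain"; slug: string }
  | { kind: "unsupported"; reason: string };

// baseDomain is a hypothetical parameter such as "example.com". The slug is
// everything in front of it, so "a.b.example.com" is rejected rather than
// silently routed by its first segment.
function classifyHost(hostname: string, baseDomain: string): HostKind {
  // Normalize case and drop a trailing dot from fully qualified names.
  const host = hostname.toLowerCase().replace(/\.$/, "");
  const suffix = "." + baseDomain.toLowerCase();
  if (!host.endsWith(suffix)) {
    return { kind: "unsupported", reason: "not under base domain" };
  }
  const slug = host.slice(0, host.length - suffix.length);
  if (slug === "" || slug.includes(".")) {
    return { kind: "unsupported", reason: "expected exactly one subdomain label" };
  }
  return { kind: "subdomain", slug };
}
```

Hosts that come back as "unsupported" can then go to a separate custom-domain lookup or an explicit 404, rather than being guessed at.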

Environment

OAuth profile objects from Google and GitHub providers always contain expected fields (given_name, family_name, picture, avatar_url)

If this fails: Authentication succeeds but user profile updates fail silently when OAuth providers change their response schema, leaving users with incomplete profiles and broken avatars

apps/dashboard/src/lib/auth/index.ts:signIn
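
Treating the provider profile as untyped input and reading each field defensively keeps a schema change from breaking the profile update. A minimal sketch follows, assuming the update only needs a name and a photo URL. The ProfileUpdate shape and both helpers are hypothetical; only the four field names come from the assumption above.

```typescript
interface ProfileUpdate {
  firstName: string | null;
  lastName: string | null;
  photoUrl: string | null;
}

// Return the value only if it is a non-empty string; otherwise null.
function optionalString(value: unknown): string | null {
  return typeof value === "string" && value.trim() !== "" ? value : null;
}

// Accept an untyped provider profile and tolerate missing or renamed fields.
// Which provider sends which field is deliberately not assumed here.
function toProfileUpdate(profile: unknown): ProfileUpdate {
  const p = (typeof profile === "object" && profile !== null ? profile : {}) as Record<string, unknown>;
  return {
    firstName: optionalString(p["given_name"]),
    lastName: optionalString(p["family_name"]),
    photoUrl: optionalString(p["picture"]) ?? optionalString(p["avatar_url"]),
  };
}
```

A null field then becomes an explicit "keep the stored value" or "show a placeholder" decision for the caller, not a silent failure.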

Temporal

Screenshot filenames using Date.now() are globally unique across all incident captures

If this fails: Simultaneous incident screenshots for the same incident ID overwrite each other in R2 storage, leaving only the last screenshot and losing evidence of the incident progression

apps/screenshot-service/src/index.ts:Date.now
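
Date.now() has millisecond resolution, so two captures in the same millisecond produce the same value. One way to avoid the collision is to append a per-process counter and a random suffix to the timestamp. The sketch below is illustrative: the screenshotKey helper and the key layout are hypothetical and do not reflect the service's actual naming scheme.

```typescript
// Per-process counter; it separates captures taken in the same millisecond.
let screenshotSequence = 0;

// Combine the timestamp with the counter and a random suffix. The random part
// reduces the chance of collisions between separate processes.
function screenshotKey(incidentId: number, nowMs: number): string {
  screenshotSequence = (screenshotSequence + 1) % 1000000;
  const random = Math.floor(Math.random() * 0x100000000)
    .toString(16)
    .padStart(8, "0");
  return `incidents/${incidentId}/${nowMs}-${screenshotSequence}-${random}.png`;
}
```

The timestamp stays at the front of the file name, so keys for one incident still sort in capture order.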

Resource

AXIOM_TOKEN environment variable provides unlimited log ingestion quota

If this fails: When Axiom quota is exceeded, all application logging silently stops without fallback, making debugging production issues impossible during high-traffic periods

apps/server/src/index.ts:configure

Contract

All cleanup operations complete within the hardcoded 5-second shutdown timeout

If this fails: Long-running monitor checks or database connections are forcibly terminated during deployment, potentially corrupting incident state or losing check results

apps/private-location/cmd/server/main.go:gracefulShutdown

Environment

CRON_SECRET environment variable is kept secret and never logged or exposed in error messages

If this fails: If the cron secret leaks, attackers can trigger expensive checker tasks and email campaigns, overwhelming external APIs and inflating cloud costs

apps/workflows/src/cron/index.ts:env().CRON_SECRET
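
A small redaction step in front of the logger reduces the chance that the secret ends up in a log line or error message. This is a minimal sketch: redactSecret is a hypothetical helper, and it only covers the secret appearing verbatim, not encoded or truncated forms.

```typescript
// Replace every occurrence of the secret in a message before it is logged.
// An empty secret is ignored, since splitting on "" would mangle the message.
function redactSecret(message: string, secret: string): string {
  if (secret === "") return message;
  return message.split(secret).join("[REDACTED]");
}
```

For example, redactSecret("bad token abc123", "abc123") yields "bad token [REDACTED]".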

Scale

20% sampling rate for successful requests is sufficient for observability without overwhelming log storage

If this fails: During incidents affecting successful requests, the low sampling rate may miss critical debugging information, while high-traffic services generate massive log volumes that exceed storage budgets

apps/checker/cmd/server/main.go:shouldSample
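
The trade-off is easier to tune when the sampling rate is a parameter and failures bypass sampling entirely. The sketch below is written in TypeScript for consistency with the other examples, although the referenced file is Go. The sampleRequest helper is hypothetical, and always keeping responses with status 400 and above is an assumption, not confirmed behavior of shouldSample.

```typescript
// Decide whether to log a request. Errors are always kept; successes are kept
// with probability successRate. The random source is injectable for testing.
function sampleRequest(
  statusCode: number,
  successRate: number = 0.2,
  random: () => number = Math.random
): boolean {
  if (statusCode >= 400) return true;
  return random() < successRate;
}
```

Raising successRate temporarily during an incident trades log volume for detail without a code change.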

See the full structural analysis of openstatus: the pipeline, data models, and system behavior that put these assumptions in context.

Full analysis of openstatushq/openstatus →

Frequently Asked Questions

What does openstatus assume that could break in production?

The one most likely to cause trouble: the OPENSTATUS_KEY environment variable is assumed to contain a valid API key that never expires and has sufficient permissions. If the key is invalid, expired, or lacks permissions, all monitor checks fail silently without alerting operators: the checker appears to run but produces no results.

How many hidden assumptions does openstatus have?

CodeSea found 13 assumptions openstatus relies on but never validates, 4 of them critical, spanning Environment, Temporal, Resource, Scale, Contract, Ordering, and Domain. Most are routine; the analysis flags the 3 most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.