Hidden Assumptions in superset

13 assumptions this code never checks · 3 critical · spanning Environment, Contract, Domain, Resource, Shape, Ordering, Scale, Temporal

Every codebase relies on things it never checks, and most of them are routine. CodeSea looked at apache/superset and picked out the 3 most likely to cause trouble here. The rest are minor; they're under "Show everything".

Worth your attention first

If npm returns an unexpected version format or is aliased to a different tool, semver.compare() will crash with a parsing error instead of failing with a clear message
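One way to guard against this is to validate the npm output before comparing it. The sketch below is an assumption, not the CLI's actual code: parse_npm_version and check_npm_version are hypothetical helpers, and the minimum version is taken from the 10.8.2 figure cited further down this page.

```python
import re
from typing import Optional, Tuple

# Assumed minimum, based on the 10.8.2 figure cited in this report.
MIN_NPM_VERSION = (10, 8, 2)

# Accepts "10.8.2" or "v10.8.2"; anything else is treated as unparseable.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_npm_version(raw: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch), or None if the output is not a version."""
    match = _VERSION_RE.match(raw.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def check_npm_version(raw: str) -> str:
    """Return "ok" or a human-readable reason the version is unusable."""
    version = parse_npm_version(raw)
    if version is None:
        return f"Could not parse npm version from output: {raw!r}"
    if version < MIN_NPM_VERSION:
        wanted = ".".join(str(n) for n in MIN_NPM_VERSION)
        return f"npm {'.'.join(str(n) for n in version)} is older than {wanted}"
    return "ok"
```

Because the parse step returns None rather than raising, the caller can print a clear message and exit instead of surfacing a parsing traceback.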

Extension code that calls DatasetDAO.find_all() will raise AttributeError at runtime if the host fails to inject implementations, breaking every extension that depends on data access
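A check at extension load time could surface a missing injection before any extension code runs. This is only a sketch under assumptions: missing_implementations is a hypothetical helper, and the two classes are stand-ins, not the real DatasetDAO.

```python
from typing import Iterable, List


def missing_implementations(target: object, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that `target` does not provide as callables."""
    return [name for name in required if not callable(getattr(target, name, None))]


class DatasetDAOStub:
    """Stand-in for a class the host has not injected anything into."""


class InjectedDatasetDAO:
    """Stand-in for the same class after the host has injected find_all."""

    @classmethod
    def find_all(cls) -> list:
        return []
```

A loader could call missing_implementations(DatasetDAOStub, ["find_all"]) and refuse to start extensions when the returned list is not empty, naming the missing methods in the error.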

get_session() returns None if the host has not initialized the database session, so database operations fail silently or with confusing errors
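A small fail-fast wrapper turns a silent None into an explicit error at the point of use. The names below (require, HostNotInitializedError, and the stand-in get_session) are hypothetical and only illustrate the idea.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


class HostNotInitializedError(RuntimeError):
    """Raised when a host-provided resource is requested before it exists."""


def require(value: Optional[T], what: str) -> T:
    """Return `value`, or raise a clear error if the host returned None."""
    if value is None:
        raise HostNotInitializedError(
            f"{what} is not available: the host application has not initialized it"
        )
    return value


def get_session() -> Optional[object]:
    """Hypothetical stand-in that mimics an uninitialized host."""
    return None
```

With this in place, require(get_session(), "database session") raises immediately with a message that names the missing resource, rather than failing later with an attribute error on None.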

Show everything (10 more)
Domain

Assumes extensions use semantic versioning with exactly 3 numeric components (major.minor.patch) but many real projects use 4-component versions like '1.2.3.4' or pre-release identifiers like '1.0.0-beta'

If this fails: An extension with version '1.0.0-alpha' or '2.1.0.1' fails validation with a confusing regex mismatch error instead of a helpful message about the expected version format

superset-core/src/superset_core/extensions/constants.py:VERSION_PATTERN
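A validator can keep the strict rule and still explain why a version was rejected. The patterns below are assumptions written for illustration; the actual VERSION_PATTERN is not shown in this report.

```python
import re

# Assumed strict rule: exactly three numeric components.
STRICT_VERSION = re.compile(r"\d+\.\d+\.\d+")
# Used only to diagnose a rejected pre-release version such as 1.0.0-beta.
PRERELEASE = re.compile(r"\d+\.\d+\.\d+-[0-9A-Za-z.-]+")


def explain_version(version: str) -> str:
    """Return "ok" or a message that says what is wrong with `version`."""
    if STRICT_VERSION.fullmatch(version):
        return "ok"
    if PRERELEASE.fullmatch(version):
        return "pre-release identifiers are not supported; use major.minor.patch"
    parts = version.split(".")
    if len(parts) != 3 and all(part.isdigit() for part in parts):
        return f"expected 3 numeric components, got {len(parts)}"
    return "expected major.minor.patch, for example 1.2.3"
```

The diagnosis runs only after the strict match fails, so accepted versions behave exactly as before.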
Resource

Assumes npm install and build commands complete within reasonable time limits, but runs subprocess.run() with no timeout parameter

If this fails: The build can hang indefinitely if the npm registry is slow or a build script loops forever, blocking the CLI with no way to recover except killing the process

superset-extensions-cli/src/superset_extensions_cli/cli.py:build_frontend
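subprocess.run() accepts a timeout argument and raises subprocess.TimeoutExpired when it elapses. A minimal sketch, where run_with_timeout is a hypothetical wrapper and the right timeout value is left to the caller:

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[int]:
    """Run `cmd`; return its exit code, or None if it exceeded `timeout_s`."""
    try:
        # On timeout, subprocess.run kills the child it started and re-raises.
        result = subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        print(
            f"Command timed out after {timeout_s}s: {' '.join(cmd)}",
            file=sys.stderr,
        )
        return None
    return result.returncode
```

Note that only the direct child is killed on timeout. A tool such as npm may start further processes of its own, which this sketch does not clean up.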
Shape

Assumes extension.json contains valid JSON that matches ExtensionConfig schema, but only validates the schema without checking if JSON parsing succeeded

If this fails: Malformed JSON in extension.json raises JSONDecodeError during the file read, before Pydantic validation runs, so the user sees an unhelpful error about file format rather than the specific JSON syntax problem

superset-extensions-cli/src/superset_extensions_cli/cli.py:_create_manifest
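Catching json.JSONDecodeError separately lets the error report the exact position of the syntax problem before schema validation is attempted. In this sketch, parse_extension_json is a hypothetical helper; schema validation would follow on the returned dict.

```python
import json
from typing import Any, Dict


def parse_extension_json(text: str, source: str = "extension.json") -> Dict[str, Any]:
    """Parse `text` as a JSON object, reporting syntax errors with their position."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"{source}: invalid JSON at line {exc.lineno}, "
            f"column {exc.colno}: {exc.msg}"
        ) from exc
    if not isinstance(data, dict):
        raise ValueError(f"{source}: expected a JSON object at the top level")
    return data
```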
Ordering

Assumes build steps execute in fixed order (frontend first, then backend, then manifest) but provides no rollback mechanism if later steps fail

If this fails: If manifest creation fails after a successful frontend build, the extension is left in an inconsistent state, with built assets but no manifest, and needs manual cleanup or a full rebuild

superset-extensions-cli/src/superset_extensions_cli/cli.py:build_command
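One common way to get rollback without tracking each step is to build into a staging directory and swap it into place only when every step has succeeded. The sketch below assumes each build step can be expressed as a function that writes into a given directory; build_atomically is a hypothetical name.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Sequence


def build_atomically(dist: Path, steps: Sequence[Callable[[Path], None]]) -> None:
    """Run `steps` against a staging directory, then replace `dist` with it."""
    # Stage next to dist so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix="build-", dir=dist.parent))
    try:
        for step in steps:
            step(staging)
    except Exception:
        # Any failure discards the partial output and leaves dist untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    if dist.exists():
        shutil.rmtree(dist)
    staging.replace(dist)
```

If a later step such as manifest creation raises, the previous dist directory is still intact and no half-built output is left behind.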
Environment

Assumes file system events fire in deterministic order and that file writes are atomic, but watchdog can deliver events out of order or for partial writes

If this fails: Rapid file changes during development can trigger multiple concurrent builds or attempt to read partially written files, leading to build failures or corrupted output

superset-extensions-cli/src/superset_extensions_cli/cli.py:WatchHandler
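Debouncing is a common way to collapse a burst of file events into a single rebuild. The sketch below uses only the standard library; Debouncer is a hypothetical class, and a watch handler would call trigger() from its event callback.

```python
import threading
from typing import Callable, Optional


class Debouncer:
    """Run `action` once, `delay_s` seconds after the most recent trigger."""

    def __init__(self, delay_s: float, action: Callable[[], None]) -> None:
        self._delay_s = delay_s
        self._action = action
        self._lock = threading.Lock()
        self._timer: Optional[threading.Timer] = None

    def trigger(self) -> None:
        with self._lock:
            # Cancel the pending run so only the last event in a burst counts.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay_s, self._action)
            self._timer.daemon = True
            self._timer.start()
```

This does not stop a new rebuild from starting while a long one is still running; a real handler would likely also need a lock or a "build in progress" flag around the action.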
Scale

Assumes extension files fit comfortably in memory when creating zip archive, with no size limits or streaming for large extensions

If this fails: Extensions with large assets (videos, datasets, ML models) cause MemoryError during zip creation, with no indication of size limits or alternative approaches

superset-extensions-cli/src/superset_extensions_cli/cli.py:create_zip_bundle
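A size check before archiving makes the limit explicit, and zipfile.ZipFile.write() copies each file from disk rather than requiring the caller to load it into memory first. The limit and the create_bundle name below are assumptions for illustration only.

```python
import zipfile
from pathlib import Path

# Hypothetical limit; the report does not state one.
MAX_BUNDLE_BYTES = 100 * 1024 * 1024


def create_bundle(src: Path, out: Path, max_bytes: int = MAX_BUNDLE_BYTES) -> None:
    """Zip every file under `src` into `out`, refusing oversized inputs."""
    files = sorted(p for p in src.rglob("*") if p.is_file())
    total = sum(p.stat().st_size for p in files)
    if total > max_bytes:
        raise ValueError(
            f"Extension is {total} bytes, which exceeds the {max_bytes} byte limit"
        )
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # write() reads the file from disk itself; nothing is preloaded here.
            archive.write(path, path.relative_to(src).as_posix())
```

The output path should live outside src, otherwise the archive being written would be picked up as one of its own inputs on a later run.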
Domain

Assumes technical names follow DNS-like naming conventions (lowercase, hyphens) but many developers expect underscore_separated or camelCase naming from other ecosystems

If this fails: Valid Python package names like 'my_extension' or JavaScript conventions like 'myExtension' are rejected, forcing developers to rename projects and potentially break existing imports

superset-core/src/superset_core/extensions/constants.py:TECHNICAL_NAME_PATTERN
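The rule can stay strict while the error message suggests a conforming name. The pattern below is an assumed approximation of a lowercase-and-hyphens rule, not the real TECHNICAL_NAME_PATTERN, and both function names are hypothetical.

```python
import re

# Assumed rule: lowercase letters and digits, separated by single hyphens.
TECHNICAL_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def suggest_technical_name(name: str) -> str:
    """Convert snake_case or camelCase into a hyphenated lowercase name."""
    # Insert a hyphen at each lowercase-to-uppercase boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Replace every other run of non-alphanumerics with a single hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", spaced).strip("-").lower()


def validate_technical_name(name: str) -> str:
    """Return "ok" or an error message that includes a suggested fix."""
    if TECHNICAL_NAME.fullmatch(name):
        return "ok"
    return (
        f"'{name}' is not a valid technical name; use lowercase letters, "
        f"digits and hyphens, for example '{suggest_technical_name(name)}'"
    )
```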
Temporal

Assumes file system events represent completed file operations, but editors and build tools often write files incrementally or use temporary files

If this fails: Watch mode triggers rebuilds on temporary files or partial saves, wasting CPU and potentially causing build errors when reading incomplete files mid-write

superset-extensions-cli/src/superset_extensions_cli/cli.py:watch_command
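Filtering out editor temporary files before scheduling a rebuild removes one source of spurious triggers. The suffix and prefix lists below are assumptions covering a few common editors, not an exhaustive or authoritative set, and should_rebuild is a hypothetical helper.

```python
from pathlib import PurePath

# Assumed patterns: Vim swap files, generic temp files, backup and partial files.
IGNORED_SUFFIXES = (".swp", ".swx", ".tmp", ".part", "~")
# Assumed pattern: Emacs lock files.
IGNORED_PREFIXES = (".#",)


def should_rebuild(path: str) -> bool:
    """Return False for paths that look like editor temporary files."""
    name = PurePath(path).name
    if name.startswith(IGNORED_PREFIXES):
        return False
    if name.endswith(IGNORED_SUFFIXES):
        return False
    return True
```

This only addresses temporary files. Partial writes to real source files still need a delay or debounce before the build reads them.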
Contract

Assumes MCP server package is available for import but only catches ImportError in try/except block, providing no fallback or clear error message about missing dependency

If this fails: Extensions using the @tool decorator fail to load with ImportError if the MCP server is not installed, and the error message doesn't explain that MCP is an optional dependency

superset-core/src/superset_core/mcp/decorators.py:tool
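An optional import can re-raise with a message that names the missing dependency and the feature that needs it. require_optional is a hypothetical helper, and the module name and install hint passed to it are up to the caller; none of them come from the Superset codebase.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, feature: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or raise ImportError with guidance."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional dependency '{module_name}', "
            f"which is not installed. {install_hint}"
        ) from exc
```

The original exception is chained with "from exc", so the underlying import failure stays visible in the traceback.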
Environment

Assumes npm version 10.8.2+ has consistent CLI interface and behavior across platforms, but npm commands can vary between Windows, macOS, and Linux

If this fails: Build scripts that work on developer's macOS machine may fail on Linux CI/CD with different npm behavior, especially around path handling and permissions

superset-extensions-cli/src/superset_extensions_cli/cli.py:MIN_NPM_VERSION
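One platform difference that can be handled up front is how the executable is located. shutil.which() searches PATH and, on Windows, honors PATHEXT, so it can find wrappers such as npm.cmd. The helper name below is hypothetical, and this sketch does not address other cross-platform differences.

```python
import shutil


def resolve_executable(name: str) -> str:
    """Return the full path of `name` on PATH, or raise a clear error."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"'{name}' was not found on PATH")
    return path
```

Passing the resolved path to subprocess.run() avoids relying on shell lookup rules that differ between platforms.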


Frequently Asked Questions

What does superset assume that could break in production?

The one most likely to cause trouble: the code assumes the npm command is available in PATH and returns its version in the format 'v{major}.{minor}.{patch}' when called with --version, but it never validates the output format before parsing. If npm returns an unexpected version format or is aliased to a different tool, semver.compare() will crash with a parsing error instead of failing with a clear message.

How many hidden assumptions does superset have?

CodeSea found 13 assumptions superset relies on but never validates, 3 of them critical, spanning Environment, Contract, Domain, Resource, Shape, Ordering, Scale, Temporal. Most are routine; the analysis flags the 3 most likely to actually bite.

What is a hidden assumption?

Something the code depends on but never checks: a data shape, an ordering, an environment condition, a scale limit, or a contract with another service. It holds until the world it runs in changes, then fails silently.