cube-js/cube
📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics
Queries flow from client APIs through schema compilation and optimization, then to data sources via drivers, with results cached in columnar storage for performance
Under the hood, the system uses 2 feedback loops, 2 data pools, and 3 control points to manage its runtime behavior.
Structural Verdict
An 11-component dashboard with 1 connection. 2284 files analyzed. Minimal connections — components operate mostly in isolation.
How Data Flows Through the System
- API Request — Client sends query via REST, GraphQL, or SQL to api-gateway
- Schema Resolution — schema-compiler resolves semantic definitions into logical query plan
- Query Planning — cubesqlplanner optimizes logical plan and checks for cached results
- Driver Execution — Database-specific driver executes SQL against data source if cache miss
- Result Caching — cubestore stores results in columnar format for future queries
- Response Formatting — Results transformed to client-requested format and returned via API
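The six stages above can be sketched as a small in-process pipeline. This is only an illustration under stated assumptions: `QueryPipeline`, `SemanticQuery`, `Driver`, and `formatResponse` are hypothetical names, not Cube APIs, and an in-memory `Map` stands in for the columnar cache.

```typescript
// Hypothetical types and names for illustration; none of these are Cube APIs.
type Row = Record<string, string | number>;

interface SemanticQuery {
  measures: string[];
  dimensions: string[];
}

// Stand-in for a database-specific driver.
type Driver = (sql: string) => Row[];

class QueryPipeline {
  private cache = new Map<string, Row[]>(); // stand-in for the result cache
  private readonly driver: Driver;
  private readonly table: string;
  driverCalls = 0;

  constructor(driver: Driver, table: string) {
    this.driver = driver;
    this.table = table;
  }

  // Schema resolution and planning: turn the semantic query into SQL text.
  plan(query: SemanticQuery): string {
    const columns = query.dimensions.concat(query.measures).join(", ");
    const groupBy =
      query.dimensions.length > 0
        ? ` GROUP BY ${query.dimensions.join(", ")}`
        : "";
    return `SELECT ${columns} FROM ${this.table}${groupBy}`;
  }

  // Cache check, driver execution on a miss, then result caching.
  execute(query: SemanticQuery): Row[] {
    const sql = this.plan(query);
    const cached = this.cache.get(sql);
    if (cached !== undefined) {
      return cached;
    }
    this.driverCalls += 1;
    const rows = this.driver(sql);
    this.cache.set(sql, rows);
    return rows;
  }
}

// Response formatting: serialize rows for the client.
function formatResponse(rows: Row[]): string {
  return JSON.stringify({ data: rows });
}
```

Running the same query twice through `execute` should reach the driver only once; the second call is served from the cache.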
System Behavior
How the system actually operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- cubestore — Columnar cache storing pre-aggregated query results and materialized views
- Subscription Store — WebSocket subscription registry tracking active real-time query subscriptions
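A subscription registry of the kind described above can be pictured as a map from connection to the set of queries it follows. The sketch below is hypothetical: the class name, method names, and string keys are assumptions made for illustration, not taken from the Cube codebase.

```typescript
// Hypothetical in-memory registry; not the Cube implementation.
class SubscriptionStore {
  private byConnection = new Map<string, Set<string>>();

  // Record that a connection wants updates for a query.
  subscribe(connectionId: string, queryKey: string): void {
    let keys = this.byConnection.get(connectionId);
    if (keys === undefined) {
      keys = new Set<string>();
      this.byConnection.set(connectionId, keys);
    }
    keys.add(queryKey);
  }

  // Drop every subscription held by a closed connection.
  disconnect(connectionId: string): void {
    this.byConnection.delete(connectionId);
  }

  // List the connections to notify when a query's result changes.
  subscribersOf(queryKey: string): string[] {
    const result: string[] = [];
    this.byConnection.forEach((keys, connectionId) => {
      if (keys.has(queryKey)) {
        result.push(connectionId);
      }
    });
    return result.sort();
  }

  activeConnections(): number {
    return this.byConnection.size;
  }
}
```

Keying by connection makes cleanup on disconnect a single delete, at the cost of a scan when looking up the subscribers of one query.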
Feedback Loops
- Query Cache Optimization (cache-invalidation, balancing) — Trigger: Cache hit/miss statistics. Action: Adjusts pre-aggregation strategies based on query patterns. Exit: Optimal cache hit ratio achieved.
- Schema Refresh Loop (polling, balancing) — Trigger: Configured refresh intervals. Action: Recompiles schema definitions and invalidates dependent caches. Exit: Manual stop or error condition.
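The Schema Refresh Loop can be modeled as a poll that does nothing until the configured interval has elapsed, then recompiles and clears dependent caches. The following is a minimal sketch under assumptions: `SchemaRefresher` and its methods are hypothetical, a version counter stands in for recompilation, and time is passed in explicitly so the behavior is deterministic.

```typescript
// Hypothetical sketch of a polling refresh loop; not Cube's implementation.
class SchemaRefresher {
  private readonly intervalMs: number;
  private lastRefreshMs: number;
  private schemaVersion = 0;
  private cache = new Map<string, string>();

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastRefreshMs = startMs;
  }

  put(key: string, value: string): void {
    this.cache.set(key, value);
  }

  cacheSize(): number {
    return this.cache.size;
  }

  version(): number {
    return this.schemaVersion;
  }

  // Called on every poll. Refreshes only when the interval has elapsed.
  tick(nowMs: number): boolean {
    if (nowMs - this.lastRefreshMs < this.intervalMs) {
      return false;
    }
    this.schemaVersion += 1; // stand-in for recompiling schema definitions
    this.cache.clear(); // invalidate caches that depend on the old schema
    this.lastRefreshMs = nowMs;
    return true;
  }
}
```

In a real process `tick` would be driven by a timer; the loop has no natural exit, which matches the "manual stop or error condition" exit listed above.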
Delays & Async Processing
- Pre-aggregation Build (batch-window, ~configurable intervals) — Queries may wait for pre-aggregation completion before returning cached results
- Schema Compilation (async-processing) — First query after schema change requires compilation before execution
- WebSocket Message Queue (queue-drain) — Real-time updates may be buffered during high-throughput periods
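The queue-drain delay can be illustrated with a buffer that accepts messages immediately but releases them in bounded batches. This is a sketch only: `DrainQueue` and the fixed batch size are assumptions for illustration and say nothing about how Cube actually buffers WebSocket messages.

```typescript
// Hypothetical bounded-batch buffer; not Cube's implementation.
class DrainQueue<T> {
  private buffer: T[] = [];
  private readonly batchSize: number;

  constructor(batchSize: number) {
    this.batchSize = batchSize;
  }

  // Producers enqueue without waiting.
  push(message: T): void {
    this.buffer.push(message);
  }

  pending(): number {
    return this.buffer.length;
  }

  // Removes and returns up to batchSize messages, oldest first.
  drain(): T[] {
    return this.buffer.splice(0, this.batchSize);
  }
}
```

When producers outpace `drain`, `pending()` grows, which is the buffering during high-throughput periods that the list describes.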
Control Points
- Cache TTL Configuration (env-var) — Controls: How long query results remain cached before refresh
- Pre-aggregation Strategy (runtime-toggle) — Controls: Whether to use pre-aggregations vs live queries for performance
- Driver Connection Pool Size (env-var) — Controls: Maximum concurrent connections to data sources
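The three control points above might be read from the environment along these lines. The variable names (`EXAMPLE_CACHE_TTL_SECONDS`, `EXAMPLE_USE_PRE_AGGREGATIONS`, `EXAMPLE_DRIVER_POOL_SIZE`) and the defaults are placeholders invented for this sketch; they are not Cube's actual environment variables, which should be taken from the Cube documentation.

```typescript
// Hypothetical configuration loader; variable names and defaults are made up.
interface RuntimeConfig {
  cacheTtlSeconds: number;
  usePreAggregations: boolean;
  driverPoolSize: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back when the value is missing or invalid.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined) {
    return fallback;
  }
  const parsed = Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): RuntimeConfig {
  return {
    cacheTtlSeconds: readPositiveInt(env, "EXAMPLE_CACHE_TTL_SECONDS", 60),
    // Pre-aggregations stay on unless explicitly disabled.
    usePreAggregations: env["EXAMPLE_USE_PRE_AGGREGATIONS"] !== "false",
    driverPoolSize: readPositiveInt(env, "EXAMPLE_DRIVER_POOL_SIZE", 8),
  };
}
```

Under Node.js, `loadConfig(process.env)` would supply the real environment; taking the environment as a parameter keeps the function easy to test.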
Package Structure
This monorepo contains 10 packages:
- SQL query engine and planner written in Rust for processing semantic queries
- High-performance columnar storage engine and cache layer built in Rust
- Core TypeScript server orchestrating query processing and schema compilation
- HTTP and WebSocket gateway providing REST, GraphQL, and SQL APIs
- Compiles semantic schema definitions into executable queries
- Manages query execution, caching, and pre-aggregation strategies
- 20+ database driver packages for BigQuery, Snowflake, Postgres, etc.
- Frontend SDKs for React, Vue, Angular and vanilla JavaScript
- Command-line interface for creating and managing Cube projects
- Web-based development environment for building and testing queries
Technology Stack
- TypeScript — Primary language for server logic and orchestration
- Rust — High-performance SQL engine and columnar storage
- Lerna — Monorepo management and package publishing
- Jest — JavaScript testing framework
- Neon — Rust-Node.js native bindings
- FlatBuffers — Cross-language data serialization
Key Components
- CubeCloudClient (class) — Handles deployment and communication with Cube Cloud infrastructure
  packages/cubejs-backend-cloud/src/cloud.ts
- QueryResult (class) — Parses and transforms query results from the Rust engine to JavaScript
  rust/cubeorchestrator/src/query_message_parser.rs
- CubeError (type-def) — Unified error handling across Rust-JavaScript bridge
  rust/cubenativeutils/src/lib.rs
- HLLDataSketch (class) — HyperLogLog implementation for approximate distinct count aggregations
  rust/cubestore/cubedatasketches/src/lib.rs
- configuration (module) — HTTP client configuration for communicating with Cube Core APIs
  rust/cubesql/cubeclient/src/apis/mod.rs
- generateXml (function) — Generates Maven POM files for Java driver dependency resolution
  packages/cubejs-backend-maven/src/maven.ts
- gateway (module) — Main API gateway exposing REST, GraphQL, and SQL endpoints
  packages/cubejs-api-gateway/src/index.ts
- SubscriptionServer (class) — Manages WebSocket connections for real-time query subscriptions
  packages/cubejs-api-gateway/src/ws/index.ts
- AthenaDriver (class) — Database driver implementation for Amazon Athena query engine
  packages/cubejs-athena-driver/src/index.ts
- cube_bridge (module) — Rust-JavaScript bridge for schema compilation and query planning
  rust/cubesqlplanner/cubesqlplanner/src/cube_bridge/mod.rs
- planner (module) — SQL query planning and optimization engine
  rust/cubesqlplanner/cubesqlplanner/src/lib.rs
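AthenaDriver is one of several database drivers that sit behind a common interface. As a rough illustration of that driver pattern, the sketch below defines a hypothetical `DataSourceDriver` interface, a toy `InMemoryDriver`, and a `DriverRegistry`. None of these names or method signatures come from the Cube codebase; the real driver base class may look quite different.

```typescript
type ResultRow = Record<string, unknown>;

// Hypothetical common interface; Cube's real driver contract may differ.
interface DataSourceDriver {
  readonly name: string;
  quoteIdentifier(identifier: string): string;
  query(sql: string): ResultRow[];
}

// Toy driver that ignores the SQL text and returns fixed rows.
class InMemoryDriver implements DataSourceDriver {
  readonly name = "in-memory";
  private readonly rows: ResultRow[];

  constructor(rows: ResultRow[]) {
    this.rows = rows;
  }

  quoteIdentifier(identifier: string): string {
    return `"${identifier}"`;
  }

  query(_sql: string): ResultRow[] {
    return this.rows;
  }
}

// Callers look drivers up by name and never depend on a concrete class.
class DriverRegistry {
  private drivers = new Map<string, DataSourceDriver>();

  register(driver: DataSourceDriver): void {
    this.drivers.set(driver.name, driver);
  }

  get(name: string): DataSourceDriver {
    const driver = this.drivers.get(name);
    if (driver === undefined) {
      throw new Error(`No driver registered for "${name}"`);
    }
    return driver;
  }
}
```

The point of the pattern is that supporting another data source means adding one class that satisfies the interface, with no change to the code that issues queries.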
Sub-Modules
- High-performance SQL processing and columnar storage engine
- API orchestration, schema compilation, and query coordination
- Unified interface to 20+ different data sources and databases
- Frontend libraries for React, Vue, Angular, and vanilla JavaScript
- CLI, playground, testing frameworks, and project templates
Configuration
codecov.yml (yaml)
- coverage.round (string, unknown) — default: down
- coverage.range (string, unknown) — default: 70...100
- coverage.status.project.default.informational (boolean, unknown) — default: true
- coverage.status.project.default.target (string, unknown) — default: auto
- coverage.status.project.default.threshold (string, unknown) — default: 2%
- coverage.status.patch.default.informational (boolean, unknown) — default: true
- github_checks.annotations (boolean, unknown) — default: false
lerna.json (json)
- version (string, unknown) — default: 1.6.29
- npmClient (string, unknown) — default: yarn
- command.bootstrap.npmClient (string, unknown) — default: yarn
- command.bootstrap.npmClientArgs (array, unknown) — default: --frozen-lockfile
- command.version.allowBranch (array, unknown) — default: master,lts/*
- $schema (string, unknown) — default: node_modules/lerna/schemas/lerna-schema.json
Frequently Asked Questions
What is cube used for?
cube-js/cube is an open-source semantic layer for AI, BI and embedded analytics. It is an 11-component dashboard written in TypeScript and Rust. Minimal connections — components operate mostly in isolation. The codebase contains 2284 files.
How is cube architected?
cube is organized into 5 architecture layers: Rust Engine Layer, TypeScript Orchestration Layer, Driver Abstraction Layer, API Gateway Layer, and 1 more. Minimal connections — components operate mostly in isolation. This layered structure keeps concerns separated and modules independent.
How does data flow through cube?
Data moves through 6 stages: API Request → Schema Resolution → Query Planning → Driver Execution → Result Caching → .... Queries flow from client APIs through schema compilation and optimization, then to data sources via drivers, with results cached in columnar storage for performance. This pipeline design reflects a complex multi-stage processing system.
What technologies does cube use?
The core stack includes TypeScript (Primary language for server logic and orchestration), Rust (High-performance SQL engine and columnar storage), Lerna (Monorepo management and package publishing), Jest (JavaScript testing framework), Neon (Rust-Node.js native bindings), and FlatBuffers (Cross-language data serialization). This focused set of dependencies keeps the build manageable.
What system dynamics does cube have?
cube exhibits 2 data pools (cubestore, Subscription Store), 2 feedback loops, 3 control points, and 3 delays. The feedback loops handle cache-invalidation and polling. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does cube use?
5 design patterns detected: Driver Pattern, Rust-JavaScript Bridge, Multi-Protocol API Gateway, Semantic Schema Compilation, Caching and Pre-aggregation.
Analyzed on March 31, 2026 by CodeSea. Written by Karolina Sarna.