elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
dbt-native data observability platform for monitoring data pipelines
Data flows from dbt warehouse metadata through monitoring APIs to generate observability reports and alerts
Under the hood, the system uses 2 feedback loops, 3 data pools, and 5 control points to manage its runtime behavior.
Structural Verdict
A 12-component data pipeline with 4 connections. 280 files analyzed. Loosely coupled — components are relatively independent.
How Data Flows Through the System
- dbt Metadata Collection — dbt runners execute operations to extract test results, model runs, and lineage from warehouse (config: dbt.project_dir, dbt.profiles_dir, dbt.target)
- Data Transformation — Raw dbt artifacts are parsed and transformed into structured schemas for monitoring APIs
- Report Generation — Monitoring APIs aggregate data into HTML reports with test results, model performance, and lineage (config: report.days_back, report.env, report.warehouse_type)
- Alert Processing — Failed tests and anomalies are formatted into structured messages for notification channels (config: alerts.subscribers, alerts.maximum_columns_in_alert_samples)
- Delivery — Reports uploaded to cloud storage (Azure/S3/GCS) and alerts sent to Slack/Teams (config: azure.connection_string, azure.container_name, slack.webhook)
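The five stages above form a linear pipeline. The sketch below illustrates that shape with stubbed data; all function names and return structures are illustrative assumptions, not Elementary's actual API:

```python
def collect_dbt_metadata(project_dir, profiles_dir, target):
    # Stage 1: dbt runners extract test results, model runs, and lineage
    # from the warehouse (stubbed with static data here).
    return {
        "test_results": [{"name": "not_null_orders_id", "status": "fail"}],
        "model_runs": [{"model": "orders", "runtime_s": 12.4}],
        "lineage": {"orders": ["raw_orders"]},
    }

def transform(raw):
    # Stage 2: parse raw dbt artifacts into structured records.
    return {"tests": raw["test_results"], "models": raw["model_runs"],
            "lineage": raw["lineage"]}

def build_report(data):
    # Stage 3: aggregate into an HTML report.
    return f"<html><body>{len(data['tests'])} tests, {len(data['models'])} models</body></html>"

def build_alerts(data):
    # Stage 4: format failed tests into alert messages.
    return [f"Test failed: {t['name']}" for t in data["tests"] if t["status"] == "fail"]

def deliver(report, alerts):
    # Stage 5: upload the report and send alerts (stubbed).
    return {"report_bytes": len(report), "alerts_sent": len(alerts)}

raw = collect_dbt_metadata("my_project", "~/.dbt", "prod")
data = transform(raw)
result = deliver(build_report(data), build_alerts(data))
```

Each stage consumes only the previous stage's output, which is what keeps the real pipeline loosely coupled.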
System Behavior
How the system actually operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- dbt Warehouse Metadata — dbt test results, model runs, and artifacts stored in data warehouse tables
- API Run Cache — cached dbt operation results to avoid redundant warehouse queries
- Report Storage — HTML reports stored in cloud blob storage for sharing
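A run cache like the one above can be approximated by keying operation results on the operation name and its arguments. This is a hedged sketch of the idea, not the repo's actual APIClient implementation:

```python
class RunCache:
    # Minimal sketch: cache dbt operation results to avoid redundant
    # warehouse queries. Class and method names are illustrative.
    def __init__(self):
        self._cache = {}
        self.hits = 0

    def run(self, operation, args, executor):
        # Build a hashable key from the operation and its arguments.
        key = (operation, tuple(sorted(args.items())))
        if key in self._cache:
            self.hits += 1
        else:
            self._cache[key] = executor(operation, args)  # hits the warehouse
        return self._cache[key]
```

Repeating the same operation with the same arguments then returns the cached result instead of re-querying the warehouse.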
Feedback Loops
- dbt Command Retry Loop (retry, balancing) — Trigger: Transient error patterns detected in dbt output. Action: Exponential backoff retry with per-adapter error pattern matching. Exit: Success or max retries (3) reached.
- Package Version Check (polling, balancing) — Trigger: CLI startup. Action: Compare current version with PyPI latest and recommend upgrade. Exit: Version check complete or disabled.
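The retry loop is the more interesting of the two: it only retries errors that match known transient patterns, backs off exponentially, and gives up after the configured maximum. A minimal sketch, assuming illustrative error patterns (the real implementation matches per-adapter patterns):

```python
import re
import time

# Illustrative transient-error patterns; Elementary's are per-adapter.
TRANSIENT_PATTERNS = [re.compile(p) for p in
                      (r"connection reset", r"timeout", r"too many requests")]

def run_with_retry(run_dbt_command, max_retries=3, base_delay=10):
    # run_dbt_command returns (success: bool, output: str).
    for attempt in range(max_retries + 1):
        ok, output = run_dbt_command()
        if ok:
            return output
        transient = any(p.search(output or "") for p in TRANSIENT_PATTERNS)
        if not transient or attempt == max_retries:
            raise RuntimeError(f"dbt command failed: {output}")
        # Exponential backoff: 10s, 20s, 40s..., capped at 60s.
        time.sleep(min(base_delay * 2 ** attempt, 60))
```

Non-transient failures raise immediately; only errors matching a pattern are retried, which prevents wasting the backoff budget on genuine misconfigurations.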
Delays & Async Processing
- dbt Command Execution (async-processing, ~varies) — Data collection blocked until warehouse operations complete
- Exponential Backoff (rate-limit, ~10-60 seconds) — Retry delays for transient failures
- Cloud Upload (async-processing, ~varies) — Report delivery delayed by network transfer time
Control Points
- Disable Logo Print (env-var) — Controls: Whether Elementary logo is displayed on CLI startup
- Disable Version Check (env-var) — Controls: Whether automatic version upgrade recommendations are shown
- dbt Runner Method (env-var) — Controls: Which dbt execution method to use (subprocess/API/fusion)
- Transient Retry Count (threshold) — Controls: Maximum retries for transient dbt failures. Default: 3
- Debug Mode (env-var) — Controls: Verbose logging and debug output
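These control points can be read once at startup into a single settings dict. The sketch below shows the pattern; the environment-variable names are placeholders I invented for illustration, not Elementary's actual variables:

```python
import os

def load_runtime_controls(env=os.environ):
    # Env-var names below are hypothetical placeholders.
    return {
        "print_logo": env.get("EL_DISABLE_LOGO", "false").lower() != "true",
        "version_check": env.get("EL_DISABLE_VERSION_CHECK", "false").lower() != "true",
        "dbt_runner_method": env.get("EL_DBT_RUNNER_METHOD", "subprocess"),  # subprocess/api/fusion
        "transient_retry_count": int(env.get("EL_TRANSIENT_RETRY_COUNT", "3")),
        "debug": env.get("EL_DEBUG", "false").lower() == "true",
    }
```

Centralizing the reads like this keeps the defaults (e.g. 3 transient retries) in one place instead of scattered `os.environ` lookups.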
Technology Stack
- dbt-core — Data transformation and testing framework integration
- Click — CLI framework for command-line interface
- Pydantic — Data validation and serialization
- Azure Blob Storage — Cloud storage for reports
- Slack SDK — Slack integration for alerts
- Tenacity — Retry logic with exponential backoff
- Graph operations for lineage processing
- HTML parsing and manipulation
- AWS S3 integration
- GCS integration
Key Components
- ElementaryCLI (cli-command) — Main CLI group that handles command routing, logging setup, and context management (elementary/cli/cli.py)
- DbtRunnerFactory (service) — Creates appropriate dbt runner instances (subprocess, API, or Fusion) based on configuration (elementary/clients/dbt/factory.py)
- CommandLineDbtRunner (class) — Abstract base for executing dbt commands with retry logic and transient error handling (elementary/clients/dbt/command_line_dbt_runner.py)
- APIClient (class) — Manages dbt runner interactions and caches operation results for the monitoring API (elementary/clients/api/api_client.py)
- AzureClient (service) — Handles uploading HTML reports to Azure Blob Storage with connection string authentication (elementary/clients/azure/client.py)
- SlackMessageBuilder (class) — Constructs Slack message payloads using Block Kit format for data observability alerts (elementary/clients/slack/slack_message_builder.py)
- ModelRunsSchema (type-def) — Pydantic models defining the structure of dbt model execution data and performance metrics (elementary/monitor/api/models/schema.py)
- TestResultsSchema (type-def) — Pydantic models for test execution results, including anomaly detection and sample data (elementary/monitor/api/tests/schema.py)
- LineageSchema (type-def) — Defines data lineage graph structure with nodes and edges for dependency visualization (elementary/monitor/api/lineage/schema.py)
- ReportDataSchema (type-def) — Main schema for the observability report containing aggregated test results, models, and metadata (elementary/monitor/api/report/schema.py)
- MessageBody (type-def) — Pydantic model for structured alert messages with blocks and color formatting (elementary/messages/message_body.py)
- Config (config) — Central configuration management for CLI options, dbt settings, and cloud provider credentials (elementary/config/config.py)
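The DbtRunnerFactory component embodies the Factory pattern: a registry maps a configured method name to a runner class. A minimal sketch under assumed names (the runner classes and `create` method here are illustrative, not the repo's actual signatures):

```python
# Stub runner classes standing in for the subprocess/API/Fusion runners.
class SubprocessDbtRunner:
    method = "subprocess"

class APIDbtRunner:
    method = "api"

class FusionDbtRunner:
    method = "fusion"

class DbtRunnerFactory:
    # Registry keyed by the configured dbt runner method.
    _runners = {
        "subprocess": SubprocessDbtRunner,
        "api": APIDbtRunner,
        "fusion": FusionDbtRunner,
    }

    @classmethod
    def create(cls, method="subprocess"):
        try:
            return cls._runners[method]()
        except KeyError:
            raise ValueError(f"Unknown dbt runner method: {method}") from None
```

Adding a new execution backend then means registering one more class, with no changes to callers.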
Configuration
elementary/clients/dbt/command_line_dbt_runner.py (python-dataclass)
- success (bool)
- output (Optional[str])
- stderr (Optional[str])
elementary/clients/dbt/dbt_log.py (python-dataclass)
- msg (Optional[str])
- level (Optional[str])
- exception (Optional[str])
elementary/clients/slack/schema.py (python-pydantic)
- text (Optional[str]) — default: None
- attachments (Optional[list]) — default: None
- blocks (Optional[list]) — default: None
elementary/clients/slack/slack_message_builder.py (python-pydantic)
- value (str)
- display_name (str)
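The dataclass-style entries above translate directly into Python. This sketch mirrors the fields listed for `command_line_dbt_runner.py`; the class name `DbtCommandResult` is my assumption, not necessarily the name used in the repo (the Pydantic entries follow the same shape with `BaseModel` instead):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class DbtCommandResult:
    # Fields mirror the configuration listing: success flag plus
    # optional captured stdout/stderr from the dbt invocation.
    success: bool
    output: Optional[str] = None
    stderr: Optional[str] = None

res = DbtCommandResult(success=False, stderr="Database Error: permission denied")
```

The optional fields default to `None`, so callers only populate the streams they actually captured.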
Frequently Asked Questions
What is elementary used for?
elementary-data/elementary is a dbt-native data observability platform for monitoring data pipelines. It is a 12-component data pipeline written primarily in Python; the analyzed codebase contains 280 files. Loosely coupled — components are relatively independent.
How is elementary architected?
elementary is organized into 5 architecture layers: CLI Layer, Client Layer, Monitor Engine, Operations Layer, and 1 more. Loosely coupled — components are relatively independent. This layered structure keeps concerns separated and modules independent.
How does data flow through elementary?
Data moves through 5 stages: dbt Metadata Collection → Data Transformation → Report Generation → Alert Processing → Delivery. Data flows from dbt warehouse metadata through monitoring APIs to generate observability reports and alerts. This pipeline design reflects a complex multi-stage processing system.
What technologies does elementary use?
The core stack includes dbt-core (Data transformation and testing framework integration), Click (CLI framework for command-line interface), Pydantic (Data validation and serialization), Azure Blob Storage (Cloud storage for reports), Slack SDK (Slack integration for alerts), Tenacity (Retry logic with exponential backoff), and 4 more. This broad technology surface reflects a mature project with many integration points.
What system dynamics does elementary have?
elementary exhibits 3 data pools (including dbt Warehouse Metadata and API Run Cache), 2 feedback loops, 5 control points, and 3 delays. The feedback loops handle retry and polling. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does elementary use?
5 design patterns detected: Pydantic Schema Pattern, Factory Pattern, Retry with Exponential Backoff, Plugin Architecture, Message Builder Pattern.
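Of these, the Message Builder pattern is easy to illustrate: alert content is accumulated as Slack Block Kit blocks through a fluent interface, then assembled into one payload. The builder class and method names below are assumptions for illustration, not SlackMessageBuilder's real API:

```python
class AlertMessageBuilder:
    # Illustrative Message Builder: accumulates Slack Block Kit blocks.
    def __init__(self):
        self._blocks = []

    def add_header(self, text):
        self._blocks.append(
            {"type": "header", "text": {"type": "plain_text", "text": text}})
        return self

    def add_section(self, markdown):
        self._blocks.append(
            {"type": "section", "text": {"type": "mrkdwn", "text": markdown}})
        return self

    def build(self):
        return {"blocks": self._blocks}

payload = (AlertMessageBuilder()
           .add_header("Elementary alert")
           .add_section("*Test failed:* not_null_orders_id")
           .build())
```

Returning `self` from each method lets calls chain, and `build()` is the only place the final payload shape is defined.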
Analyzed on March 31, 2026 by CodeSea. Written by Karolina Sarna.