prefecthq/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Orchestrates Python data pipeline execution with scheduling, retries, and distributed task management
Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers.
Under the hood, the system uses 5 feedback loops, 5 data pools, and 8 control points to manage its runtime behavior. The repository comprises 10 components across 3,350 analyzed files, and data flows through 7 distinct pipeline stages.
How Data Flows Through the System
- Flow definition and deployment — User defines flows with @flow decorator and creates deployments via CLI using Flow.deploy() method, storing flow code references and configuration in server database [Flow → Deployment]
- Schedule-based flow run creation — Scheduler service queries deployments with schedules, creates FlowRun instances using deployment parameters, and queues them in specified work queues [Deployment → FlowRun]
- Worker polling and run acquisition — Workers poll work queues via PrefectClient.get_runs_in_work_queue(), acquire FlowRun assignments, and prepare execution environment based on infrastructure configuration [WorkQueue → FlowRun]
- Flow execution and task orchestration — FlowRunner loads flow code, resolves parameters from blocks, executes @flow function which creates TaskRun instances for each @task call, managing parallel execution and dependencies [FlowRun → TaskRun]
- Task execution with caching and retries — TaskRunner executes individual tasks, checks cache via cache_key lookup, handles retries on failure, and persists results using ResultStore [TaskRun → Task results]
- State management and persistence — StateManager validates state transitions, updates FlowRun/TaskRun states in database via PrefectClient, and triggers event emission for state changes [State → Event]
- Event processing and automation — EventEmitter publishes events to automation engine, which evaluates trigger conditions and executes actions like creating new flow runs or sending notifications [Event → Automation actions]
Data Models
The data structures that flow between stages — the contracts that hold the system together.
- FlowRun (src/prefect/server/schemas/core.py) — Pydantic model with id: UUID, flow_id: UUID, state: State, parameters: dict, start_time: datetime, end_time: datetime, deployment_id: Optional[UUID]. Created when a flow is called, executed by the engine with state transitions, persisted to the database, and displayed in the UI.
- TaskRun (src/prefect/server/schemas/core.py) — Pydantic model with id: UUID, task_key: str, flow_run_id: UUID, state: State, parameters: dict, cache_key: Optional[str]. Created during flow execution when @task functions are called, executed by the task engine, potentially cached.
- State (src/prefect/client/schemas/objects.py) — Pydantic model with type: StateType (Pending/Running/Completed/Failed/Cancelled), message: str, data: Any, state_details: StateDetails. Represents current execution status, transitions through the run lifecycle, persisted for observability.
- Deployment (src/prefect/server/schemas/core.py) — Pydantic model with id: UUID, flow_id: UUID, name: str, work_queue_name: str, parameters: dict, schedule: Optional[Schedule], infrastructure: dict. Created via CLI/API, stored in the database, used by the scheduler to create flow runs, executed by workers.
- WorkQueue (src/prefect/server/schemas/core.py) — Pydantic model with id: UUID, name: str, filter: Optional[dict], concurrency_limit: Optional[int]. Created for organizing work, receives flow runs from the scheduler, polled by workers for execution.
- Block (src/prefect/blocks/core.py) — Pydantic model with _block_type_name: str, _block_document_name: str, plus dynamic fields for configuration storage. Created to store reusable configuration, referenced in flows/deployments, resolved at runtime.
- Event (src/prefect/events/schemas/events.py) — Pydantic model with event: str, resource: dict, payload: dict, occurred: datetime, id: UUID. Emitted during flow/task execution, stored in the database, processed by automations, displayed in the UI.
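To make the FlowRun/State contract concrete, here is a stdlib-only dataclass mirror of the fields listed above. It is an illustrative approximation, not Prefect's actual Pydantic schemas, and omits fields such as state_details and end_time for brevity:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Optional
from uuid import UUID, uuid4

class StateType(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"

@dataclass
class State:
    type: StateType
    message: str = ""
    data: Any = None

@dataclass
class FlowRun:
    flow_id: UUID
    id: UUID = field(default_factory=uuid4)
    state: State = field(default_factory=lambda: State(StateType.PENDING))
    parameters: dict = field(default_factory=dict)
    start_time: Optional[datetime] = None
    deployment_id: Optional[UUID] = None

# A run is born Pending, then transitions as the engine picks it up.
run = FlowRun(flow_id=uuid4(), parameters={"n": 10})
run.state = State(StateType.RUNNING)
run.start_time = datetime.now(timezone.utc)
print(run.state.type.value)   # Running
```

The real schemas use Pydantic for validation and serialization over the API; the dataclass version only shows which fields travel together.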
Hidden Assumptions
Things this code relies on but never validates. These are the things that cause silent failures when the system changes.
- ui/src/main.ts:start — assumes an HTML element with id 'app' exists in the DOM for Vue.js to mount to. If this fails: the Vue app fails to mount and the entire UI doesn't render (white screen of death).
- ui-v2/src/analytics/index.ts:initAmplitude — assumes the VITE_AMPLITUDE_API_KEY environment variable format is valid for the Amplitude SDK, without validation. If this fails: Amplitude initialization silently fails or crashes on an invalid API key format, and analytics data is lost.
- ui-v2/src/analytics/index.ts:trackWebAppLoaded — assumes sessionStorage is available and writable in the browser environment. If this fails: tracking fails in incognito mode or browsers with disabled storage, and duplicate events are sent.
- ui/src/router/index.ts:workspaceRoutes — assumes the createWorkspaceRouteRecords function accepts all provided route components and that they have compatible interfaces. If this fails: route registration fails at runtime and navigation breaks with cryptic errors.
- ui/src/router/index.ts:BASE_URL — assumes BASE_URL from the meta utilities is a valid URL path for web history routing. If this fails: the router creates invalid routes, and navigation fails or redirects to wrong paths.
- ui/src/main.ts:start — assumes plugin installation order (router, PrefectDesign, PrefectUILibrary) doesn't matter for component dependencies. If this fails: Vue components fail to resolve dependencies, causing runtime errors when plugin features are accessed.
- ui/src/maps/index.ts:maps — assumes all mapper functions in the designMaps object have signatures compatible with the spread operator. If this fails: the object spread fails on non-enumerable properties, leaving mapping functions unavailable at runtime.
- src/integrations/prefect-dbt/prefect_dbt/core/settings.py:PrefectDbtSettings — assumes DBT_ environment variables are properly formatted for Pydantic field types (Path, etc.). If this fails: Pydantic validation fails with cryptic type errors and the dbt integration breaks during initialization.
- src/integrations/prefect-dbt/prefect_dbt/core/settings.py:find_profiles_dir — assumes the current working directory and standard dbt profile locations are readable and accessible. If this fails: the profiles_dir lookup fails in containerized or restricted environments, and dbt commands fail to find configuration.
- ui-v2/src/analytics/index.ts:SESSION_STORAGE_KEY — assumes session storage persists for the lifetime of a single browser session without clearing. If this fails: web-app-loaded events are tracked multiple times per session when storage is cleared, inflating analytics data.
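The environment-variable assumptions above share a fix: validate at the boundary instead of letting a cryptic error surface deep inside initialization. A minimal Python sketch of that idea, with a hypothetical `read_path_env` helper (not a Prefect or prefect-dbt API):

```python
import os
from pathlib import Path

def read_path_env(name: str, default: Path) -> Path:
    """Read an environment variable expected to hold a filesystem path,
    failing loudly and early if the value is unusable (instead of the
    silent or cryptic failures described above)."""
    raw = os.environ.get(name)
    if raw is None:
        return default          # unset: fall back without checking the default
    path = Path(raw).expanduser()
    if not path.exists():
        raise ValueError(f"{name} points to a non-existent path: {path}")
    return path

# With the variable unset, the documented default is returned.
os.environ.pop("DBT_PROFILES_DIR", None)
print(read_path_env("DBT_PROFILES_DIR", Path.home() / ".dbt"))
```

The same pattern applies to the UI assumptions: a one-line existence check on `document.getElementById('app')` or a try/catch around sessionStorage access turns a silent failure into an actionable error message.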
System Behavior
How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.
Data Pools
- SQLite/PostgreSQL database — stores all persistent state including flow runs, task runs, deployments, work queues, and events
- Work queues — queues containing flow runs ready for execution, organized by priority and filtering rules
- Result storage — pluggable storage backends (filesystem, S3, GCS) for persisting flow and task results
- Block storage — in-memory and database storage of configuration blocks for reusable infrastructure and credentials
- Event queue — async message queue for events that trigger automations and provide observability
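The work-queue pool, with its concurrency_limit field, can be sketched as a toy in-memory queue. This is illustrative only; Prefect's actual work queues live in the server database and are polled over the API:

```python
from collections import deque

class WorkQueue:
    """Toy in-memory work queue with a concurrency limit (illustrative only)."""

    def __init__(self, concurrency_limit=None):
        self._pending = deque()       # flow runs waiting for a worker
        self._running = set()         # flow runs currently executing
        self.concurrency_limit = concurrency_limit

    def submit(self, flow_run_id):
        self._pending.append(flow_run_id)

    def poll(self):
        """Hand out the next run unless the concurrency limit is saturated."""
        if self.concurrency_limit is not None and len(self._running) >= self.concurrency_limit:
            return None
        if not self._pending:
            return None
        run = self._pending.popleft()
        self._running.add(run)
        return run

    def complete(self, flow_run_id):
        self._running.discard(flow_run_id)

q = WorkQueue(concurrency_limit=1)
q.submit("run-a")
q.submit("run-b")
print(q.poll())   # run-a
print(q.poll())   # None: the limit of 1 is saturated
q.complete("run-a")
print(q.poll())   # run-b
```

The key behavior is that `poll()` returns nothing while the limit is saturated, which is how a concurrency_limit throttles tagged runs without rejecting them.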
Feedback Loops
- Task retry loop (retry, balancing) — Trigger: task failure with retries configured. Action: TaskRunner waits an exponential backoff delay, then re-executes the task with the same parameters. Exit: task succeeds or retry limit reached.
- Worker heartbeat polling (polling, reinforcing) — Trigger: Worker starts and joins work pool. Action: Worker sends heartbeat and polls for new flow runs every polling_interval seconds. Exit: Worker shutdown or pool deletion.
- Scheduler deployment polling (polling, reinforcing) — Trigger: Scheduler service startup. Action: Queries deployments for due schedules and creates flow runs every scheduling_interval. Exit: Service shutdown.
- Automation trigger evaluation (auto-scale, reinforcing) — Trigger: Event matching automation trigger condition. Action: Creates new flow run or performs configured action. Exit: Trigger condition no longer met or automation disabled.
- Flow run state propagation (self-correction, balancing) — Trigger: Flow run state change. Action: Updates child task run states and parent flow context. Exit: All state updates completed.
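The task retry loop can be sketched in a few lines of generic Python, using the backoff formula `retry_delay_seconds * (2 ** attempt)` from the Delays section below. This is a sketch of the pattern, not Prefect's task engine:

```python
import time

def run_with_retries(task_fn, retries=3, retry_delay_seconds=0.01):
    """Re-execute a failing callable with exponential backoff.
    Generic sketch of the retry loop, not Prefect's implementation."""
    attempt = 0
    while True:
        try:
            return task_fn()          # exit condition: the task succeeds
        except Exception:
            if attempt >= retries:
                raise                 # exit condition: retry limit reached
            delay = retry_delay_seconds * (2 ** attempt)
            time.sleep(delay)         # back off before re-executing
            attempt += 1

calls = {"n": 0}

def flaky():
    """Fails twice, then succeeds, to exercise the loop."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(run_with_retries(flaky))   # ok, after two retries
```

Because the loop is "balancing", it converges to one of two fixed points: a successful result or the original exception re-raised once retries are exhausted.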
Delays
- Task retry backoff (rate-limit, ~exponential: retry_delay_seconds * (2 ** attempt)) — Prevents overwhelming failing resources while allowing recovery
- Worker polling interval (scheduled-job, ~PREFECT_WORKER_QUERY_SECONDS (default 10s)) — Controls frequency of work queue polling and resource usage
- Database connection pool (queue-drain, ~Connection acquisition timeout) — Blocks API requests when connection pool exhausted
- Event processing delay (async-processing, ~Event emission to automation trigger evaluation) — Automations react after event processing latency
- Flow run scheduling (scheduled-job, ~Schedule interval or cron expression) — Determines when scheduled flows execute
Control Points
- PREFECT_API_URL (env-var) — Controls: Which Prefect server the client connects to. Default: http://127.0.0.1:4200/api
- PREFECT_WORKER_QUERY_SECONDS (env-var) — Controls: How frequently workers poll for new flow runs. Default: 10
- task_runner (runtime-toggle) — Controls: Execution strategy for tasks (sequential, concurrent, distributed). Default: ConcurrentTaskRunner
- retries (hyperparameter) — Controls: How many times a failed task will retry execution. Default: 0
- cache_policy (feature-flag) — Controls: Whether task results are cached and cache key strategy. Default: None
- infrastructure (architecture-switch) — Controls: Where and how flow runs execute (local, Docker, Kubernetes). Default: Process
- log_level (env-var) — Controls: Verbosity of logging output. Default: INFO
- concurrency_limit (rate-limit) — Controls: Maximum concurrent executions for tagged runs
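The env-var control points above follow a common pattern: an environment variable overrides a documented default. A generic sketch of that lookup (the defaults mirror the list above; the reading logic is not Prefect's settings machinery):

```python
import os

# Documented defaults for two of the control points listed above.
DEFAULTS = {
    "PREFECT_API_URL": "http://127.0.0.1:4200/api",
    "PREFECT_WORKER_QUERY_SECONDS": "10",
}

def setting(name: str) -> str:
    """The environment variable wins; otherwise fall back to the default."""
    return os.environ.get(name, DEFAULTS[name])

# Workers would convert the polling interval to a number before use.
poll_interval = float(setting("PREFECT_WORKER_QUERY_SECONDS"))
print(setting("PREFECT_API_URL"))
print(poll_interval)
```

Prefect's real settings system layers profiles and .env files on top of this, but the precedence idea (explicit environment over shipped default) is the same.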
Technology Stack
- FastAPI — provides the REST API server for flow orchestration and web UI serving
- SQLAlchemy — ORM for database operations with async support for flow/task persistence
- Alembic — database migration system for evolving the Prefect server schema
- Pydantic — data validation and serialization for API schemas and configuration
- Vue.js — frontend framework for both the legacy UI and next-gen UI applications
- Uvicorn — ASGI server running the FastAPI application
- Docker — container infrastructure for flow execution environments
- AnyIO/asyncio — async execution framework for concurrent task processing
- httpx — HTTP client for server communication with async support
- cloudpickle — serialization of flow/task code for remote execution
- Rich — terminal formatting for CLI output and logging
- Typer — command-line interface framework
Key Components
- FlowRunner (orchestrator, src/prefect/flow_engine.py) — orchestrates flow execution including parameter resolution, task scheduling, and state management
- TaskRunner (executor, src/prefect/task_engine.py) — executes individual tasks with retry logic, caching, and result persistence
- PrefectClient (gateway, src/prefect/client/orchestration.py) — HTTP client for communicating with the Prefect server API; handles authentication and retries
- Scheduler (scheduler, src/prefect/server/services/scheduler.py) — creates flow runs from deployments based on schedules and triggers
- Worker (executor, src/prefect/workers/base.py) — polls work queues and executes flow runs in configured infrastructure environments
- DatabaseInterface (adapter, src/prefect/server/database/interface.py) — abstracts database operations across SQLite and PostgreSQL with connection pooling
- StateManager (processor, src/prefect/states.py) — manages state transitions and validation for flows and tasks
- ResultStore (store, src/prefect/results.py) — persists and retrieves flow/task results using pluggable storage backends
- BlockRegistry (registry, src/prefect/blocks/core.py) — manages registration and instantiation of configuration blocks with validation
- EventEmitter (adapter, src/prefect/events/client.py) — emits events during flow/task execution for automation triggers and observability
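The StateManager's core job, validating state transitions, can be sketched as a transition table plus a guard. The table below is illustrative; Prefect's real orchestration rules are richer (they also handle Scheduled, Crashed, and policy-driven transitions):

```python
# Illustrative transition table; Prefect's actual orchestration rules are richer.
ALLOWED = {
    "Pending":   {"Running", "Cancelled"},
    "Running":   {"Completed", "Failed", "Cancelled"},
    "Completed": set(),                 # terminal
    "Failed":    {"Running"},           # e.g. a retry re-enters Running
    "Cancelled": set(),                 # terminal
}

def validate_transition(current: str, proposed: str) -> str:
    """Accept the proposed state only if the transition is legal."""
    if proposed not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {proposed}")
    return proposed

state = "Pending"
state = validate_transition(state, "Running")
state = validate_transition(state, "Completed")
print(state)   # Completed
```

Centralizing the table means every component (engine, API, automations) proposes transitions but only one place decides whether they are legal, which is what keeps run states consistent across workers.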
Frequently Asked Questions
What is prefect used for?
Prefect orchestrates Python data pipeline execution with scheduling, retries, and distributed task management. prefecthq/prefect is a 10-component repository written in Python; data flows through 7 distinct pipeline stages, and the codebase contains 3,350 files.
How is prefect architected?
prefect is organized into 6 architecture layers: Client SDK, Server API, Database Layer, Engine Layer, and 2 more. Data flows through 7 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.
How does data flow through prefect?
Data moves through 7 stages: Flow definition and deployment → Schedule-based flow run creation → Worker polling and run acquisition → Flow execution and task orchestration → Task execution with caching and retries → …. Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine, which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers. This pipeline design reflects a complex multi-stage processing system.
What technologies does prefect use?
The core stack includes FastAPI (Provides REST API server for flow orchestration and web UI serving), SQLAlchemy (ORM for database operations with async support for flow/task persistence), Alembic (Database migration system for evolving Prefect server schema), Pydantic (Data validation and serialization for API schemas and configuration), Vue.js (Frontend framework for both legacy UI and next-gen UI applications), Uvicorn (ASGI server running the FastAPI application), and 6 more. This broad technology surface reflects a mature project with many integration points.
What system dynamics does prefect have?
prefect exhibits 5 data pools (SQLite/PostgreSQL Database, Work Queues), 5 feedback loops, 8 control points, 5 delays. The feedback loops handle retry and polling. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
What design patterns does prefect use?
6 design patterns detected: Decorator-based instrumentation, Async context propagation, Pluggable infrastructure adapters, Event-driven automation, State machine orchestration, and 1 more.
How does prefect compare to alternatives?
CodeSea has side-by-side architecture comparisons of prefect with dbt-core, celery, luigi. These comparisons show tech stack differences, pipeline design, system behavior, and code patterns. See the comparison pages above for detailed analysis.
Analyzed on April 19, 2026 by CodeSea. Written by Karolina Sarna.