How Prefect Works

Prefect positions itself as the anti-Airflow: no DAGs, no operators, just Python. Under the hood, though, it solves the same fundamental problem — making dependencies between tasks visible and their execution reliable.

22,206 GitHub stars · Python · 10 components · 7-stage pipeline

What prefect Does

Orchestrates Python data pipeline execution with scheduling, retries, and distributed task management

Prefect is a workflow orchestration framework that transforms Python functions into resilient, observable data pipelines. It provides a client-server architecture where flow runs are scheduled and monitored by a FastAPI server, executed by workers, and tracked through a comprehensive state management system with automatic retries, caching, and event-driven automations.

Architecture Overview

prefect is organized into 6 layers comprising 10 components.

Client SDK: Python decorators (@flow, @task) that instrument user code and communicate with the server via a REST API
Server API: FastAPI application handling flow runs, deployments, and work queues, and serving the web UI
Database Layer: SQLAlchemy models with Alembic migrations storing flow runs, task runs, deployments, and events
Engine Layer: flow and task execution engines that handle state transitions, retries, and result persistence
Worker Infrastructure: pluggable workers that poll work queues and execute flows in various environments (processes, Docker, Kubernetes)
Web UI: Vue.js applications for monitoring flows, managing deployments, and configuring automations

How Data Flows Through prefect

Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers.

1. Flow definition and deployment

User defines flows with @flow decorator and creates deployments via CLI using Flow.deploy() method, storing flow code references and configuration in server database

2. Schedule-based flow run creation

Scheduler service queries deployments with schedules, creates FlowRun instances using deployment parameters, and queues them in specified work queues

3. Worker polling and run acquisition

Workers poll work queues via PrefectClient.get_runs_in_work_queue(), acquire FlowRun assignments, and prepare execution environment based on infrastructure configuration

4. Flow execution and task orchestration

FlowRunner loads flow code, resolves parameters from blocks, executes @flow function which creates TaskRun instances for each @task call, managing parallel execution and dependencies

5. Task execution with caching and retries

TaskRunner executes individual tasks, checks cache via cache_key lookup, handles retries on failure, and persists results using ResultStore

6. State management and persistence

StateManager validates state transitions, updates FlowRun/TaskRun states in database via PrefectClient, and triggers event emission for state changes

7. Event processing and automation

EventEmitter publishes events to automation engine, which evaluates trigger conditions and executes actions like creating new flow runs or sending notifications

System Dynamics

Beyond the pipeline, prefect has runtime behaviors that shape how it responds to load, failures, and configuration changes.

Data Pools

Pool: SQLite/PostgreSQL Database (type: database)
Stores all persistent state including flow runs, task runs, deployments, work queues, and events

Pool: Work Queues (type: queue)
Queues containing flow runs ready for execution, organized by priority and filtering rules

Pool: Result Storage (type: file-store)
Pluggable storage backends (filesystem, S3, GCS) for persisting flow and task results

Pool: Block Registry (type: registry)
In-memory and database storage of configuration blocks for reusable infrastructure and credentials

Pool: Event Stream (type: queue)
Async message queue for events that trigger automations and provide observability

Feedback Loops

Loop: Task retry loop (type: retry)
Trigger: task failure with retries configured → TaskRunner waits an exponential backoff delay, then re-executes the task with the same parameters (exits when the task succeeds or the retry limit is reached)

Loop: Worker heartbeat polling (type: polling)
Trigger: worker starts and joins a work pool → worker sends a heartbeat and polls for new flow runs every polling_interval seconds (exits on worker shutdown or pool deletion)

Loop: Scheduler deployment polling (type: polling)
Trigger: scheduler service startup → queries deployments for due schedules and creates flow runs every scheduling_interval (exits on service shutdown)

Loop: Automation trigger evaluation (type: auto-scale)
Trigger: event matching an automation trigger condition → creates a new flow run or performs the configured action (exits when the trigger condition is no longer met or the automation is disabled)

Loop: Flow run state propagation (type: self-correction)
Trigger: flow run state change → updates child task run states and parent flow context (exits when all state updates complete)
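The polling loops above all reduce to the same pattern. An illustrative pure-Python sketch (not Prefect's actual worker code; the queue, callback, and bounded loop are stand-ins):

```python
import time
from collections import deque
from typing import Callable

def run_worker(queue: deque, execute: Callable[[str], None],
               interval: float, max_polls: int) -> int:
    """Poll `queue` every `interval` seconds, executing whatever runs appear."""
    executed = 0
    for _ in range(max_polls):      # a real worker loops until shutdown
        while queue:
            execute(queue.popleft())
            executed += 1
        time.sleep(interval)        # PREFECT_WORKER_QUERY_SECONDS plays this role
    return executed
```

Prefect's workers add heartbeats, infrastructure provisioning, and error handling on top, but the drain-then-sleep shape is the same.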

Control Points

Control: PREFECT_API_URL (which server endpoint the client SDK and workers talk to)
Control: PREFECT_WORKER_QUERY_SECONDS (how often workers poll their work queues)
Control: task_runner (which task runner a flow uses for concurrent or distributed task execution)
Control: retries (how many times a failed task or flow is re-executed)
Control: cache_policy (when a task's previous result can be reused instead of re-running it)
Control: infrastructure (where a deployment's flow runs execute: process, Docker, Kubernetes)
Control: log_level (verbosity of client and server logging)
Control: concurrency_limit (cap on simultaneously running tasks or flow runs)

Delays

Delay: Task retry backoff (duration: exponential, retry_delay_seconds * (2 ** attempt))
Delay: Worker polling interval (duration: PREFECT_WORKER_QUERY_SECONDS, default 10s)
Delay: Database connection pool (duration: connection acquisition timeout)
Delay: Event processing delay (duration: from event emission to automation trigger evaluation)
Delay: Flow run scheduling (duration: schedule interval or cron expression)
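The retry backoff above follows the listed formula; a quick check of the delays it produces for a 5-second base:

```python
def retry_backoff(retry_delay_seconds: float, attempt: int) -> float:
    # Delay before retry number `attempt` (0-indexed), per the formula above.
    return retry_delay_seconds * (2 ** attempt)

delays = [retry_backoff(5, a) for a in range(4)]
# first four retries wait 5s, 10s, 20s, 40s
```

In practice a jitter factor is usually added on top of such a schedule so that many failing tasks do not retry in lockstep.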

Technology Choices

prefect is built with 12 key technologies. Each serves a specific role in the system.

FastAPI: REST API server for flow orchestration and web UI serving
SQLAlchemy: ORM for database operations, with async support for flow/task persistence
Alembic: database migration system for evolving the Prefect server schema
Pydantic: data validation and serialization for API schemas and configuration
Vue.js: frontend framework for both the legacy UI and the next-gen UI
Uvicorn: ASGI server running the FastAPI application
Docker: container infrastructure for flow execution environments
Asyncio: async execution framework for concurrent task processing
httpx: HTTP client for server communication, with async support
CloudPickle: serialization of flow/task code for remote execution
Rich: terminal formatting for CLI output and logging
Click: command-line interface framework


Who Should Read This

Data engineers comparing orchestration tools, or Python developers who want workflow automation without the boilerplate.

This analysis was generated by CodeSea from the prefecthq/prefect source code. For the full interactive visualization — including pipeline graph, architecture diagram, and system behavior map — see the complete analysis.

Explore Further

Frequently Asked Questions

What is prefect?

Orchestrates Python data pipeline execution with scheduling, retries, and distributed task management

How does prefect's pipeline work?

prefect processes data through 7 stages: flow definition and deployment, schedule-based flow run creation, worker polling and run acquisition, flow execution and task orchestration, task execution with caching and retries, state management and persistence, and event processing and automation. Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine, which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers.

What tech stack does prefect use?

prefect is built with FastAPI (Provides REST API server for flow orchestration and web UI serving), SQLAlchemy (ORM for database operations with async support for flow/task persistence), Alembic (Database migration system for evolving Prefect server schema), Pydantic (Data validation and serialization for API schemas and configuration), Vue.js (Frontend framework for both legacy UI and next-gen UI applications), and 7 more technologies.

How does prefect handle errors and scaling?

prefect uses 5 feedback loops, 8 control points, and 5 data pools to manage its runtime behavior. These mechanisms handle error recovery, load distribution, and configuration changes.

How does prefect compare to dbt-core?

CodeSea has detailed side-by-side architecture comparisons of prefect with dbt-core, celery, and luigi. These cover tech stack differences, pipeline design, and system behavior.

Visualize prefect yourself

See the interactive pipeline graph, architecture diagram, and system behavior map.

See Full Analysis