How Prefect Works

Prefect positions itself as the anti-Airflow: no DAGs, no operators, just Python. Under the hood, though, it solves the same fundamental problem — making dependencies between tasks visible and their execution reliable.

22,206 GitHub stars · Python · 10 components · 7-stage pipeline

What prefect Does

Orchestrates Python data pipeline execution with scheduling, retries, and distributed task management

Prefect is a workflow orchestration framework that transforms Python functions into resilient, observable data pipelines. It provides a client-server architecture where flow runs are scheduled and monitored by a FastAPI server, executed by workers, and tracked through a comprehensive state management system with automatic retries, caching, and event-driven automations.

Architecture Overview

prefect is organized into 6 layers comprising 10 components.

Client SDK: Python decorators (@flow, @task) that instrument user code and communicate with the server via a REST API
Server API: FastAPI application handling flow runs, deployments, and work queues, and serving the web UI
Database Layer: SQLAlchemy models with Alembic migrations storing flow runs, task runs, deployments, and events
Engine Layer: flow and task execution engines that handle state transitions, retries, and result persistence
Worker Infrastructure: pluggable workers that poll work queues and execute flows in various environments (processes, Docker, Kubernetes)
Web UI: Vue.js applications for monitoring flows, managing deployments, and configuring automations

How Data Flows Through prefect

Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers.

1. Flow definition and deployment

User defines flows with @flow decorator and creates deployments via CLI using Flow.deploy() method, storing flow code references and configuration in server database

2. Schedule-based flow run creation

Scheduler service queries deployments with schedules, creates FlowRun instances using deployment parameters, and queues them in specified work queues

3. Worker polling and run acquisition

Workers poll work queues via PrefectClient.get_runs_in_work_queue(), acquire FlowRun assignments, and prepare execution environment based on infrastructure configuration

4. Flow execution and task orchestration

FlowRunner loads flow code, resolves parameters from blocks, executes @flow function which creates TaskRun instances for each @task call, managing parallel execution and dependencies

5. Task execution with caching and retries

TaskRunner executes individual tasks, checks cache via cache_key lookup, handles retries on failure, and persists results using ResultStore

6. State management and persistence

StateManager validates state transitions, updates FlowRun/TaskRun states in database via PrefectClient, and triggers event emission for state changes

7. Event processing and automation

EventEmitter publishes events to automation engine, which evaluates trigger conditions and executes actions like creating new flow runs or sending notifications

System Dynamics

Beyond the pipeline, prefect has runtime behaviors that shape how it responds to load, failures, and configuration changes.

Data Pools

Pool: SQLite/PostgreSQL Database (type: database)
Stores all persistent state including flow runs, task runs, deployments, work queues, and events

Pool: Work Queues (type: queue)
Queues containing flow runs ready for execution, organized by priority and filtering rules

Pool: Result Storage (type: file-store)
Pluggable storage backends (filesystem, S3, GCS) for persisting flow and task results

Pool: Block Registry (type: registry)
In-memory and database storage of configuration blocks for reusable infrastructure and credentials

Pool: Event Stream (type: queue)
Async message queue for events that trigger automations and provide observability

Feedback Loops

Loop: Task retry loop (type: retry)
Trigger: task failure with retries configured → TaskRunner waits an exponential backoff delay, then re-executes the task with the same parameters (exits when the task succeeds or the retry limit is reached)

Loop: Worker heartbeat polling (type: polling)
Trigger: worker starts and joins a work pool → worker sends a heartbeat and polls for new flow runs every polling_interval seconds (exits on worker shutdown or pool deletion)

Loop: Scheduler deployment polling (type: polling)
Trigger: scheduler service startup → queries deployments for due schedules and creates flow runs every scheduling_interval (exits on service shutdown)

Loop: Automation trigger evaluation (type: auto-scale)
Trigger: event matching an automation trigger condition → creates a new flow run or performs the configured action (exits when the trigger condition is no longer met or the automation is disabled)

Loop: Flow run state propagation (type: self-correction)
Trigger: flow run state change → updates child task run states and parent flow context (exits when all state updates complete)
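The polling loops above all reduce to the same pattern. An illustrative pure-Python sketch (not Prefect's actual worker code; the queue, callback, and bounded loop are stand-ins):

```python
import time
from collections import deque
from typing import Callable

def run_worker(queue: deque, execute: Callable[[str], None],
               interval: float, max_polls: int) -> int:
    """Poll `queue` every `interval` seconds, executing whatever runs appear."""
    executed = 0
    for _ in range(max_polls):      # a real worker loops until shutdown
        while queue:
            execute(queue.popleft())
            executed += 1
        time.sleep(interval)        # PREFECT_WORKER_QUERY_SECONDS plays this role
    return executed
```

Prefect's workers add heartbeats, infrastructure provisioning, and error handling on top, but the drain-then-sleep shape is the same.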

Control Points

Control: PREFECT_API_URL (which server endpoint the client SDK and workers talk to)
Control: PREFECT_WORKER_QUERY_SECONDS (how often workers poll their work queues)
Control: task_runner (which task runner a flow uses for concurrent or distributed task execution)
Control: retries (how many times a failed task or flow is re-executed)
Control: cache_policy (when a task's previous result can be reused instead of re-running it)
Control: infrastructure (where a deployment's flow runs execute: process, Docker, Kubernetes)
Control: log_level (verbosity of client and server logging)
Control: concurrency_limit (cap on simultaneously running tasks or flow runs)

Delays

Delay: Task retry backoff (duration: exponential, retry_delay_seconds * (2 ** attempt))
Delay: Worker polling interval (duration: PREFECT_WORKER_QUERY_SECONDS, default 10s)
Delay: Database connection pool (duration: connection acquisition timeout)
Delay: Event processing delay (duration: from event emission to automation trigger evaluation)
Delay: Flow run scheduling (duration: schedule interval or cron expression)
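The retry backoff above follows the listed formula; a quick check of the delays it produces for a 5-second base:

```python
def retry_backoff(retry_delay_seconds: float, attempt: int) -> float:
    # Delay before retry number `attempt` (0-indexed), per the formula above.
    return retry_delay_seconds * (2 ** attempt)

delays = [retry_backoff(5, a) for a in range(4)]
# first four retries wait 5s, 10s, 20s, 40s
```

In practice a jitter factor is usually added on top of such a schedule so that many failing tasks do not retry in lockstep.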

Technology Choices

prefect is built with 12 key technologies. Each serves a specific role in the system.

FastAPI: REST API server for flow orchestration and web UI serving
SQLAlchemy: ORM for database operations, with async support for flow/task persistence
Alembic: database migration system for evolving the Prefect server schema
Pydantic: data validation and serialization for API schemas and configuration
Vue.js: frontend framework for both the legacy UI and the next-gen UI
Uvicorn: ASGI server running the FastAPI application
Docker: container infrastructure for flow execution environments
Asyncio: async execution framework for concurrent task processing
httpx: HTTP client for server communication, with async support
CloudPickle: serialization of flow/task code for remote execution
Rich: terminal formatting for CLI output and logging
Click: command-line interface framework


Who Should Read This

Data engineers comparing orchestration tools, or Python developers who want workflow automation without the boilerplate.

This analysis was generated by CodeSea from the prefecthq/prefect source code. For the full interactive visualization — including pipeline graph, architecture diagram, and system behavior map — see the complete analysis.

Explore Further

Frequently Asked Questions

What is prefect?

Orchestrates Python data pipeline execution with scheduling, retries, and distributed task management

How does prefect's pipeline work?

prefect processes data through 7 stages: flow definition and deployment, schedule-based flow run creation, worker polling and run acquisition, flow execution and task orchestration, task execution with caching and retries, state management and persistence, and event processing and automation. Flow definitions with @flow/@task decorators are submitted as deployments to the server database. The scheduler creates flow runs and places them in work queues. Workers poll queues, retrieve flow runs, and execute them using the flow engine, which manages task execution, state transitions, and result persistence. Events are emitted throughout for observability and automation triggers.

What tech stack does prefect use?

prefect is built with FastAPI (Provides REST API server for flow orchestration and web UI serving), SQLAlchemy (ORM for database operations with async support for flow/task persistence), Alembic (Database migration system for evolving Prefect server schema), Pydantic (Data validation and serialization for API schemas and configuration), Vue.js (Frontend framework for both legacy UI and next-gen UI applications), and 7 more technologies.

How does prefect handle errors and scaling?

prefect uses 5 feedback loops, 8 control points, and 5 data pools to manage its runtime behavior. These mechanisms handle error recovery, load distribution, and configuration changes.

How does prefect compare to dbt-core?

CodeSea has detailed side-by-side architecture comparisons of prefect with dbt-core, celery, and luigi. These cover tech stack differences, pipeline design, and system behavior.

Visualize prefect yourself

See the interactive pipeline graph, architecture diagram, and system behavior map.

See Full Analysis