milvus-io/milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

43,872 stars Go 8 components

Distributes vector similarity search across multiple nodes with real-time indexing


Under the hood, the system uses 3 feedback loops, 4 data pools, and 3 control points to manage its runtime behavior.

An 8-component repository. 4,309 files analyzed. Data flows through 6 distinct pipeline stages.

How Data Flows Through the System

Vector data flows from the client through proxy validation to the DataNode for ingestion and index building, is stored as segments in object storage, and is then loaded by QueryNodes for ANN search execution. Search requests follow the reverse path: the proxy routes to QueryCoord for node assignment, QueryNodes execute parallel searches on their segments, and results are merged before returning to the client.

  1. Client Request Validation — Proxy receives client requests, validates authentication using Casbin RBAC, checks collection existence against RootCoord metadata, and performs request format validation before routing [SearchRequest → Validated Request]
  2. Vector Data Ingestion — DataNode receives vector batches from streaming layer, validates schema compliance, builds growing segments with columnar storage, and constructs incremental indexes using configured index types [VectorData → Growing Segment]
  3. Segment Sealing and Indexing — When segments reach size threshold, DataCoord triggers sealing operation where DataNode builds final indexes (HNSW, IVF_FLAT, etc.), compresses data, and uploads to object storage [Growing Segment → Sealed Segment]
  4. Query Planning and Distribution — QueryCoordV2 receives search requests, determines which segments contain relevant data based on time ranges and bloom filters, and assigns query tasks to available QueryNodes [SearchRequest → Query Plan]
  5. Parallel ANN Search Execution — QueryNodes load assigned segment indexes into memory, execute approximate nearest neighbor search using FAISS/other engines, and return top-K candidates with distance scores [Query Plan → Search Results]
  6. Result Merging and Response — Proxy collects partial results from all QueryNodes, performs a global top-K merge sorted by distance scores, applies output field projection, and returns final results to the client (sketched below) [Search Results → Final Response]
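
The merge in step 6 is a k-way selection by distance. Below is a minimal sketch of that global top-K merge, assuming a hypothetical hit type and smaller-is-better distances (as with L2); the real merge in the proxy additionally deals with metric direction and the output-field projection mentioned above.

```go
package main

import (
	"container/heap"
	"fmt"
)

// hit is a hypothetical stand-in for one candidate returned by a QueryNode:
// an entity ID plus its distance to the query vector (smaller = closer).
type hit struct {
	ID       int64
	Distance float32
}

// maxHeap keeps the current worst candidate on top so it can be evicted
// when a closer one arrives.
type maxHeap []hit

func (h maxHeap) Len() int           { return len(h) }
func (h maxHeap) Less(i, j int) bool { return h[i].Distance > h[j].Distance }
func (h maxHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *maxHeap) Push(x any)        { *h = append(*h, x.(hit)) }
func (h *maxHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// mergeTopK folds per-node partial results into one global top-K,
// mirroring the proxy's merge in step 6.
func mergeTopK(partials [][]hit, k int) []hit {
	h := &maxHeap{}
	for _, partial := range partials {
		for _, c := range partial {
			switch {
			case h.Len() < k:
				heap.Push(h, c)
			case c.Distance < (*h)[0].Distance:
				heap.Pop(h)
				heap.Push(h, c)
			}
		}
	}
	// Drain worst-first, filling the result back-to-front for ascending order.
	out := make([]hit, h.Len())
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = heap.Pop(h).(hit)
	}
	return out
}

func main() {
	nodeA := []hit{{ID: 1, Distance: 0.12}, {ID: 2, Distance: 0.40}}
	nodeB := []hit{{ID: 3, Distance: 0.05}, {ID: 4, Distance: 0.33}}
	fmt.Println(mergeTopK([][]hit{nodeA, nodeB}, 3)) // [{3 0.05} {1 0.12} {4 0.33}]
}
```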

Data Models

The data structures that flow between stages — the contracts that hold the system together.

Collection internal/core/
entity.Collection with Name: string, Schema: entity.Schema containing FieldSchema array with Name, DataType, Dimension for vectors, PrimaryKey flag, and IndexParams
Created via CreateCollection with schema definition, populated with vectors via Insert, queried via Search, and managed through lifecycle operations
VectorData client/column/
column.Vector interface with FloatVector ([][]float32, each row of length dim), BinaryVector ([][]byte), or SparseFloatVector (non-zero indices plus values), plus metadata fields
Ingested through Insert operations, transformed by DataNode into segments with indexes, and retrieved during Search queries
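
Taken together, the Collection and VectorData contracts above look roughly like the following Go sketch. The shapes mirror the prose, not the exact definitions under internal/core/ and client/column/.

```go
package model

// Illustrative shapes only; the real definitions live in internal/core/
// (schema) and client/column/ (vector columns) and differ in detail.
type FieldSchema struct {
	Name        string
	DataType    string            // e.g. "FloatVector", "Int64", "VarChar"
	Dimension   int               // set only for vector fields
	PrimaryKey  bool
	IndexParams map[string]string // e.g. {"index_type": "HNSW"}
}

type Collection struct {
	Name   string
	Schema []FieldSchema
}

// The three vector payload variants described above.
type FloatVectors [][]float32 // each row has length Dimension
type BinaryVectors [][]byte   // packed bits: Dimension/8 bytes per row

// SparseFloatVector stores only the non-zero entries of each vector.
type SparseFloatVector struct {
	Indices []uint32
	Values  []float32
}
```
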
SearchRequest internal/proxy/
milvuspb.SearchRequest with CollectionName: string, Vectors: [][]float32, TopK: int, SearchParams: key-value pairs for index parameters, OutputFields: []string
Created by client SDK, validated and routed by Proxy, executed across QueryNodes, and results merged before returning to client
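
From the client side, crossing this contract looks roughly like the sketch below. The struct mirrors only the fields listed above; the generated milvuspb.SearchRequest carries additional protocol fields, and the collection name and search parameters here are hypothetical.

```go
package main

// Mirror of the fields listed above; the generated milvuspb.SearchRequest
// carries additional protocol fields beyond these.
type SearchRequest struct {
	CollectionName string
	Vectors        [][]float32
	TopK           int
	SearchParams   map[string]string
	OutputFields   []string
}

func main() {
	queryVec := make([]float32, 128) // one 128-dimensional query vector
	req := SearchRequest{
		CollectionName: "products", // hypothetical collection name
		Vectors:        [][]float32{queryVec},
		TopK:           10,
		SearchParams:   map[string]string{"ef": "64"}, // index-specific knob, illustrative
		OutputFields:   []string{"id", "title"},
	}
	_ = req // handed to the Proxy, validated, then fanned out to QueryNodes
}
```
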
Segment internal/storage/
Segment with ID: int64, vectors stored in columnar format, inverted indexes, bloom filters, and metadata including row count and time range
Created during data ingestion, sealed when size threshold reached, indexed by background compaction, and queried during search operations
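
The time range and bloom filter exist so the planner can skip segments, as in pipeline step 4. A sketch of that pruning under illustrative types (the real metadata lives in internal/storage):

```go
package query

// Illustrative types; the real segment metadata lives in internal/storage.
type pkFilter interface {
	MightContain(pk []byte) bool // bloom filters can only rule keys out
}

type Segment struct {
	ID       int64
	RowCount int64
	MinTs    uint64 // earliest row timestamp in the segment
	MaxTs    uint64 // latest row timestamp in the segment
	Filter   pkFilter
}

// candidateSegments keeps only segments a query could possibly touch:
// segments whose rows are all newer than the query's snapshot timestamp
// are skipped, and for point lookups the bloom filter removes sure misses.
func candidateSegments(segs []Segment, snapshotTs uint64, pk []byte) []Segment {
	var out []Segment
	for _, s := range segs {
		if s.MinTs > snapshotTs {
			continue // every row is invisible at this snapshot
		}
		if pk != nil && !s.Filter.MightContain(pk) {
			continue // primary key definitely not in this segment
		}
		out = append(out, s)
	}
	return out
}
```
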
TelemetryMetrics examples/telemetry_demo/main.go
telemetryOperation with Operation: string, Count: int64, LatencyP99: float64, Database: string, Collection: string, plus client info and status
Collected during client operations, aggregated in heartbeat intervals, pushed to server via telemetry protocol for monitoring and analysis

Hidden Assumptions

Things this code relies on but never validates. These cause silent failures when the system changes.

critical Scale unguarded

Vector dimension is fixed at 128 for all collections and operations, with numEntities hardcoded to 500

If this fails: Demo breaks silently if real data has different dimensions - vector operations will fail with cryptic FAISS errors or dimension mismatch panics

examples/telemetry_demo/main.go:dim
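
A cheap guard, sketched with illustrative names: validate every row against the schema's declared dimension before insert instead of trusting a hardcoded dim = 128.

```go
package main

import "fmt"

// checkDims rejects any row whose length disagrees with the schema's
// declared dimension, turning a cryptic downstream FAISS error into a
// clear client-side one. (Sketch; the demo hardcodes dim = 128.)
func checkDims(vectors [][]float32, want int) error {
	for i, v := range vectors {
		if len(v) != want {
			return fmt.Errorf("vector %d has dimension %d, schema expects %d", i, len(v), want)
		}
	}
	return nil
}
```
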
critical Environment unguarded

Milvus server is running on localhost:19530 with proxy HTTP server enabled on localhost:9091, and both are accepting connections

If this fails: Test hangs indefinitely on connection attempts if server is down, on different host, or behind firewall - no timeout or health check implemented

examples/telemetry_e2e_test/main.go:address
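
One way to fail fast instead of hanging, using only the standard library; the address matches the assumption above and the 3-second timeout is an illustrative choice.

```go
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	// Pre-flight reachability check with a hard deadline, so the test
	// aborts quickly when the server is down or unreachable.
	conn, err := net.DialTimeout("tcp", "localhost:19530", 3*time.Second)
	if err != nil {
		log.Fatalf("milvus not reachable at localhost:19530: %v", err)
	}
	conn.Close()
}
```
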
critical Temporal unguarded

A 10-second heartbeat interval is sufficient for telemetry data collection and command propagation during test execution

If this fails: Test may pass/fail randomly if network latency exceeds heartbeat window - metrics could be lost or commands delayed beyond test timeout

examples/telemetry_e2e_test/main.go:HeartbeatInterval
critical Contract unguarded

HTTP telemetry API returns JSON with exact field names (client_id, client_info, metrics) and nested structure matching struct tags

If this fails: JSON unmarshaling fails silently with zero values if server changes field names, adds required fields, or changes nesting - leads to empty metrics display

examples/telemetry_demo/main.go:telemetryClientResponse
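
A way to surface contract drift instead of silently zero-filling, sketched around the demo's telemetryClientResponse (field shapes here are illustrative): DisallowUnknownFields turns renamed or extra server fields into errors, and an explicit post-decode check covers required fields.

```go
package telemetry

import (
	"encoding/json"
	"fmt"
	"io"
)

// Field shapes are illustrative; the demo's actual struct tags are the
// contract the assumption above depends on.
type telemetryClientResponse struct {
	ClientID   string            `json:"client_id"`
	ClientInfo map[string]string `json:"client_info"`
	Metrics    json.RawMessage   `json:"metrics"`
}

func decodeTelemetry(r io.Reader) (*telemetryClientResponse, error) {
	dec := json.NewDecoder(r)
	dec.DisallowUnknownFields() // renamed or extra server fields become errors
	var tr telemetryClientResponse
	if err := dec.Decode(&tr); err != nil {
		return nil, fmt.Errorf("telemetry response schema drift: %w", err)
	}
	if tr.ClientID == "" { // a zero value here means the field was absent
		return nil, fmt.Errorf("telemetry response missing client_id")
	}
	return &tr, nil
}
```
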
warning Resource weakly guarded

Atomic operations on global state (receivedPushConfig, lastPushConfigPayload) are sufficient for thread safety without additional synchronization

If this fails: Race conditions during concurrent command processing could corrupt telemetry state or lose commands - atomic.Value doesn't guarantee consistency between reads and writes

examples/telemetry_demo/main.go:receivedPushConfig
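
One way to close that gap, as a sketch: bundle the flag and its payload into a single value behind one atomic.Value, so every reader observes a matched pair. The names mirror the demo's globals, but the code is illustrative.

```go
package main

import "sync/atomic"

// pushState bundles what the demo keeps in two separate atomics, so a
// reader can never observe the "received" flag without its payload.
type pushState struct {
	Received bool
	Payload  string
}

var state atomic.Value // always holds a pushState

func onPushConfig(payload string) {
	state.Store(pushState{Received: true, Payload: payload})
}

func snapshot() pushState {
	s, _ := state.Load().(pushState)
	return s // zero value until the first push arrives
}
```
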
warning Domain unguarded

Collection categories array contains valid string values that match expected telemetry filtering dimensions

If this fails: Telemetry filtering breaks if categories contain special characters, are empty, or don't match server-side enum values - results in missing or incorrectly categorized metrics

examples/telemetry_demo/main.go:collections
warning Ordering unguarded

Command handlers are registered before any telemetry operations occur, and handler registration completes synchronously

If this fails: Commands received during startup window are lost if handlers aren't ready - no queuing or replay mechanism for early commands

examples/telemetry_e2e_test/main.go:registerCommandHandlers
warning Scale unguarded

JSON config file fits in memory and contains reasonable number of message types (likely < 1000 entries)

If this fails: Code generator crashes with OOM if config file is huge - no streaming parser or size limits implemented

pkg/streaming/util/message/codegen/main.go:MessageReflectInfoTable
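
A simple guard, with an illustrative 16 MiB limit: stat the file before loading so a runaway config fails with a clear message rather than an OOM.

```go
package main

import (
	"log"
	"os"
)

const maxConfigBytes = 16 << 20 // 16 MiB, illustrative limit

// loadConfig refuses oversized inputs before reading them into memory.
func loadConfig(path string) []byte {
	info, err := os.Stat(path)
	if err != nil {
		log.Fatal(err)
	}
	if info.Size() > maxConfigBytes {
		log.Fatalf("%s is %d bytes, refusing to load more than %d", path, info.Size(), maxConfigBytes)
	}
	data, err := os.ReadFile(path)
	if err != nil {
		log.Fatal(err)
	}
	return data
}
```
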
warning Environment weakly guarded

Command line arguments are well-formed strings without special characters, and file system supports creating files in current directory

If this fails: Tool crashes if args contain null bytes or current directory is read-only - no validation of arguments or output path permissions

cmd/tools/config/main.go:os.Args
info Contract unguarded

Input JSON config exactly matches JSONConfig struct fields with correct types - no extra fields or missing required fields

If this fails: Code generation produces invalid Go code if JSON contains unexpected fields or wrong types - compilation fails downstream with unclear errors

pkg/streaming/util/message/codegen/main.go:JSONConfig

System Behavior

How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.

Data Pools

etcd (database)
Distributed metadata store containing collection schemas, segment metadata, node assignments, and cluster configuration state
Object Storage (file-store)
Persistent storage for sealed vector segments, index files, and logs using S3-compatible backends like MinIO
Message Queue (queue)
Event streaming system (Pulsar/Kafka) for coordinating data flow between components and ensuring ordered message delivery
QueryNode Memory (cache)
In-memory segment and index cache for fast ANN search, with LRU eviction when memory pressure occurs
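
A minimal sketch of that eviction idea, keyed by segment ID and capped by entry count; the real QueryNode cache evicts under actual memory pressure rather than at a fixed capacity.

```go
package cache

import "container/list"

// entry pairs a segment ID with a stand-in for its loaded data.
type entry struct {
	id  int64
	seg []byte
}

type segmentLRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[int64]*list.Element  // segment ID -> element in order
}

func newSegmentLRU(capacity int) *segmentLRU {
	return &segmentLRU{capacity: capacity, order: list.New(), items: map[int64]*list.Element{}}
}

func (c *segmentLRU) Get(id int64) ([]byte, bool) {
	el, ok := c.items[id]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).seg, true
}

func (c *segmentLRU) Put(id int64, seg []byte) {
	if el, ok := c.items[id]; ok {
		el.Value = entry{id: id, seg: seg}
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity { // evict the least recently used segment
		last := c.order.Back()
		delete(c.items, last.Value.(entry).id)
		c.order.Remove(last)
	}
	c.items[id] = c.order.PushFront(entry{id: id, seg: seg})
}
```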


Technology Stack

Go (runtime)
Primary language for distributed system logic, gRPC services, and client SDK implementation
gRPC/Protocol Buffers (framework)
Inter-service communication protocol providing type-safe, high-performance RPC between Milvus components
etcd (database)
Distributed key-value store for cluster metadata, service discovery, and configuration management with strong consistency
Apache Pulsar/Kafka (infra)
Message queue for coordinating data flow between components with guaranteed message ordering and durability
MinIO/S3 (file-store)
Object storage backend for persisting vector segments, index files, and write-ahead logs with high availability
FAISS (compute)
Vector similarity search library providing optimized ANN algorithms (HNSW, IVF) with GPU acceleration support

Key Components

Package Structure

client (library)
Go client SDK for Milvus, providing high-level APIs for vector database operations including collections, indexes, search, and bulk operations.
examples-telemetry_demo (app)
Demonstration programs showing Milvus telemetry features including client metrics collection, command push, and multi-database scenarios.
examples-telemetry_e2e_test (app)
End-to-end test for Milvus telemetry system, verifying command push and client metrics flow.
pkg (shared)
Shared package containing common utilities, configuration, streaming, message queues, storage adapters, and telemetry infrastructure used across Milvus components.
tests-go_client (tooling)
Test suite for the Go client SDK, providing test utilities and comprehensive validation of client operations.

Explore the interactive analysis

See the full architecture map, data flow, and code patterns visualization.

Analyze on CodeSea


Frequently Asked Questions

What is milvus used for?

Milvus distributes vector similarity search across multiple nodes with real-time indexing. milvus-io/milvus is an 8-component repository written in Go. Data flows through 6 distinct pipeline stages. The codebase contains 4,309 files.

How is milvus architected?

milvus is organized into 5 architecture layers: Client Layer, Proxy Layer, Coordinator Layer, Worker Node Layer, and 1 more. Data flows through 6 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.

How does data flow through milvus?

Data moves through 6 stages: Client Request Validation → Vector Data Ingestion → Segment Sealing and Indexing → Query Planning and Distribution → Parallel ANN Search Execution → Result Merging and Response. Vector data flows from the client through proxy validation to the DataNode for ingestion and index building, is stored as segments in object storage, and is then loaded by QueryNodes for ANN search execution. Search requests follow the reverse path: the proxy routes to QueryCoord for node assignment, QueryNodes execute parallel searches on their segments, and results are merged before returning to the client. This pipeline design reflects a complex multi-stage processing system.

What technologies does milvus use?

The core stack includes Go (Primary language for distributed system logic, gRPC services, and client SDK implementation), gRPC/Protocol Buffers (Inter-service communication protocol providing type-safe, high-performance RPC between Milvus components), etcd (Distributed key-value store for cluster metadata, service discovery, and configuration management with strong consistency), Apache Pulsar/Kafka (Message queue for coordinating data flow between components with guaranteed message ordering and durability), MinIO/S3 (Object storage backend for persisting vector segments, index files, and write-ahead logs with high availability), FAISS (Vector similarity search library providing optimized ANN algorithms (HNSW, IVF) with GPU acceleration support). A focused set of dependencies that keeps the build manageable.

What system dynamics does milvus have?

milvus exhibits 4 data pools (etcd, Object Storage, Message Queue, QueryNode Memory), 3 feedback loops, 3 control points, and 3 delays. The feedback loops handle auto-scaling. These runtime behaviors shape how the system responds to load, failures, and configuration changes.

What design patterns does milvus use?

4 design patterns detected: Coordinator-Worker Pattern, Segment-Based Storage, Pluggable Backend Architecture, Streaming Data Pipeline.

Analyzed on April 20, 2026 by CodeSea.