When More Cores Hurts: The Vector Database Scaling Paradox in HPC

2026-06-08 · Distributed, Parallel, and Cluster Computing

Distributed, Parallel, and Cluster Computing; Databases
AI summary

The authors studied how three popular vector databases (Qdrant, Milvus, and Weaviate), which are designed and optimized for cloud environments, perform on production supercomputers used for scientific workloads. They tested these databases on popular benchmarks, multimodal embeddings, and a real-world scientific dataset, and found that adding more cores does not always improve performance and can even lower query throughput. Their work suggests that vector databases need to be redesigned specifically for high-performance computing (HPC) systems to handle large scientific workloads efficiently.

vector databases, high-performance computing (HPC), Qdrant, Milvus, Weaviate, scalability, latency, distributed computing, multimodal embeddings, cloud computing
Authors
Seth Ockerman, Song Young Oh, Amal Gueroudji, Rochana Chaturvedi, Philip Carns, Nicholas Chia, Matthieu Dorier, Robert Latham, Tanwi Mallick, Swan Perarnau, Robert Underwood, Kyle Chard, Ian Foster, Robert Ross, Shivaram Venkataraman
Abstract
Vector databases have been designed and optimized for cloud environments; however, emerging scientific AI workloads (e.g., molecular search, meteorological trajectory detection, and literature-driven hypothesis generation) demand efficient, scalable execution on HPC systems. We present a large-scale evaluation of three state-of-the-art vector databases -- Qdrant, Milvus, and Weaviate -- on two production supercomputers, scaling to 256 distributed workers across 64 compute nodes. We evaluate representative workload patterns -- mixed read/write and write-then-read -- using popular benchmarks, multimodal embeddings, and a novel real-world scientific dataset. Our results reveal that workload characteristics can limit latency reduction, additional cores can reduce query throughput by up to 30.67%, and scaling from 16 to 256 workers (16x) only yields a 5.46x improvement. This scaling paradox exposes the fundamental mismatch between cloud-oriented designs and HPC systems, highlighting the need for new, HPC-aware vector database designs.
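To put the reported scaling figure in context, the short sketch below computes parallel efficiency, meaning the fraction of ideal linear speedup actually achieved, from the numbers quoted in the abstract (16 to 256 workers, 5.46x improvement). The function name `parallel_efficiency` is illustrative and not taken from the paper, and the sketch assumes the ideal case is linear scaling with worker count.

```python
def parallel_efficiency(speedup: float, base_workers: int, scaled_workers: int) -> float:
    """Return observed speedup as a fraction of ideal linear speedup.

    Assumes the ideal speedup equals the ratio of worker counts.
    """
    ideal_speedup = scaled_workers / base_workers
    return speedup / ideal_speedup


# Figures quoted in the abstract: 16 -> 256 workers yields a 5.46x improvement.
efficiency = parallel_efficiency(speedup=5.46, base_workers=16, scaled_workers=256)
print(f"Parallel efficiency: {efficiency:.1%}")  # prints "Parallel efficiency: 34.1%"
```

Under that assumption, roughly two thirds of the added worker capacity does not translate into additional performance.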