pgvector is an open-source PostgreSQL extension that adds the ability to store, index and search vector embeddings (numerical representations of data). It brings vector data and similarity search into the same system that holds application data, making it possible to power semantic search, recommendations and retrieval-augmented generation (RAG) without relying on an external vector database.
Many modern AI applications depend on retrieving semantically similar data, not just exact matches. pgvector allows teams to perform this type of retrieval at runtime within their existing Postgres stack. For example, applications often need to retrieve content that is contextually similar to a query, even if the wording is different. This approach is often referred to as semantic search, nearest neighbor search or embedding-based search; cosine similarity is one of the distance metrics used to implement it.
This article provides a high-level, educational overview of pgvector rather than detailed implementation guidance.
pgvector adds a new data type to Postgres called vector. It allows embeddings (numerical representations of text, images or other content) to be stored alongside relational data without requiring a separate system. These embeddings are typically generated by machine learning models that convert content such as text or images into numerical form.
At a high level, the process is simple. Embeddings are stored in the database. When a query is received, a query embedding is generated from the input, and pgvector returns the records whose stored vectors are closest to the query vector, i.e., most similar in meaning. Instead of matching keywords, results are retrieved based on meaning.
pgvector determines similarity using distance metrics such as Euclidean (L2) distance, inner product, cosine distance and L1 distance, each exposed as a query operator.
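To make the metrics concrete, the math behind the three most common ones can be written out directly. In pgvector these correspond to the `<->` (Euclidean), `<#>` (negative inner product) and `<=>` (cosine distance) operators; the snippet below is a plain-Python sketch of the formulas, not pgvector's implementation:

```python
import math

def l2_distance(a, b):
    # Euclidean distance: pgvector's <-> operator
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # pgvector's <#> operator returns the *negative* inner product,
    # so that smaller values still mean "closer"
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity: pgvector's <=> operator
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - inner_product(a, b) / (norm_a * norm_b)

a, b = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
print(l2_distance(a, b))      # 1.0
print(cosine_distance(a, a))  # ~0.0 for identical vectors
```

Which metric is appropriate depends on the embedding model; many text-embedding models are trained with cosine similarity in mind.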
pgvector includes several features that make vector search practical within Postgres.
halfvec, sparsevec and bit types help reduce memory usage when working with large embedding datasets.

pgvector is widely used to power AI-driven application features:
Applications can retrieve documents or content based on meaning rather than keywords. This is a core component of retrieval-augmented generation (RAG), where large language models use retrieved context to generate accurate, relevant responses. Because pgvector runs similarity search directly within Postgres, this retrieval can happen in real time without requiring a separate system.
Items can be matched to past behavior or preferences to support recommendations. This pattern is commonly used for product recommendations, content discovery and personalization in applications. pgvector makes it efficient to identify related items based on patterns in user behavior or content.
Image embeddings can be stored and compared to quickly find visually similar images. This is widely used in media platforms, e-commerce and creative tools. Storing these embeddings alongside application data makes it easier to run similarity searches without additional infrastructure.
Outliers can be identified by finding data points that are distant from typical patterns in vector space. This is useful for fraud detection, monitoring and quality control. pgvector enables this by making it easy to compare vectors and detect deviations.
Duplicate or near-duplicate content can be identified, even when it is expressed differently or formatted in different ways. This is important for content management, search quality and data hygiene. Similarity-based comparison makes it possible to detect duplicates beyond exact matches.
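The retrieval pattern behind all of these use cases is the same: embed the input, compare it against stored vectors, and return the k closest matches. The sketch below shows that pattern with a brute-force in-memory search over made-up toy embeddings; in practice, pgvector performs this step inside Postgres, typically accelerated by an index:

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)

# Toy corpus of (id, embedding) pairs; real embeddings come from a model
corpus = [
    ("doc-refunds", [0.9, 0.1, 0.0]),
    ("doc-shipping", [0.1, 0.9, 0.1]),
    ("doc-returns", [0.8, 0.2, 0.1]),
]

def top_k(query_vec, k=2):
    # Exact (brute-force) nearest neighbors, smallest distance first
    ranked = sorted(corpus, key=lambda item: cosine_distance(query_vec, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

print(top_k([0.85, 0.15, 0.05]))  # refund/return docs rank above shipping
```

The same loop serves semantic search, recommendations and duplicate detection; only the source of the query vector and the interpretation of "close" change.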
As vector search becomes part of more applications, teams often face a practical decision: should vector search stay within Postgres, or is a dedicated vector database needed? The answer depends on scale, performance requirements and operational complexity.
The differences can be summarized across key dimensions:
| Tool | Operational Complexity | Scalability Ceiling | Hybrid Query Support | Cost | Ecosystem Maturity |
| --- | --- | --- | --- | --- | --- |
| pgvector | Lowest (existing DB) | High (~100M+ vectors) | Best (native SQL joins) | Lowest (included) | High (Postgres ecosystem) |
| Pinecone | Low (serverless/SaaS) | Highest (billions+) | Moderate (metadata only) | High (usage-based) | High (AI-specific) |
| Weaviate | Moderate (multi-modal) | Very high | High (GraphQL/vector) | Moderate | High (open source) |
| Qdrant | Moderate (Rust-based) | Very high | High (filtering-heavy) | Moderate | Growing fast |
pgvector is the natural starting point for teams already using Postgres and operating below the scale ceiling. It works well when vector search is part of an existing application workflow and data volumes or query demands remain manageable. Dedicated vector databases become more relevant when query volume, recall requirements or multi-tenant workloads push beyond what Postgres can efficiently support.
pgvectorscale is designed for teams that want to extend how far they can go with pgvector before adopting a dedicated vector database. It addresses the performance and scalability challenges that arise as data volumes and query demands increase, particularly around indexing speed and query latency. By improving how pgvector performs at larger scales, it allows teams to continue using Postgres for longer without re-architecting their systems. This makes it a practical intermediate step for applications approaching the limits of what pgvector can handle on its own.
pgvector is powerful, but it comes with tradeoffs: approximate indexes trade recall for speed and require tuning, HNSW indexes can consume significant memory, index builds on large datasets take time, and very large or high-throughput workloads may outgrow a single Postgres instance.
Understanding these limitations helps determine when pgvector is sufficient and when additional infrastructure may be needed.
pgvector can be installed on macOS and most Linux distributions using standard package managers such as Homebrew. It is also available on many managed Postgres platforms, including AWS RDS, Supabase, Azure Database for PostgreSQL, Google Cloud SQL and Neon.
Installation and setup instructions are available in the official pgvector GitHub repository, which includes step-by-step guidance maintained by the project’s authors.
Databricks customers using Postgres can also reference the Databricks OLTP extensions docs for platform-specific guidance.
pgvector operates in the operational serving layer of an AI system, where low-latency retrieval is required at application runtime. It is commonly used to support semantic search, recommendations and retrieval-augmented generation (RAG) within applications.
In contrast, Databricks Mosaic AI Vector Search is better suited for large-scale, batch-processed AI workloads, where data pipelines are managed in the lakehouse. These environments support centralized data processing, large datasets and complex workflows.
These approaches are complementary, and teams often use both across different layers of the stack. pgvector supports real-time application queries, while platforms like Databricks handle large-scale data preparation, embedding generation and model-driven workflows.
Is pgvector a full vector database?
pgvector enables Postgres to store embeddings and perform similarity search directly on that data. However, it is not a purpose-built vector database. Dedicated vector databases provide additional scalability and performance optimizations for larger workloads.
What is the difference between HNSW and IVFFlat in pgvector?
HNSW is optimized for fast query performance and uses an in-memory graph structure, which requires more memory. IVFFlat has a lower memory footprint and organizes vectors into clusters through a training step, but performance can vary depending on the dataset and workload. The choice depends on whether speed or memory efficiency is the priority.
How many vectors can pgvector handle?
pgvector can typically handle millions to tens of millions of vectors, depending on hardware, indexing strategy and query patterns. As datasets grow, performance may decline without careful tuning or additional tooling. Factors such as available memory, index type and query frequency all influence scalability.
Does pgvector support cosine similarity?
Yes, pgvector supports cosine similarity as one of its primary distance metrics. It measures how closely two vectors point in the same direction, which often reflects semantic similarity in embedding-based applications. This makes it well suited for semantic search, recommendation systems and natural language processing.
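A property worth noting is that cosine similarity depends only on direction, not magnitude: scaling a vector does not change its similarity to itself. A quick check in plain Python:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

v = [3.0, 4.0]
scaled = [x * 10 for x in v]  # same direction, 10x the magnitude
print(round(cosine_similarity(v, scaled), 6))       # 1.0: identical direction
print(round(cosine_similarity(v, [-4.0, 3.0]), 6))  # 0.0: perpendicular
```

This is why cosine works well for embeddings, where direction encodes meaning and magnitude is often incidental.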
Is pgvector free and open source?
Yes, pgvector is an open-source project released under a permissive license. It can be used with standard Postgres installations as well as many managed Postgres services. This makes it an accessible starting point for adding vector search capabilities.
Can pgvector do hybrid search?
Yes, pgvector can be combined with Postgres full-text search to support hybrid search. This allows results to balance semantic relevance with keyword matching, improving both accuracy and usability. Hybrid search is especially useful in scenarios such as product search and documentation search, where both meaning and exact terms are important.
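One common way to combine the two signals is a weighted blend of a keyword score and a vector-similarity score. The sketch below uses naive token overlap as a stand-in for Postgres full-text ranking (ts_rank); the blending weight and scoring functions are illustrative choices, not pgvector APIs:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def keyword_score(query, text):
    # Naive token overlap, standing in for full-text ranking
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_score(query, text, query_vec, doc_vec, alpha=0.5):
    # alpha blends semantic and keyword relevance (illustrative choice)
    return (alpha * cosine_similarity(query_vec, doc_vec)
            + (1 - alpha) * keyword_score(query, text))

docs = [
    ("return policy for purchases", [0.9, 0.1]),
    ("shipping times and carriers", [0.1, 0.9]),
]
q_text, q_vec = "refund policy", [0.85, 0.2]
best = max(docs, key=lambda d: hybrid_score(q_text, d[0], q_vec, d[1]))
print(best[0])  # the returns document wins on both signals
```

In real deployments, techniques such as reciprocal rank fusion are also used to merge the two result lists without hand-tuning a weight.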
pgvector is a practical starting point for any team that wants to add vector search to an existing Postgres application. By storing embeddings alongside relational data and supporting similarity search natively within the database, it removes the operational overhead of managing a separate vector store. For many workloads — semantic search, RAG pipelines, recommendations and anomaly detection — it delivers what teams need without requiring a new system.
As data volumes grow or query demands increase, pgvectorscale can extend how far teams go before a dedicated vector database becomes necessary. For organizations managing large-scale AI workloads across a unified data platform, Databricks Mosaic AI Vector Search offers a complementary approach designed for the lakehouse layer. Together, these tools give teams the flexibility to match their vector search infrastructure to their actual workload requirements — at any scale.
