What is pgvector?

A PostgreSQL extension for storing and searching vector embeddings natively within your existing database.

Summary

  • pgvector is a PostgreSQL extension that enables vector storage and similarity search directly within Postgres, eliminating the need for a separate vector database for most AI use cases like semantic search, RAG, and recommendations.
  • It supports multiple distance metrics, two index types (HNSW and IVFFlat), filtered search, and hybrid search, all natively within SQL, making it operationally simple for teams already on Postgres.
  • pgvector handles millions of vectors well, but performance degrades at very high scale; pgvectorscale and dedicated vector databases like Pinecone or Weaviate are the natural next steps as workloads grow.

pgvector is an open-source PostgreSQL extension that adds the ability to store, index and search vector embeddings (numerical representations of data). It brings vector data and similarity search into the same system that holds application data, making it possible to power semantic search, recommendations and retrieval-augmented generation (RAG) without relying on an external vector database. pgvector extends Postgres to support these AI-driven use cases.

Many modern AI applications depend on retrieving semantically similar data, not just exact matches. pgvector allows teams to perform this type of retrieval at runtime within their existing Postgres stack. For example, applications often need to retrieve content that is contextually similar to a query, even if the wording is different. This approach is often referred to as semantic search, nearest neighbor search or embedding-based search; metrics such as cosine similarity are used to measure how close two embeddings are.

This article provides a high-level, educational overview of pgvector rather than detailed implementation guidance.
 

How pgvector works

pgvector adds a new data type to Postgres called vector. It allows embeddings (numerical representations of text, images or other content) to be stored alongside relational data without requiring a separate system. These embeddings are typically generated by machine learning models that convert content such as text or images into numerical form.

At a high level, the process is simple. Embeddings are stored in the database. When a query is received, a query embedding is generated from the input, and pgvector returns the records whose vectors are most similar, or closest in meaning, to that query. Instead of matching keywords, results are retrieved based on meaning.
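
This flow can be sketched in SQL. The table, contents and 3-dimensional vectors below are purely illustrative (real embedding models produce hundreds or thousands of dimensions):

```sql
-- Enable the extension (once per database)
CREATE EXTENSION IF NOT EXISTS vector;

-- Store embeddings alongside relational data
CREATE TABLE documents (
  id        bigserial PRIMARY KEY,
  content   text,
  embedding vector(3)   -- dimension must match the embedding model
);

INSERT INTO documents (content, embedding) VALUES
  ('about cats',    '[0.9, 0.1, 0.0]'),
  ('about dogs',    '[0.8, 0.2, 0.1]'),
  ('about tax law', '[0.0, 0.1, 0.9]');

-- At query time, embed the user's input with the same model,
-- then return the closest rows (<-> is L2 distance)
SELECT content
FROM documents
ORDER BY embedding <-> '[0.85, 0.15, 0.05]'
LIMIT 2;
```

In practice the application generates the query vector at runtime and interpolates it into the query as a parameter.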

pgvector determines similarity using distance metrics:

  • L2 (Euclidean distance): measures the distance between vectors, where smaller values indicate greater similarity
  • Cosine similarity: measures how closely vectors point in the same direction, which often reflects similarity in meaning
  • Inner product: measures alignment between vectors and is often used with normalized embeddings
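
Each metric has a corresponding SQL operator. A small sketch with illustrative vectors:

```sql
-- L2 (Euclidean) distance: smaller is more similar
SELECT '[1, 2, 3]'::vector <-> '[1, 2, 4]'::vector;  -- 1

-- Cosine distance (1 - cosine similarity): smaller is more similar
SELECT '[1, 0, 0]'::vector <=> '[0, 1, 0]'::vector;  -- 1 (orthogonal)

-- Negative inner product (negated so ORDER BY ... ASC still
-- puts the most similar rows first)
SELECT '[1, 2, 3]'::vector <#> '[1, 2, 3]'::vector;  -- -14
```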

Key features of pgvector

pgvector includes several features that make vector search practical within Postgres.

  • Indexing: Two index types are supported: HNSW and IVFFlat. HNSW prioritizes query speed and builds a graph structure in memory, but requires more memory. IVFFlat is more memory-efficient and partitions vectors into clusters using a training step, but queries may be slower.
  • Distance metrics: L2, cosine similarity and inner product cover most embedding-based use cases. Hamming and Jaccard distances support binary vectors in more specialized scenarios.
  • Filtered search: Vector similarity can be combined with standard relational filters. For example, results can include the most semantically similar products that are also in stock, within a price range or in a specific category.
  • Hybrid search: pgvector can be paired with Postgres full-text search to blend keyword and semantic search. This allows results to be both contextually relevant and textually precise in a single query.
  • Additional data types: Options such as halfvec, sparsevec and bit types help reduce memory usage when working with large embedding datasets.
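
A sketch of the indexing and filtered-search features above (the products table and its columns are illustrative):

```sql
-- HNSW index for cosine distance: fast queries, higher memory use
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops);

-- IVFFlat alternative: lower memory footprint, trains clusters from
-- existing rows; 'lists' controls the number of clusters
-- CREATE INDEX ON products USING ivfflat (embedding vector_cosine_ops)
--   WITH (lists = 100);

-- Filtered search: combine similarity with ordinary SQL predicates
SELECT name, price
FROM products
WHERE in_stock AND price < 50
ORDER BY embedding <=> '[0.2, 0.7, 0.1]'
LIMIT 10;
```

The opclass (vector_cosine_ops here; vector_l2_ops and vector_ip_ops also exist) must match the operator used in queries for the index to be used.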

Common use cases for pgvector

pgvector is widely used to power AI-driven application features:

Semantic search and RAG

Applications can retrieve documents or content based on meaning rather than keywords. This is a core component of retrieval-augmented generation (RAG), where large language models use retrieved context to generate accurate, relevant responses. Because pgvector runs similarity search directly within Postgres, this retrieval can happen in real time without requiring a separate system.
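
A common RAG retrieval step is to fetch the top-k chunks for a query embedding and pass them to the model as context. A sketch (the chunks table and 3-dimensional vectors are illustrative):

```sql
-- Retrieve the most relevant chunks for a query embedding, returning
-- the distance so low-relevance results can be dropped downstream
SELECT content,
       embedding <=> '[0.1, 0.8, 0.1]' AS cosine_distance
FROM chunks
ORDER BY embedding <=> '[0.1, 0.8, 0.1]'
LIMIT 5;
```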

Recommendation systems

Items can be matched to past behavior or preferences to support recommendations. This pattern is commonly used for product recommendations, content discovery and personalization in applications. pgvector makes it efficient to identify related items based on patterns in user behavior or content.

Image similarity

Image embeddings can be stored and compared to quickly find visually similar images. This is widely used in media platforms, e-commerce and creative tools. Storing these embeddings alongside application data makes it easier to run similarity searches without additional infrastructure.

Anomaly detection

Outliers can be identified by finding data points that are distant from typical patterns in vector space. This is useful for fraud detection, monitoring and quality control. pgvector enables this by making it easy to compare vectors and detect deviations.

Deduplication

Duplicate or near-duplicate content can be identified, even when it is expressed differently or formatted in different ways. This is important for content management, search quality and data hygiene. Similarity-based comparison makes it possible to detect duplicates beyond exact matches.
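
Near-duplicate detection can be sketched as a self-join with a distance threshold (the table and the 0.05 cutoff are illustrative; thresholds must be tuned per embedding model):

```sql
-- Pairs of documents whose embeddings are nearly identical.
-- Note: a full self-join is O(n^2); for large tables, compare each
-- row only against its nearest neighbors instead.
SELECT a.id, b.id,
       a.embedding <=> b.embedding AS cosine_distance
FROM documents a
JOIN documents b ON a.id < b.id
WHERE a.embedding <=> b.embedding < 0.05;
```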
 

pgvector vs. dedicated vector databases: when to use each

As vector search becomes part of more applications, teams often face a practical decision: should vector search stay within Postgres, or is a dedicated vector database needed? The answer depends on scale, performance requirements and operational complexity.

The differences can be summarized across key dimensions:

| Tool | Operational Complexity | Scalability Ceiling | Hybrid Query Support | Cost | Ecosystem Maturity |
| --- | --- | --- | --- | --- | --- |
| pgvector | Lowest (existing DB) | High (~100M+ vectors) | Best (native SQL joins) | Lowest (included) | High (Postgres ecosystem) |
| Pinecone | Low (serverless/SaaS) | Highest (billions+) | Moderate (metadata only) | High (usage-based) | High (AI-specific) |
| Weaviate | Moderate (multi-modal) | Very high | High (GraphQL/vector) | Moderate | High (open source) |
| Qdrant | Moderate (Rust-based) | Very high | High (filtering-heavy) | Moderate | Growing fast |

pgvector is the natural starting point for teams already using Postgres and operating below the scale ceiling. It works well when vector search is part of an existing application workflow and data volumes or query demands remain manageable. Dedicated vector databases become more relevant when query volume, recall requirements or multi-tenant workloads push beyond what Postgres can efficiently support.

pgvectorscale

pgvectorscale is designed for teams that want to extend how far they can go with pgvector before adopting a dedicated vector database. It addresses the performance and scalability challenges that arise as data volumes and query demands increase, particularly around indexing speed and query latency. By improving how pgvector performs at larger scales, it allows teams to continue using Postgres for longer without re-architecting their systems. This makes it a practical intermediate step for applications approaching the limits of what pgvector can handle on its own.
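
If pgvectorscale is installed, enabling it and creating its StreamingDiskANN index looks roughly like the following. This is a sketch of the project's documented usage; verify the exact syntax against the pgvectorscale docs for your version:

```sql
-- Installs pgvectorscale (and pgvector, via CASCADE)
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

-- StreamingDiskANN index provided by pgvectorscale
CREATE INDEX ON documents USING diskann (embedding vector_cosine_ops);
```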

Limitations and scaling considerations

pgvector is powerful, but it comes with tradeoffs:

  • Performance can degrade at very high vector counts (10M+) without additional optimization or tooling
  • HNSW indexes are memory-intensive, and large deployments may require significant RAM
  • Postgres does not provide built-in sharding for vector workloads, so horizontal scaling requires external tooling or a managed provider
  • Search speed and recall involve a real tradeoff: recall (the percentage of truly relevant results that are returned) requires deliberate configuration to optimize

Understanding these limitations helps determine when pgvector is sufficient and when additional infrastructure may be needed.
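
The speed/recall tradeoff above is tuned with session-level settings, which can be sketched as:

```sql
-- HNSW: raise ef_search (default 40) for better recall at the cost
-- of query speed
SET hnsw.ef_search = 100;

-- IVFFlat: raise probes (default 1) to search more clusters per
-- query, trading speed for recall
SET ivfflat.probes = 10;
```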

Getting started with pgvector

pgvector can be installed on macOS and most Linux distributions using standard package managers such as Homebrew. It is also available on many managed Postgres platforms, including AWS RDS, Supabase, Azure Database for PostgreSQL, Google Cloud SQL and Neon.
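
Whatever the installation route, the extension is enabled per database from SQL. A minimal check:

```sql
-- Enable the extension after the package is installed
CREATE EXTENSION IF NOT EXISTS vector;

-- Confirm the installed version
SELECT extversion FROM pg_extension WHERE extname = 'vector';
```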

Installation and setup instructions are available in the official pgvector GitHub repository, which includes step-by-step guidance maintained by the project’s authors.

Databricks customers using Postgres can also reference the Databricks OLTP extensions docs for platform-specific guidance.

pgvector and the modern AI data stack

pgvector operates in the operational serving layer of an AI system, where low-latency retrieval is required at application runtime. It is commonly used to support semantic search, recommendations and retrieval-augmented generation (RAG) within applications.

In contrast, Databricks Mosaic AI Vector Search is better suited for large-scale, batch-processed AI workloads, where data pipelines are managed in the lakehouse. These environments support centralized data processing, large datasets and complex workflows.

These approaches are complementary, and teams often use both across different layers of the stack. pgvector supports real-time application queries, while platforms like Databricks handle large-scale data preparation, embedding generation and model-driven workflows.

Frequently asked questions

Is pgvector a full vector database?
pgvector enables Postgres to store embeddings and perform similarity search directly on that data. However, it is not a purpose-built vector database. Dedicated vector databases provide additional scalability and performance optimizations for larger workloads.

What is the difference between HNSW and IVFFlat in pgvector?
HNSW is optimized for fast query performance and uses an in-memory graph structure, which requires more memory. IVFFlat has a lower memory footprint and organizes vectors into clusters through a training step, but performance can vary depending on the dataset and workload. The choice depends on whether speed or memory efficiency is the priority.

How many vectors can pgvector handle?
pgvector can typically handle millions to tens of millions of vectors, depending on hardware, indexing strategy and query patterns. As datasets grow, performance may decline without careful tuning or additional tooling. Factors such as available memory, index type and query frequency all influence scalability.

Does pgvector support cosine similarity?
Yes, pgvector supports cosine similarity as one of its primary distance metrics. It measures how closely two vectors point in the same direction, which often reflects semantic similarity in embedding-based applications. This makes it well suited for semantic search, recommendation systems and natural language processing.

Is pgvector free and open source?
Yes, pgvector is an open-source project released under a permissive license. It can be used with standard Postgres installations as well as many managed Postgres services. This makes it an accessible starting point for adding vector search capabilities.

Can pgvector do hybrid search?
Yes, pgvector can be combined with Postgres full-text search to support hybrid search. This allows results to balance semantic relevance with keyword matching, improving both accuracy and usability. Hybrid search is especially useful in scenarios such as product search and documentation search, where both meaning and exact terms are important.
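
A minimal hybrid query sketch, using full-text search as a keyword filter and vector distance for ranking (the docs table, columns and values are illustrative; production systems often blend the two scores instead, e.g. with reciprocal rank fusion):

```sql
-- Keyword filter via Postgres full-text search,
-- semantic ranking via pgvector
SELECT title
FROM docs
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', 'index tuning')
ORDER BY embedding <=> '[0.3, 0.3, 0.4]'
LIMIT 10;
```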

Choosing the right vector search approach

pgvector is a practical starting point for any team that wants to add vector search to an existing Postgres application. By storing embeddings alongside relational data and supporting similarity search natively within the database, it removes the operational overhead of managing a separate vector store. For many workloads — semantic search, RAG pipelines, recommendations and anomaly detection — it delivers what teams need without requiring a new system.

As data volumes grow or query demands increase, pgvectorscale can extend how far teams go before a dedicated vector database becomes necessary. For organizations managing large-scale AI workloads across a unified data platform, Databricks Mosaic AI Vector Search offers a complementary approach designed for the lakehouse layer. Together, these tools give teams the flexibility to match their vector search infrastructure to their actual workload requirements — at any scale.
 
