Vector Databases for Product Search: pgvector, Pinecone, and Weaviate Compared
Choosing a vector database for your product search system? Practical comparison of pgvector (self-hosted), Pinecone (managed), and Weaviate (open source). Trade-offs, benchmarks, and when to use each.
A vector database is the retrieval layer of a RAG system — it stores embeddings of your documents and retrieves the most relevant documents for a given query. The choice of vector database affects latency, cost, operational complexity, and what advanced features (filtering, re-ranking) are available to you.
This article evaluates three popular options for B2B product search, with real-world trade-offs and implementation guidance.
Quick Comparison Table
| Factor | pgvector | Pinecone | Weaviate |
|---|---|---|---|
| Type | Self-hosted (PostgreSQL ext) | Managed service | Open source / managed |
| Cost Model | Infra only | Per-million queries + storage | Infra + operational overhead |
| Setup Time | 1–2 hours | 5 minutes | 2–4 hours (self-hosted) |
| Latency | <50ms (p50) | 50–150ms (p50) | 20–100ms (p50) |
| Scaling | Manual (add PostgreSQL replicas) | Automatic | Manual (Kubernetes) |
| Metadata Filtering | Full SQL support | Scoped to metadata fields | GraphQL filters |
| Hybrid Search | tsvector full-text (no native BM25) | Sparse-dense vectors (client-side BM25 encoding) | Hybrid queries native |
| Best For | Small–medium catalogs, PostgreSQL users | Managed simplicity, high traffic | Advanced filtering, custom deployments |
pgvector: Self-Hosted, Postgres-Native
pgvector is an extension for PostgreSQL that adds a vector data type and approximate nearest-neighbor (ANN) search via IVFFlat and HNSW indexes. If you're already running Postgres for your application database, adding pgvector is straightforward.
Setup and Integration
```sql
-- Install pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create embeddings table
CREATE TABLE product_embeddings (
    id SERIAL PRIMARY KEY,
    product_id VARCHAR(255) UNIQUE NOT NULL,
    embedding vector(1536),  -- For text-embedding-3-small
    chunk_content TEXT,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Create index for ANN search (HNSW)
CREATE INDEX ON product_embeddings USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Query:

```sql
-- $1 = query embedding parameter
SELECT
    product_id,
    chunk_content,
    1 - (embedding <=> $1) AS similarity
FROM product_embeddings
WHERE metadata->>'category' = 'valves'  -- Metadata filtering
ORDER BY embedding <=> $1               -- Cosine distance
LIMIT 10;
```

Advantages
Cost: No per-query fees. You pay for your Postgres infrastructure, period. For a product catalog with 100K products generating 10K queries/day, that can mean saving hundreds to thousands of dollars per month versus a managed service.
Metadata Filtering: Full SQL expressiveness. Filter on nested JSON, date ranges, numeric comparisons — anything SQL can express.
Hybrid Search: Combine vector similarity with PostgreSQL's full-text search (using tsvector) in a single query. BM25 scoring is achievable with additional extensions.
```sql
-- Hybrid search: semantic + full-text ($1 = query embedding, $2 = query text)
SELECT
    product_id,
    COALESCE(v.similarity, 0) * 0.7 + COALESCE(f.rank, 0) * 0.3 AS combined_score
FROM (
    SELECT product_id, 1 - (embedding <=> $1) AS similarity
    FROM product_embeddings
    ORDER BY embedding <=> $1  -- take the 100 nearest neighbors, not arbitrary rows
    LIMIT 100
) v
LEFT JOIN (
    SELECT product_id, ts_rank(fts_index, plainto_tsquery($2)) AS rank
    FROM product_fts
    WHERE fts_index @@ plainto_tsquery($2)
) f USING (product_id)
ORDER BY combined_score DESC
LIMIT 10;
```

Operational Control: Data stays in your infrastructure. No vendor lock-in, no API rate limits, no network calls for every query.
Disadvantages
Scaling Complexity: pgvector works well for catalogs up to ~10M vectors on a single Postgres instance (depending on hardware). Beyond that, you need sharding or read replicas, which adds operational complexity. Pinecone handles this transparently.
Latency: A single Postgres instance degrades under concurrent load sooner than a managed service. A well-tuned pgvector setup can answer in under 50ms at p50, but tail latency under heavy concurrency is less predictable than Pinecone's.
Maintenance Burden: You're responsible for backups, upgrades, security patches, replication setup, and disaster recovery. Pinecone handles all of this.
ANN Quality Trade-offs: HNSW indexes in pgvector have tunable parameters (m and ef_construction at build time, hnsw.ef_search at query time). Tuning these for your workload takes empirical testing. Pinecone abstracts this away.
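The query-time knob is the cheapest place to start, since it needs no index rebuild. A minimal sketch (pgvector 0.5.0+; the value shown is illustrative, not a recommendation):

```sql
-- Trade recall for latency at query time (session-level setting)
SET hnsw.ef_search = 100;  -- default is 40; higher = better recall, slower queries
```

Measure recall against an exact-search baseline at a few ef_search values before touching the build-time parameters.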
When to Use pgvector
- Existing Postgres users: If your application already runs on Postgres, adding pgvector to the same database is minimal friction.
- Small–medium catalogs: Up to ~5–10 million vectors, depending on query QPS.
- Cost-sensitive: You need to handle millions of queries and can't justify Pinecone's per-query pricing.
- Custom filtering: Your retrieval logic requires complex SQL filters that don't map cleanly to Pinecone's metadata field syntax.
Operational Checklist
- [ ] PostgreSQL 13+ (required by current pgvector releases; HNSW needs pgvector 0.5.0+)
- [ ] Install pgvector extension
- [ ] Plan embedding dimension (1536 for text-embedding-3-small)
- [ ] Create HNSW index with tuned m and ef_construction
- [ ] Implement batch ingestion for embeddings
- [ ] Set up connection pooling (PgBouncer) for high-concurrency reads
- [ ] Monitor query latency, set up alerting for slow queries
- [ ] Plan replication/backup strategy
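The batch-ingestion item above mostly comes down to splitting upserts into fixed-size groups so transactions stay short and memory stays bounded. A minimal sketch (the `batched` helper and the row shape are assumptions for illustration; with psycopg2 you would hand each batch to `psycopg2.extras.execute_values`):

```python
# Batching sketch for embedding ingestion into pgvector.
# Row shape assumed: (product_id, embedding, chunk_content).
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, List[float], str]

def batched(rows: Iterable[Row], size: int = 500) -> Iterator[List[Row]]:
    """Yield successive batches of at most `size` rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Example: 1,250 rows -> batches of 500, 500, 250
rows = [(f"p{i}", [0.0] * 1536, "chunk") for i in range(1250)]
sizes = [len(b) for b in batched(rows, 500)]
print(sizes)  # [500, 500, 250]
```

Batch size is a trade-off: larger batches mean fewer round trips but longer transactions; a few hundred rows per statement is a common starting point.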
Pinecone: Managed Vector Database
Pinecone is a cloud-hosted vector database built from the ground up for semantic search. You create an index, send embeddings, and query it. Scaling and maintenance are handled.
Setup and Integration
```python
from pinecone import Pinecone, ServerlessSpec

# Initialize (current client; the older pinecone.init() API is deprecated)
pc = Pinecone(api_key="your-api-key")

# Create a serverless index (serverless indexes all metadata fields by default,
# so no metadata_config is needed)
pc.create_index(
    name="product-catalog",
    dimension=1536,  # For text-embedding-3-small
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

# Upsert vectors (embedding_1, embedding_2: lists of 1536 floats from your model)
index = pc.Index("product-catalog")
index.upsert(vectors=[
    ("product-1-chunk-0", embedding_1, {"product_id": "1", "category": "valves"}),
    ("product-1-chunk-1", embedding_2, {"product_id": "1", "category": "valves"}),
])

# Query
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"category": {"$eq": "valves"}},  # Metadata filter
    include_metadata=True,
)
```

Advantages
Operational Simplicity: No infrastructure to manage. You create an index, configure it, and queries work. Pinecone handles replication, failover, and scaling.
Scaling: Pinecone automatically handles millions of vectors and thousands of QPS. You don't think about sharding or replication.
Built-in Hybrid Search: Pinecone supports sparse-dense queries: you supply a BM25-style sparse vector alongside the dense vector (encoded client-side, for example with the pinecone-text library), and Pinecone merges the scores in a single request.
```python
# Hybrid search: Pinecone's query API has no alpha parameter; weight the dense
# and sparse vectors client-side before querying. The sparse vector comes from a
# BM25 encoder (e.g. pinecone-text's BM25Encoder).
def hybrid_scale(dense, sparse, alpha):
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    scaled_dense = [v * alpha for v in dense]
    return scaled_dense, scaled_sparse

dense_q, sparse_q = hybrid_scale(query_embedding, bm25_sparse_vector, alpha=0.7)  # 70% dense
results = index.query(vector=dense_q, sparse_vector=sparse_q, top_k=10, include_metadata=True)
```

Query Speed: Pinecone achieves impressive latency (p95: <100ms) through highly optimized infrastructure.
Disadvantages
Cost: Pinecone's pricing is usage-based (read/write units and storage on serverless plans) rather than a flat per-query fee. A catalog with 100K products fielding 10K queries/day runs roughly 300K queries per month; depending on index size and plan, that commonly lands in the hundreds to low thousands of dollars per month. At scale, this becomes expensive.
Metadata Filtering Limits: You can filter on metadata fields, but the filtering syntax is restrictive compared to SQL. Deeply nested boolean logic is awkward, and anything requiring a join against other data is impossible.
No Pure Full-Text Search: If you want to run lexical-only queries without vectors, Pinecone isn't the right tool; a dedicated full-text engine such as Elasticsearch is a better fit for that workload.
Vendor Lock-in: Your embeddings are in Pinecone's indexes. Exporting and migrating to another vector database is non-trivial.
When to Use Pinecone
- Managed simplicity: You don't want to operate Postgres or Kubernetes.
- High-traffic, predictable QPS: Pinecone shines at 1,000+ QPS with consistent latency.
- Hybrid search out of the box: If you want vector + BM25 in a single tool.
- Early-stage product: You're more concerned with speed to market than cost optimization.
Weaviate: Open Source + Managed Hybrid
Weaviate is an open-source vector database written in Go, available both as a self-hosted deployment (Kubernetes) and as a managed cloud service. It's known for strong hybrid search support.
Setup and Integration (Managed Cloud)
```python
import weaviate
from weaviate.auth import AuthApiKey
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.query import Filter

# Connect to cloud instance (v4 Python client)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-instance.weaviate.network",
    auth_credentials=AuthApiKey("your-api-key"),
)

# Define the collection; text properties are BM25-indexed, enabling hybrid search
client.collections.create(
    name="ProductChunk",
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="product_id", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
        Property(name="price", data_type=DataType.NUMBER),
    ],
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),  # or bring your own vectors
)

# Query with hybrid search
results = client.collections.get("ProductChunk").query.hybrid(
    query="valve for high pressure applications",
    filters=Filter.by_property("category").equal("valves"),
    alpha=0.7,  # 70% semantic, 30% full-text
    limit=10,
).objects
```

Advantages
Hybrid Search Native: Weaviate's hybrid queries combine BM25 and vector search in a single retrieval operation, with configurable weighting (alpha parameter).
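Weaviate offers more than one fusion strategy; the following pure-Python sketch mirrors the spirit of relative-score fusion (it is an illustration, not Weaviate's actual implementation): min-max normalize each score list to [0, 1], then blend with alpha.

```python
# Illustrative alpha-weighted hybrid fusion: alpha weights the vector side,
# (1 - alpha) the BM25 side; documents missing from a list contribute 0.
def normalize(scores):
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_fuse(vector_scores, bm25_scores, alpha=0.7):
    v, b = normalize(vector_scores), normalize(bm25_scores)
    docs = set(v) | set(b)
    return sorted(
        ((doc, alpha * v.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)) for doc in docs),
        key=lambda x: -x[1],
    )

# "a" wins on semantic similarity; "b" wins on BM25 but alpha=0.7 favors vectors
fused = hybrid_fuse({"a": 0.9, "b": 0.2}, {"b": 12.0, "c": 3.0}, alpha=0.7)
print(fused[0][0])  # "a"
```

Because raw BM25 and cosine scores live on different scales, the normalization step is what makes the alpha weighting meaningful.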
GraphQL Interface: Weaviate exposes a GraphQL API for queries, which is more expressive than a plain REST API for complex filtering and field selection; the newer Python client wraps this in typed methods.
Self-Hosted Option: Run it on your own Kubernetes cluster if you need full control. This eliminates vendor lock-in concerns and can be cost-effective at scale.
Open Source: The codebase is available, so you can audit it, contribute, or maintain a fork if needed.
Disadvantages
Operational Complexity (Self-Hosted): Running Weaviate on Kubernetes requires Kubernetes expertise. Managed Weaviate Cloud is simpler but less common than Pinecone.
Smaller Ecosystem: Fewer pre-built integrations and fewer examples than Pinecone or pgvector.
Latency: Weaviate tends to have slightly higher latency than optimized pgvector setups, though lower than untuned Postgres deployments.
Documentation: While good, not as polished as Pinecone's.
When to Use Weaviate
- Hybrid search is core: You want BM25 + vector search as a first-class feature.
- Self-hosted preference: You want to run it in your own infrastructure on Kubernetes.
- GraphQL preference: Your team is comfortable with GraphQL and wants that interface.
- Open source: You value having the source code accessible.
Practical Decision Framework
Start here if you...
| Scenario | Choice |
|---|---|
| Run PostgreSQL for your app, prefer single-system operations | pgvector |
| Need minimum time-to-market, willing to pay per query | Pinecone |
| Hybrid search is a must-have, want self-hosting option | Weaviate |
| Expect 10K+ queries/day, cost-conscious | pgvector |
| Expect 1,000+ concurrent users, low operational tolerance | Pinecone |
| Building an internal tool with complex filtering | pgvector |
Implementation Reality Check
Most production deployments actually combine multiple systems:
Common Architecture: Use Pinecone for the primary product search (fast, managed, reliable) + maintain PostgreSQL as the source of truth with pgvector as a secondary index for analytics/debugging. Cost trade-off: Pinecone for user-facing queries, pgvector for internal systems.
Advanced Setup: Elasticsearch for BM25 + pgvector for vector search + application-level merge (RRF) = maximum control, maximum complexity.
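The application-level RRF merge mentioned above fits in a few lines. A minimal sketch (`rrf_merge` is a hypothetical helper; k=60 is the conventional constant from the original RRF formulation):

```python
# Reciprocal Rank Fusion: merge ranked ID lists from BM25 and vector search.
# score(d) = sum over lists of 1 / (k + rank_of_d)
def rrf_merge(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["p1", "p2", "p3"]
vect = ["p2", "p4", "p1"]
print(rrf_merge([bm25, vect]))  # p2 and p1 appear in both lists, so they rank first
```

RRF only needs ranks, not scores, which is exactly why it works across engines (Elasticsearch and pgvector here) whose raw scores are not comparable.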
For most B2B product catalogs with 50K–500K products, a single vector database (Pinecone or pgvector) is sufficient. The choice is about operational preference and cost tolerance, not capability.
Turn your product catalog into an AI knowledge base
Axoverna ingests your product data, builds a semantic search index, and gives you an embeddable chat widget — in minutes, not months.