Vector Databases for Product Search: pgvector, Pinecone, and Weaviate Compared

Choosing a vector database for your product search system? Practical comparison of pgvector (self-hosted), Pinecone (managed), and Weaviate (open source). Trade-offs, benchmarks, and when to use each.

Axoverna Team
9 min read

A vector database is the retrieval layer of a RAG system — it stores embeddings of your documents and retrieves the most relevant documents for a given query. The choice of vector database affects latency, cost, operational complexity, and what advanced features (filtering, re-ranking) are available to you.

This article evaluates three popular options for B2B product search, with real-world trade-offs and implementation guidance.

Quick Comparison Table

| Factor | pgvector | Pinecone | Weaviate |
|---|---|---|---|
| Type | Self-hosted (PostgreSQL extension) | Managed service | Open source / managed |
| Cost Model | Infra only | Usage-based (queries + storage) | Infra + operational overhead |
| Setup Time | 1–2 hours | 5 minutes | 2–4 hours (self-hosted) |
| Latency | <50ms (p50) | 50–150ms (p50) | 20–100ms (p50) |
| Scaling | Manual (add PostgreSQL replicas) | Automatic | Manual (Kubernetes) |
| Metadata Filtering | Full SQL support | Scoped to metadata fields | GraphQL filters |
| Hybrid Search | Full-text via tsvector (no native BM25) | Sparse-dense vectors | Hybrid queries native |
| Best For | Small–medium catalogs, PostgreSQL users | Managed simplicity, high traffic | Advanced filtering, custom deployments |

pgvector: Self-Hosted, Postgres-Native

pgvector is an extension for PostgreSQL that adds a vector data type and approximate nearest-neighbor (ANN) search via IVFFlat and HNSW indexes. If you're already running Postgres for your application database, adding pgvector is straightforward.

Setup and Integration

-- Install pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
 
-- Create embeddings table
CREATE TABLE product_embeddings (
  id SERIAL PRIMARY KEY,
  product_id VARCHAR(255) UNIQUE NOT NULL,
  embedding vector(1536),  -- For text-embedding-3-small
  chunk_content TEXT,
  metadata JSONB,
  created_at TIMESTAMP DEFAULT NOW()
);
 
-- Create index for ANN search (HNSW)
CREATE INDEX ON product_embeddings USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

Query:

SELECT 
  product_id,
  chunk_content,
  1 - (embedding <=> query_embedding) as similarity
FROM product_embeddings
WHERE metadata->>'category' = 'valves'  -- Metadata filtering
ORDER BY embedding <=> query_embedding  -- Cosine distance
LIMIT 10;
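The `<=>` operator returns cosine distance, so `1 - (embedding <=> query_embedding)` recovers cosine similarity. As a sanity check, the same computation in plain Python (toy 2-d vectors for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity; pgvector's `a <=> b` equals 1 minus this value."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```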

Advantages

Cost: No per-query fees. You pay for your Postgres infrastructure, period. For a product catalog with 100K products generating 10K queries/day, that fixed cost typically sits well below what usage-based pricing would run each month.

Metadata Filtering: Full SQL expressiveness. Filter on nested JSON, date ranges, numeric comparisons — anything SQL can express.

Hybrid Search: Combine vector similarity with PostgreSQL's full-text search (using tsvector) in a single query. BM25 scoring is achievable with additional extensions.

-- Hybrid search: semantic + full-text
-- $1 = query embedding, $2 = user's text query
SELECT 
  product_id,
  COALESCE(v.similarity, 0) * 0.7 + COALESCE(f.ts_rank, 0) * 0.3 as combined_score
FROM (
  SELECT product_id, 1 - (embedding <=> $1) as similarity
  FROM product_embeddings
  ORDER BY embedding <=> $1  -- without this ORDER BY, LIMIT returns arbitrary rows
  LIMIT 100
) v
LEFT JOIN (
  SELECT product_id, ts_rank(fts_index, plainto_tsquery($2)) as ts_rank
  FROM product_fts
  WHERE fts_index @@ plainto_tsquery($2)
) f USING (product_id)
ORDER BY combined_score DESC
LIMIT 10;

Operational Control: Data stays in your infrastructure. No vendor lock-in, no API rate limits, no network calls for every query.

Disadvantages

Scaling Complexity: pgvector works well for catalogs up to ~10M vectors on a single Postgres instance (depending on hardware). Beyond that, you need sharding or read replicas, which adds operational complexity. Pinecone handles this transparently.

Latency: A single Postgres instance hits latency limits sooner than a managed service. A well-tuned setup delivers sub-50ms p50 retrieval, but tail latency degrades less predictably than Pinecone's once many concurrent queries compete for the same instance.

Maintenance Burden: You're responsible for backups, upgrades, security patches, replication setup, and disaster recovery. Pinecone handles all of this.

ANN Quality Trade-offs: pgvector's HNSW and IVFFlat indexes have tunable parameters (m and ef_construction for HNSW; lists and probes for IVFFlat). Tuning these for your workload takes empirical testing. Pinecone abstracts this away.
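One way to ground that empirical testing: run a sample of queries twice — once against the index, once with exact search (no index, plain `ORDER BY embedding <=> q`) as ground truth — and measure recall@k. A minimal helper (function name and IDs are illustrative):

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k (from exact search) that the ANN index returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# Example: the HNSW-backed query missed one of the true top-5 neighbors
approx = ["p1", "p2", "p4", "p7", "p9"]  # from the indexed query
exact = ["p1", "p2", "p3", "p7", "p9"]   # from exact search
print(recall_at_k(approx, exact, k=5))   # 0.8
```

If recall is too low, raise ef_construction (or the query-time `hnsw.ef_search` setting) and re-measure; higher values trade latency for recall.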

When to Use pgvector

  • Existing Postgres users: If your application already runs on Postgres, adding pgvector to the same database is minimal friction.
  • Small–medium catalogs: Up to ~5–10 million vectors, depending on query QPS.
  • Cost-sensitive: You need to handle millions of queries and can't justify Pinecone's per-query pricing.
  • Custom filtering: Your retrieval logic requires complex SQL filters that don't map cleanly to Pinecone's metadata field syntax.

Operational Checklist

- [ ] A supported PostgreSQL version (recent pgvector releases require Postgres 12+; HNSW indexes need pgvector 0.5.0+)
- [ ] Install pgvector extension
- [ ] Plan embedding dimension (1536 for text-embedding-3-small)
- [ ] Create HNSW index with tuned m and ef_construction
- [ ] Implement batch ingestion for embeddings
- [ ] Set up connection pooling (PgBouncer) for high-concurrency reads
- [ ] Monitor query latency, set up alerting for slow queries
- [ ] Plan replication/backup strategy
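The batch-ingestion item above is mostly a chunking problem — inserting embeddings one row at a time is far too slow for a 100K-product catalog. A sketch of the batching half (the actual insert, which needs a live database, is shown only as a comment; the psycopg2 call is one common choice):

```python
from itertools import islice

def batches(rows, size=500):
    """Yield successive fixed-size batches from an iterable of rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

# Each row: (product_id, embedding, chunk_content, metadata_json)
rows = [(f"prod-{i}", [0.0] * 1536, f"chunk {i}", "{}") for i in range(1200)]
for batch in batches(rows, size=500):
    # With a live connection: psycopg2.extras.execute_values(
    #     cur,
    #     "INSERT INTO product_embeddings (product_id, embedding, chunk_content, metadata) VALUES %s",
    #     batch)
    pass
```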

Pinecone: Managed Vector Database

Pinecone is a cloud-hosted vector database built from the ground up for semantic search. You create an index, send embeddings, and query it. Scaling and maintenance are handled.

Setup and Integration

from pinecone import Pinecone, ServerlessSpec
 
# Initialize (v3+ client)
pc = Pinecone(api_key="your-api-key")
 
# Create index (serverless indexes make all metadata fields filterable automatically)
pc.create_index(
    name="product-catalog",
    dimension=1536,              # For text-embedding-3-small
    metric="cosine",
    spec=ServerlessSpec(cloud="gcp", region="us-west1")
)
 
# Upsert vectors
index = pc.Index("product-catalog")
index.upsert(vectors=[
    ("product-1-chunk-0", embedding_1, {"product_id": "1", "category": "valves"}),
    ("product-1-chunk-1", embedding_2, {"product_id": "1", "category": "valves"}),
])
 
# Query
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"category": {"$eq": "valves"}},  # Metadata filter
    include_metadata=True
)

Advantages

Operational Simplicity: No infrastructure to manage. You create an index, configure it, and queries work. Pinecone handles replication, failover, and scaling.

Scaling: Pinecone automatically handles millions of vectors and thousands of QPS. You don't think about sharding or replication.

Built-in Hybrid Search: Pinecone's sparse-dense indexes combine BM25-style lexical scoring (supplied as sparse vectors) with dense vector search in a single query, handling the merge and ranking for you.

# Hybrid search: weighting is applied client-side by scaling the two vectors.
# The sparse query vector comes from a lexical encoder, e.g. pinecone-text's BM25Encoder.
alpha = 0.7  # 70% dense (semantic), 30% sparse (lexical)
results = index.query(
    vector=[v * alpha for v in query_embedding],
    sparse_vector={
        "indices": sparse_query["indices"],
        "values": [v * (1 - alpha) for v in sparse_query["values"]],
    },
    top_k=10
)

Query Speed: Pinecone achieves impressive latency (p95: <100ms) through highly optimized infrastructure.

Disadvantages

Cost: Pinecone bills for reads, writes, and storage rather than a flat infrastructure fee, so spend scales directly with traffic. A product catalog with 100K products fielding 10K queries/day generates roughly 300K billable queries per month, plus writes for every re-index and storage for every embedding. At scale, this becomes expensive relative to self-hosted infrastructure.

Metadata Filtering Limits: You can filter on metadata fields, but the filter DSL is restrictive compared to SQL — equality, ranges, set membership, and boolean composition are covered, while anything beyond that (joins, computed predicates, nested JSON) is awkward or impossible.
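For reference, the boolean composition that is expressible looks like this (field names are illustrative); everything beyond this shape has to happen in application code:

```python
# Pinecone-style metadata filter:
# category IN (valves, fittings) AND (price_range = low OR supplier = acme)
pinecone_filter = {
    "$and": [
        {"category": {"$in": ["valves", "fittings"]}},
        {"$or": [
            {"price_range": {"$eq": "low"}},
            {"supplier": {"$eq": "acme"}},
        ]},
    ]
}
# The equivalent in pgvector is a one-line SQL WHERE clause.
```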

No Pure Full-Text Search: If you want keyword-only queries without vectors, Pinecone isn't the right tool. Sparse vectors cover hybrid ranking, but a dedicated lexical engine (Elasticsearch, OpenSearch) serves pure full-text workloads better.

Vendor Lock-in: Your embeddings are in Pinecone's indexes. Exporting and migrating to another vector database is non-trivial.

When to Use Pinecone

  • Managed simplicity: You don't want to operate Postgres or Kubernetes.
  • High-traffic, predictable QPS: Pinecone shines at 1,000+ QPS with consistent latency.
  • Hybrid search out of the box: If you want vector + BM25 in a single tool.
  • Early-stage product: You're more concerned with speed to market than cost optimization.

Weaviate: Open Source + Managed Hybrid

Weaviate is an open-source vector database written in Go, available both as a self-hosted deployment (Kubernetes) and as a managed cloud service. It's known for strong hybrid search support.

Setup and Integration (Managed Cloud)

import weaviate
from weaviate.auth import AuthApiKey
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.query import Filter
 
# Connect to cloud instance (v4 Python client)
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-instance.weaviate.network",
    auth_credentials=AuthApiKey("your-api-key"),
)
 
# Define collection; text2vec-openai vectorizes content (or bring your own vectors)
client.collections.create(
    name="ProductChunk",
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="product_id", data_type=DataType.TEXT),
        Property(name="category", data_type=DataType.TEXT),
        Property(name="price", data_type=DataType.NUMBER),
    ],
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
)
 
# Query with hybrid search (BM25 + vector, weighted by alpha)
results = client.collections.get("ProductChunk").query.hybrid(
    query="valve for high pressure applications",
    filters=Filter.by_property("category").equal("valves"),
    alpha=0.7,  # 70% semantic, 30% full-text
    limit=10,
).objects

Advantages

Hybrid Search Native: Weaviate's hybrid queries combine BM25 and vector search in a single retrieval operation, with configurable weighting (alpha parameter).

GraphQL Interface: Weaviate exposes a GraphQL API (alongside REST and, in newer versions, gRPC), which is more expressive than a plain REST API for complex filtering and field selection.

Self-Hosted Option: Run it on your own Kubernetes cluster if you need full control. This eliminates vendor lock-in concerns and can be cost-effective at scale.
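If you want to evaluate self-hosting before committing to Kubernetes, a single-node instance runs under Docker Compose. A minimal sketch — image tag and settings are illustrative, not production-hardened:

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:1.25.0
    ports:
      - "8080:8080"    # REST / GraphQL
      - "50051:50051"  # gRPC (used by the v4 Python client)
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"  # local evaluation only
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      DEFAULT_VECTORIZER_MODULE: "none"  # bring your own embeddings
    volumes:
      - weaviate-data:/var/lib/weaviate
volumes:
  weaviate-data:
```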

Open Source: The codebase is available, so you can audit it, contribute, or maintain a fork if needed.

Disadvantages

Operational Complexity (Self-Hosted): Running Weaviate on Kubernetes requires Kubernetes expertise. Managed Weaviate Cloud is simpler but less common than Pinecone.

Smaller Ecosystem: Fewer pre-built integrations and fewer examples than Pinecone or pgvector.

Latency: Weaviate tends to have slightly higher latency than optimized pgvector setups, though lower than untuned Postgres deployments.

Documentation: While good, not as polished as Pinecone's.

When to Use Weaviate

  • Hybrid search is core: You want BM25 + vector search as a first-class feature.
  • Self-hosted preference: You want to run it in your own infrastructure on Kubernetes.
  • GraphQL preference: Your team is comfortable with GraphQL and wants that interface.
  • Open source: You value having the source code accessible.

Practical Decision Framework

Start here if you...

| Scenario | Choice |
|---|---|
| Run PostgreSQL for your app, prefer single-system operations | pgvector |
| Need minimum time-to-market, willing to pay per query | Pinecone |
| Hybrid search is a must-have, want self-hosting option | Weaviate |
| Expect 10K+ queries/day, cost-conscious | pgvector |
| Expect 1,000+ concurrent users, low operational tolerance | Pinecone |
| Building an internal tool with complex filtering | pgvector |

Implementation Reality Check

Most production deployments actually combine multiple systems:

Common Architecture: Use Pinecone for the primary product search (fast, managed, reliable) + maintain PostgreSQL as the source of truth with pgvector as a secondary index for analytics/debugging. Cost trade-off: Pinecone for user-facing queries, pgvector for internal systems.

Advanced Setup: Elasticsearch for BM25 + pgvector for vector search + application-level merge (RRF) = maximum control, maximum complexity.
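The application-level RRF merge is small enough to sketch inline. Reciprocal Rank Fusion scores each document by summing 1/(k + rank) over every ranked list it appears in; k = 60 is the conventional constant, and IDs here are illustrative:

```python
def rrf_merge(rankings, k=60, top_n=10):
    """Reciprocal Rank Fusion: merge several ranked ID lists into one.

    A document's fused score is the sum of 1 / (k + rank) over the
    rankings that contain it (rank counted from 1, best first).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

bm25_hits = ["p3", "p1", "p7", "p2"]    # IDs from the lexical engine, best first
vector_hits = ["p1", "p2", "p3", "p9"]  # IDs from the vector index, best first
print(rrf_merge([bm25_hits, vector_hits]))  # ['p1', 'p3', 'p2', 'p7', 'p9']
```

Because RRF only uses ranks, it sidesteps the problem of normalizing BM25 scores against cosine similarities — the main reason it's a popular merge strategy.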

For most B2B product catalogs with 50K–500K products, a single vector database (Pinecone or pgvector) is sufficient. The choice is about operational preference and cost tolerance, not capability.

Axoverna abstracts the vector database complexity → Use semantic search without choosing your infrastructure

Ready to get started?

Turn your product catalog into an AI knowledge base

Axoverna ingests your product data, builds a semantic search index, and gives you an embeddable chat widget — in minutes, not months.