Metadata Filtering in RAG: When to Filter, When to Embed, and Why the Difference Matters

Embedding product attributes like price, category, and stock status into vectors is a common mistake that quietly destroys retrieval precision. Here's how to design a hybrid structured-unstructured retrieval architecture that handles both dimensions correctly.

Axoverna Team
14 min read

There's a mistake quietly degrading the retrieval quality of many B2B product AI systems, and it's subtle enough that it often goes unnoticed until someone looks at the actual retrieval logs.

The mistake: treating structured product attributes the same way you treat unstructured product descriptions.

A product's dimensions, price tier, category hierarchy, stock status, and compliance certifications are structured data — discrete values with defined semantics. A product's features paragraph, application notes, and technical explanation are unstructured data — prose that carries meaning through language.

Embedding both into the same vector space and hoping the model figures it out is like running your invoicing out of a word processor. The tool technically holds the data, but it's working against its own design.

This article is a practical guide to metadata filtering in RAG: what it is, when to use it versus embedding-based retrieval, how to implement it in a product catalog context, and the specific failure modes you'll avoid by getting this right.


The Problem With Embedding Everything

Before getting into solutions, it's worth being precise about what breaks when you embed structured attributes.

Consider a buyer query: "Show me stainless steel compression fittings rated for at least 150 bar, available in stock."

This query has two distinct components:

  1. Semantic content: stainless steel compression fittings — meaning-rich, benefits from vector similarity
  2. Structured constraints: rated for at least 150 bar, available in stock — boolean and numeric logic, not semantic

When you run this purely through vector search, several things go wrong:

Problem 1: Numeric comparisons don't embed cleanly. Embedding models learn from co-occurrence patterns in text. "150 bar" and "200 bar" are semantically very similar — they're both pressure ratings in the same domain. A vector search for "at least 150 bar" will happily return products rated 80 bar because the description text is otherwise similar. The model has no concept of greater than.

Problem 2: Boolean attributes pollute the semantic space. If "in stock" is embedded as part of a chunk, items without that phrase might be deprioritized even when they're perfectly relevant. Worse, stock status changes — and you'd need to re-embed your entire catalog every time availability shifts.

Problem 3: Categorical hierarchies become fuzzy. If a buyer filters by category "Industrial Valves > Ball Valves > 3-Piece", that's a precise hierarchical selection. Vector search treats it as a semantic preference, which means related-but-wrong categories (butterfly valves, gate valves) bleed into results even when the buyer wanted something exact.

Problem 4: High-cardinality facets overwhelm the embedding. A product chunk that includes SKU, manufacturer code, compliance certifications, lead time, and minimum order quantity has most of its token budget consumed by structured metadata — leaving less signal for the semantic content that actually drives retrieval quality.

The short version: vectors are for meaning; filters are for facts. Using vectors for facts is the wrong tool for the job.


What Metadata Filtering Actually Is

Metadata filtering (sometimes called pre-filtering or filtered vector search) means attaching structured attribute fields to each indexed chunk and using those fields as hard constraints before or during the vector similarity ranking.

Most modern vector databases — Pinecone, Qdrant, Weaviate, pgvector, Milvus — support this natively. The query becomes: find the top-K most semantically similar chunks from the subset that matches these structured conditions.

Here's the conceptual anatomy of a filtered RAG query:

Vector Query: "compression fittings for high-pressure hydraulic lines"

Metadata Filters:
  category = "Compression Fittings"
  material IN ["304 SS", "316 SS", "Stainless Steel"]
  pressure_rating_bar >= 150
  in_stock = true

Result: Top-5 semantically relevant chunks from products that pass ALL filters

The vector search does what it's good at (semantic matching). The structured filters do what they're good at (exact constraints). Neither is asked to do the other's job.
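The two-stage flow can be sketched as a small in-memory function. This is a minimal illustration of the concept, not a vector database client — real systems push the filters into the index query itself — and the `Chunk` shape and filter values mirror the example above:

```typescript
type Chunk = {
  id: string;
  embedding: number[];
  metadata: {
    category: string;
    material: string[];
    pressure_rating_bar: number;
    in_stock: boolean;
  };
};

// Stage 1 — hard constraints: only chunks passing ALL filters are candidates.
function passesFilters(c: Chunk): boolean {
  return (
    c.metadata.category === "Compression Fittings" &&
    c.metadata.material.some((m) =>
      ["304 SS", "316 SS", "Stainless Steel"].includes(m)
    ) &&
    c.metadata.pressure_rating_bar >= 150 &&
    c.metadata.in_stock
  );
}

// Stage 2 — semantic ranking: cosine similarity against the query vector.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function filteredSearch(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .filter(passesFilters)                        // exact constraints
    .map((c) => ({ c, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)            // semantic relevance
    .slice(0, k)
    .map((x) => x.c);
}
```

Note the order: filtering restricts the candidate set before ranking, so a brass fitting never outranks a stainless one no matter how similar its description reads.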


Designing Your Metadata Schema

The first practical challenge is deciding which attributes belong in metadata versus which belong in the embedded text. Here's a framework.

Attributes that should always be metadata (never embedded)

Numeric ranges with comparative logic:

  • Pressure ratings, temperature ratings, voltage, current, load capacity
  • Dimensions (length, diameter, thread pitch)
  • Price and price tiers
  • Minimum order quantities
  • Lead times

Boolean/categorical states:

  • In stock / out of stock / discontinued
  • Hazardous material flag
  • RoHS compliant, REACH compliant, FDA approved
  • Made in EU / country of origin

Hierarchical taxonomy:

  • Category paths (e.g., ["Fasteners", "Bolts", "Hex Bolts", "Metric"])
  • Brand / manufacturer
  • Product line or series

High-cardinality identifiers:

  • SKU, part number, EAN/UPC
  • Manufacturer part number (MPN)
  • Supplier codes

Attributes that should be embedded (in the chunk text)

Descriptive features:

  • Application descriptions ("suitable for food-grade processing environments")
  • Material characteristics ("resistant to UV degradation, flexible at low temperatures")
  • Installation notes and compatibility information
  • Benefits and differentiators

Technical narrative:

  • How the product works
  • Why it's designed a certain way
  • Common use cases described in natural language

FAQs and support content:

  • Common questions about the product
  • Troubleshooting notes
  • Comparison notes ("unlike standard brass valves, this...")

The gray zone: attributes that belong in both

Some attributes should appear in both the metadata and the embedded text. Technical specifications that might be queried semantically ("high-pressure fitting") and precisely (">=150 bar") should be in metadata as a structured field and mentioned naturally in the chunk text.

When a buyer asks "what fittings work under high pressure?" they're using semantic language — the text embedding will find relevant chunks. When they ask "which fittings are rated for 150 bar minimum?" they're using a filter — the structured field handles it.

Doubling up here is intentional and correct. The metadata is the source of truth for the constraint; the text is the source for semantic matching.
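In practice, doubling up just means the ingestion step writes the same spec into two places. A minimal sketch of what a gray-zone attribute looks like at index time (field names are assumptions for illustration):

```typescript
const spec = { name: "pressure rating", value: 150, unit: "bar" };

const chunk = {
  // Embedded text: mentions the spec naturally, so semantic queries
  // like "high-pressure fitting" can match it.
  text: `Compression fitting suitable for high-pressure hydraulic lines, rated to ${spec.value} ${spec.unit}.`,
  // Metadata: the same spec as a structured field — the source of truth
  // for exact constraints like pressure_rating_bar >= 150.
  metadata: { pressure_rating_bar: spec.value },
};
```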


Implementing Metadata Filtering: A Practical Architecture

Let's walk through how this looks end-to-end in a B2B product knowledge system.

Step 1: Enrich your chunks with structured payloads

When you ingest a product and split it into chunks (see our guide on document chunking for RAG), attach a metadata payload to each chunk — not just to the product record. This matters because different chunks of the same product may be retrieved in different query contexts, and you want the filters to apply at the chunk level.

interface ProductChunk {
  id: string
  text: string          // what gets embedded
  embedding: number[]   // the vector
  metadata: {
    product_id: string
    sku: string
    category_path: string[]
    brand: string
    material: string[]
    pressure_rating_bar?: number
    temperature_min_c?: number
    temperature_max_c?: number
    in_stock: boolean
    discontinued: boolean
    price_tier: 'economy' | 'standard' | 'premium'
    certifications: string[]
    chunk_type: 'description' | 'specs' | 'applications' | 'faq'
  }
}

Notice chunk_type in the metadata. This lets you filter not just by product attributes, but by the kind of content. A query asking "how do I install this?" should preferentially retrieve applications and faq chunks, not specs chunks.

Step 2: Build a query parser that extracts filter intent

Your system needs to recognize when a user's query contains structured constraints and extract them into filter form. This can be done with:

  • Rule-based extraction: regex or NER for part numbers, known units (bar, PSI, °C), explicit operators ("at least", "minimum", "rated for")
  • LLM-based extraction: a lightweight classification step that parses the query into semantic + structured components
  • Hybrid: LLM for intent classification, rules for value normalization

A simple LLM extraction prompt:

Given the user query, extract any structured constraints:
- Numeric ranges (min/max values with units)  
- Category preferences
- Material requirements
- Stock/availability requirements
- Certification requirements

Query: "I need stainless steel ball valves rated for at least 40 bar, in stock"

Output: {
  "semantic_query": "stainless steel ball valves",
  "filters": {
    "material": ["stainless steel"],
    "category_contains": "ball valves",
    "pressure_rating_bar": {"gte": 40},
    "in_stock": true
  }
}

The semantic_query goes to your embedding model. The filters go to your vector database query. They run together — most vector databases support this natively in a single call.
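For the rule-based path, the extraction can be as small as a couple of regexes. This sketch covers only two patterns from the example query — a pressure minimum in bar and a stock requirement; a production parser would also normalize units (PSI to bar), handle more operators, and cover the remaining constraint types:

```typescript
type ExtractedFilters = {
  pressure_rating_bar?: { gte: number };
  in_stock?: boolean;
};

function extractFilters(query: string): {
  semanticQuery: string;
  filters: ExtractedFilters;
} {
  const filters: ExtractedFilters = {};
  let semantic = query;

  // "at least 150 bar", "minimum 40 bar", "150 bar minimum"
  const pressure = semantic.match(
    /(?:at least|minimum|min\.?)\s+(\d+)\s*bar|(\d+)\s*bar\s+minimum/i
  );
  if (pressure) {
    filters.pressure_rating_bar = { gte: Number(pressure[1] ?? pressure[2]) };
    semantic = semantic.replace(pressure[0], "");
  }

  // "in stock", "available in stock"
  if (/\bin stock\b/i.test(semantic)) {
    filters.in_stock = true;
    semantic = semantic.replace(/(available\s+)?in stock/i, "");
  }

  // What remains goes to the embedding model.
  return { semanticQuery: semantic.replace(/[,\s]+/g, " ").trim(), filters };
}
```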

Step 3: Handle filter failures gracefully

What happens when a buyer asks for something very specific and the filter returns zero results? This is a retrieval dead-end that must be handled explicitly.

Good approaches:

Graceful relaxation: If pressure_rating_bar >= 150 returns nothing, try >= 120, then >= 100, and surface the best match with a caveat: "We don't have exact 150 bar rated items in stock, but here's the closest option at 120 bar."

Transparent no-results: Tell the user explicitly that no products match the constraints, and offer to show similar products without the constraint. Don't hallucinate a product that doesn't exist.

Filter explanation: If results are sparse after filtering, explain why. "We found 2 products matching all your criteria. If you're flexible on the material, here are 14 more options."

The worst outcome is silently discarding the filter and returning semantically similar products without noting they don't meet the constraint. A buyer who needed 150-bar products and receives 80-bar products — with no warning — has been actively misled.
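The graceful-relaxation approach can be sketched as a loop over progressively weaker thresholds. Here `search` stands in for whatever filtered query your vector store exposes, and the relaxation steps (80%, then two-thirds of the requested value) are an assumption for illustration — the key point is the `relaxed` flag, which lets the response layer add the caveat instead of silently dropping the constraint:

```typescript
type SearchResult = { id: string; pressure_rating_bar: number };
type SearchFn = (minPressureBar: number) => SearchResult[];

type RelaxedResult = {
  results: SearchResult[];
  appliedMinBar: number;
  relaxed: boolean; // true if we loosened the buyer's original constraint
};

// Try the buyer's threshold first, then progressively weaker ones,
// e.g. 150 -> 120 -> 100. Never return relaxed results unflagged.
function searchWithRelaxation(
  search: SearchFn,
  requestedMinBar: number,
  relaxationFactors: number[] = [0.8, 0.66]
): RelaxedResult {
  const thresholds = [
    requestedMinBar,
    ...relaxationFactors.map((f) => Math.round(requestedMinBar * f)),
  ];
  for (const minBar of thresholds) {
    const results = search(minBar);
    if (results.length > 0) {
      return { results, appliedMinBar: minBar, relaxed: minBar < requestedMinBar };
    }
  }
  return { results: [], appliedMinBar: requestedMinBar, relaxed: false };
}
```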

Step 4: Surface filter context in your response

The AI response should acknowledge the filters that were applied. This builds trust (see our article on building trust in AI responses) and helps users understand why specific results appeared.

Compare:

❌ "Here are some compression fittings that might work for you."

✅ "Here are the 4 stainless steel compression fittings in our catalog rated for 150 bar or above that are currently in stock. The Parker A-Lok series is particularly popular for hydraulic applications at this pressure range."

The second version makes clear that filters were applied, which validates the user's query and builds confidence in the system's accuracy.


Keeping Metadata Fresh

Unlike embedded text — which is relatively static — structured metadata changes constantly. Prices update. Stock levels flip. Certifications expire. A product gets discontinued.

This is one of the strongest arguments for separating metadata from embeddings: you can update metadata without re-embedding.

In a well-designed product RAG system:

  • Catalog sync (descriptions, specs changes) triggers re-chunking and re-embedding — expensive, but infrequent
  • Inventory sync (stock status, price updates) updates only the metadata payload — cheap and can run continuously

If you've read our piece on keeping your product catalog sync fresh, you'll recognize this as the same principle applied to the retrieval layer: different types of product data have different freshness requirements, and your architecture should handle each appropriately.

For stock status specifically, many teams run an inventory sync every 5-15 minutes — updating just the in_stock boolean in the metadata store. No re-embedding, no disruption to retrieval quality. This is only possible if stock status is a filter field rather than embedded in the text.
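The metadata-only update path can be sketched as a partial update keyed by product ID. This is an in-memory illustration of the principle — vector databases expose equivalent payload-update operations — and the point is what it does not touch:

```typescript
type IndexedChunk = {
  product_id: string;
  embedding: number[];
  metadata: { in_stock: boolean };
};

// Inventory sync: flip in_stock on every chunk of the product.
// The embedding field is never touched — no re-embedding required.
function applyStockUpdate(
  index: IndexedChunk[],
  productId: string,
  inStock: boolean
): void {
  for (const chunk of index) {
    if (chunk.product_id === productId) chunk.metadata.in_stock = inStock;
  }
}
```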


Real-World Metadata Filtering Patterns

Here are three practical patterns that appear repeatedly in production B2B product AI systems.

Pattern 1: Category pre-filtering + semantic search within

For distributors with deep category trees (10,000+ products across 200+ categories), the most impactful optimization is often the simplest: pre-filter by category before running vector search.

If a buyer is clearly in the "hydraulic fittings" section of your site, filter to that subtree first. Semantic search within 800 relevant products is more accurate than semantic search across your entire 15,000-product catalog. You've narrowed the search space without losing any relevant results (assuming the category taxonomy is clean).

Pattern 2: Certification filtering for compliance-critical buyers

Industrial, medical, food-grade, and chemical sectors often have non-negotiable compliance requirements. A buyer sourcing components for a food processing line who asks "what valves do you have?" implicitly needs FDA-compliant materials — but they may not state it explicitly.

If your system has workspace-level configuration (i.e., you know which industry the customer operates in), apply compliance filters proactively:

  • Chemical sector: REACH, RoHS
  • Food/pharma: FDA, NSF 61, 3-A Sanitary
  • Aerospace: AS9100, NADCAP qualified suppliers
  • Construction: CE marking, EN standards

This is a form of context-aware filtering — the user doesn't have to remember to ask for compliance; the system applies it based on known context. It's also a significant differentiator from generic search.

Pattern 3: Price tier and minimum order quantity filtering

B2B purchasing often has organizational rules: approved price tiers, minimum order quantities, preferred supplier lists. If you have customer-level context (e.g., the buyer is logged in and has a negotiated price tier), apply those constraints as hard filters automatically.

This isn't just convenient — it prevents the AI from suggesting products the buyer literally cannot order under their contract terms. Surfacing "this product isn't available at your price tier" as a filter result, rather than having the AI confidently recommend an off-contract item, is the difference between a useful tool and a liability.


Evaluating Your Metadata Filtering

How do you know if you've got this right? Three things to measure:

1. Filter hit rate: What percentage of queries result in at least one filter being applied? For a B2B product catalog with rich structured attributes, you'd expect 30-60% of queries to have at least one extractable structured constraint. If it's near zero, your query parser isn't extracting constraints effectively.

2. Filter precision: When filters are applied, what percentage of retrieved results actually satisfy all constraints? This should be 100% — if a metadata filter is applied and a result violates it, something is wrong with your indexing or filter logic.

3. Zero-result rate: What percentage of filtered queries return no results? Some zero-result rate is expected and correct (the product doesn't exist in your catalog). High zero-result rates usually indicate overly aggressive filter extraction — the parser is applying constraints the user didn't intend.
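Given a query log, all three metrics reduce to counting. A sketch, assuming each log entry records whether filters were applied, how many results came back, and how many of those violated a constraint (the log schema is an assumption for illustration):

```typescript
type QueryLogEntry = {
  filtersApplied: boolean;
  resultCount: number;          // results returned after filtering
  violatingResultCount: number; // results that failed an applied constraint
};

function retrievalMetrics(log: QueryLogEntry[]) {
  const filtered = log.filter((e) => e.filtersApplied);
  const totalResults = filtered.reduce((n, e) => n + e.resultCount, 0);
  const violations = filtered.reduce((n, e) => n + e.violatingResultCount, 0);
  return {
    // Share of queries where at least one filter was applied.
    filterHitRate: log.length > 0 ? filtered.length / log.length : 0,
    // Share of filtered results satisfying all constraints (should be 1.0).
    filterPrecision: totalResults > 0 ? (totalResults - violations) / totalResults : 1,
    // Share of filtered queries returning nothing.
    zeroResultRate:
      filtered.length > 0
        ? filtered.filter((e) => e.resultCount === 0).length / filtered.length
        : 0,
  };
}
```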


Common Mistakes to Avoid

Embedding price as text: "This product is priced at €149.90" embedded in a chunk means every price change requires re-embedding. Store price as a metadata field and update it independently.

Flattening category hierarchies: Storing category as a single string ("Industrial > Hydraulics > Compression Fittings") makes it hard to filter at intermediate levels. Store as an array and support contains-at-level queries.

Over-filtering on the first pass: Applying too many filters simultaneously narrows results to zero. Consider a two-pass approach: strict filters first, then graceful relaxation with transparency.

Ignoring chunk-level vs. product-level metadata: All chunks from the same product should inherit the product's structured attributes. A chunk from the FAQ section should still carry the product's in_stock value — otherwise filtered queries may fail to retrieve relevant support content.

Treating metadata filtering as a one-time setup: Metadata schemas need to evolve as your catalog evolves. Plan for schema versioning from the start.


Putting It Together

The architecture that emerges from these principles is a clean separation of concerns:

  • Embedding index: semantic meaning, descriptions, application notes, unstructured features
  • Metadata store: structured attributes, numeric specs, boolean states, categories, identifiers
  • Query layer: intelligent parser that splits user intent into semantic + structured components
  • Retrieval layer: filtered vector search that applies both simultaneously
  • Response layer: surfaces which constraints were applied, handles graceful degradation

This is more engineering than a pure "stuff everything into vectors" approach — but it's also the architecture that handles real-world B2B catalog queries accurately. The messy, constraint-heavy, attribute-rich queries that your buyers actually send are precisely the queries where metadata filtering makes the difference between a system buyers trust and one they abandon after three tries.

If you're building a product AI today, start with metadata filtering as a first-class design concern — not an afterthought. The investment pays off the first time a buyer asks for something specific and gets exactly what they need.


See How Axoverna Handles It

Axoverna's product knowledge platform is built around this separation from the ground up. Structured product attributes sync independently from semantic content, metadata filters apply automatically based on query context, and the system degrades gracefully when constraints can't be fully satisfied — always telling buyers what it did and why.

If your current product search returns irrelevant results for attribute-heavy queries, or if your AI assistant can't reliably handle "minimum 150 bar, in stock" style questions — it's likely a metadata architecture problem, not a model problem.

Request a demo to see how Axoverna handles complex structured queries against real B2B product catalogs, or explore our blog for more deep-dives on building product AI that actually works.

Ready to get started?

Turn your product catalog into an AI knowledge base

Axoverna ingests your product data, builds a semantic search index, and gives you an embeddable chat widget — in minutes, not months.