Phase 9: RAG Implementation
Semantic Search
How vector embeddings and pgvector enable semantic product discovery that scales to 10,000+ products—far beyond the limitations of filter-based selection.
Overview
Semantic search uses vector embeddings to find products based on meaning rather than exact keyword matches. Instead of filtering by tags and attributes, we generate a mathematical representation of both the user's shopping intent and product characteristics, then find products with the highest similarity.
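The similarity measure behind this is cosine similarity between embedding vectors (pgvector exposes the complementary cosine distance via its `<=>` operator). A minimal sketch of the math:

```typescript
// Cosine similarity between two embedding vectors:
// 1.0 = same direction (same meaning), 0.0 = unrelated.
// pgvector's <=> operator returns the cosine *distance* (1 - similarity).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vectors pointing the same way score ~1
console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // ≈ 1
// Orthogonal vectors score 0
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

In production the vectors have 1536 dimensions, but the arithmetic is identical.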
Why This Matters
Traditional filter-based product selection hits a wall at ~50 products. It requires manually curated tags, doesn't understand context, and can't handle catalog growth.
FILTER-BASED (V1-V2)
Matches exact tags like "rain" + "jacket"
Scales to: ~50 products
SEMANTIC (V3+)
Understands "waterproof outdoor gear"
Scales to: 10,000+ products
Implemented in Phase 9 of the project roadmap, semantic search is the foundation for production-scale deployment.
Architecture
Database Layer: pgvector
We use Supabase with the pgvector extension to store and query vector embeddings directly in PostgreSQL.
-- Migration: 004_add_product_embeddings.sql
CREATE EXTENSION IF NOT EXISTS vector;

ALTER TABLE product_catalog
  ADD COLUMN embedding vector(1536),           -- OpenAI embedding dimension
  ADD COLUMN embedding_model TEXT,             -- 'text-embedding-3-small'
  ADD COLUMN embedding_created_at TIMESTAMPTZ;

-- Vector similarity index (ivfflat for 1k-10k products)
CREATE INDEX idx_product_catalog_embedding
  ON product_catalog
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
Embedding Generation
Product embeddings are generated using OpenAI's text-embedding-3-small model. Each embedding is a 1536-dimension vector that captures the semantic meaning of the product.
// What gets embedded:
function buildEmbeddingText(product) {
return `
Name: ${product.name}
Description: ${product.description}
Category: ${product.category}
Department: ${product.department}
Tags: ${product.tags.join(', ')}
Specs: ${JSON.stringify(product.attributes)}
Highlights: ${product.spec_highlights.join(', ')}
Occasions: ${product.occasions.join(', ')}
Badges: ${product.badges.join(', ')}
`;
}
// Cost: ~$0.000004 per product (very cheap!)
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: embeddingText
});
Search Function: match_products()
The database includes a custom RPC function that performs vector similarity search with filtering.
CREATE FUNCTION match_products(
query_embedding vector(1536),
match_threshold float DEFAULT 0.7,
match_count int DEFAULT 20,
filter_brand_id TEXT DEFAULT NULL,
filter_department department DEFAULT NULL,
min_price_range price_range DEFAULT NULL,
max_price_range price_range DEFAULT NULL,
min_stock INTEGER DEFAULT NULL
)
RETURNS TABLE (
id TEXT, name TEXT, description TEXT,
price DECIMAL, similarity float,
persona_fit_hunter DECIMAL, persona_fit_gatherer DECIMAL,
persona_fit_researcher DECIMAL, persona_fit_gifter DECIMAL,
-- ...and more fields
)
AS $$
SELECT id, name, description, price,
       1 - (embedding <=> query_embedding) AS similarity,
       persona_fit_hunter, persona_fit_gatherer,
       persona_fit_researcher, persona_fit_gifter
FROM product_catalog
WHERE embedding IS NOT NULL
AND (1 - (embedding <=> query_embedding)) >= match_threshold
-- Apply filters...
ORDER BY embedding <=> query_embedding
LIMIT match_count;
$$ LANGUAGE sql;
How It Works
Build Query Context
Combine persona and shopping context into a semantic query:
"Weather needs: waterproof, indoor, cozy, warm
Urgency: urgent, fast shipping, in stock now"
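This step could be sketched as a small helper; the `Persona` alias and `ShoppingContext` shape below are illustrative assumptions (the real types live elsewhere in the codebase):

```typescript
// Hypothetical shapes; the real Persona and ShoppingContext types
// are defined elsewhere in the codebase.
type Persona = string;

interface ShoppingContext {
  weatherNeeds: string[];
  urgent: boolean;
}

// Flatten persona + shopping context into one text query to embed.
function buildQueryText(persona: Persona, context: ShoppingContext): string {
  const parts = [`Shopper persona: ${persona}`];
  if (context.weatherNeeds.length > 0) {
    parts.push(`Weather needs: ${context.weatherNeeds.join(', ')}`);
  }
  if (context.urgent) {
    parts.push('Urgency: urgent, fast shipping, in stock now');
  }
  return parts.join('\n');
}

console.log(
  buildQueryText('gatherer', {
    weatherNeeds: ['waterproof', 'indoor', 'cozy', 'warm'],
    urgent: true,
  }),
);
```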
Generate Query Embedding
OpenAI converts the query text into a 1536-dimension vector
Cost: ~$0.00001 per query
Vector Similarity Search
pgvector finds products with the most similar embeddings using cosine distance
Index: ivfflat for fast approximate search
Filter by Persona Fit
Results are filtered by minimum persona fit score (e.g., persona_fit_hunter >= 0.2)
Sort and Return
Products sorted by persona fit, with preferred brands boosted in mixed mode
Integration with Layout Generation
Semantic search is integrated into the generateLayout() function with automatic fallback to filter-based search if embeddings aren't available.
// src/lib/ai/generate-layout.ts
export async function generateLayout(
persona: Persona,
context: ShoppingContext,
options: GenerateOptions
): Promise<LayoutResponse> {
// Check if semantic search is available
const semanticAvailable = await isSemanticSearchAvailable();
let selectedProducts: Product[];
if (semanticAvailable) {
// Use semantic search with persona and context
selectedProducts = await semanticSearch(persona, context, {
limit: persona === 'goal-oriented' ? 50 : 30,
minStock: 1,
minPersonaFit: 0.2,
preferredBrands: context.preferredBrands,
brandMode: context.brandMode,
});
} else {
// Fallback to filter-based selection
selectedProducts = filterInventory(allProducts, persona, context);
}
// Continue with layout generation...
}
Performance & Caching
Query cache TTL: 2s (results cached for repeated queries)
Cost per query: ~$0.00001 (OpenAI embedding generation)
Slow query threshold: >2s (logged for optimization)
Query results are cached with a 2-second TTL to avoid redundant API calls for identical queries. Slow queries (over 2 seconds) are logged for performance analysis.
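Helpers like getCachedQuery and cacheQueryResult can be as simple as an in-memory map with expiry timestamps. The sketch below is one possible shape, not the project's actual implementation:

```typescript
// Minimal in-memory TTL cache; a sketch of what getCachedQuery /
// cacheQueryResult could look like (the real helpers may differ).
const queryCache = new Map<string, { value: unknown; expiresAt: number }>();

function getCachedQuery<T>(key: string): T | undefined {
  const entry = queryCache.get(key);
  if (!entry) return undefined;
  if (Date.now() > entry.expiresAt) {
    queryCache.delete(key); // expired: evict and report a miss
    return undefined;
  }
  return entry.value as T;
}

function cacheQueryResult<T>(key: string, value: T, ttlMs: number): void {
  queryCache.set(key, { value, expiresAt: Date.now() + ttlMs });
}

// 2-second TTL, matching the SEMANTIC_SEARCH setting described above
cacheQueryResult('demo', ['product-1'], 2000);
console.log(getCachedQuery<string[]>('demo')); // [ 'product-1' ]
```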
// Query caching implementation
const cacheKey = getSemanticSearchCacheKey(persona, context, options);
const cached = getCachedQuery<Product[]>(cacheKey);
if (cached) {
return cached; // Return cached results
}
// ... perform semantic search ...
// Cache results
cacheQueryResult(cacheKey, results, QUERY_CACHE_TTL.SEMANTIC_SEARCH);
Setup & Migration
To enable semantic search in your SIX instance, you need to:
1. Run Database Migrations
Apply migration 004_add_product_embeddings.sql to add the vector column and search function
2. Generate Embeddings
Run the embedding generation script to create vectors for all products
3. Set Environment Variables
Ensure OpenAI API key is configured for embedding generation
📖 Full migration guide: See /docs/MIGRATION_GUIDE.md for detailed setup instructions, troubleshooting, and verification steps.
Checking Availability
Use the isSemanticSearchAvailable() function to check if embeddings exist before attempting semantic search:
import { isSemanticSearchAvailable, semanticSearch } from '@/lib/inventory/rag';
// Check availability
const available = await isSemanticSearchAvailable();
if (available) {
// Use semantic search
const products = await semanticSearch(persona, context, options);
} else {
// Fallback to filter-based search
const products = filterInventory(allProducts, persona, context);
}
Version Availability
Semantic search is available in the following versions:
V1 & V2: Not Available (filter-based product selection only)
V3: Available (edge-native with semantic search)
V4: Available (trend-aware with semantic search)
V5: Available (latest version with semantic search)