Phase 9: RAG Implementation
Semantic Search
How vector embeddings and pgvector enable semantic product discovery that scales to 10,000+ products—far beyond the limitations of filter-based selection.
Overview
Semantic search uses vector embeddings to find products based on meaning rather than exact keyword matches. Instead of filtering by tags and attributes, we generate a mathematical representation of both the user's shopping intent and product characteristics, then find products with the highest similarity.
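The similarity measure behind this is cosine similarity between embedding vectors (pgvector exposes the complementary cosine distance via its `<=>` operator). A minimal sketch of the math:

```typescript
// Cosine similarity between two embedding vectors:
// 1.0 = same direction (same meaning), 0.0 = unrelated.
// pgvector's <=> operator returns the cosine *distance* (1 - similarity).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vectors pointing the same way score ~1
console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // ≈ 1
// Orthogonal vectors score 0
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

In production the vectors have 1536 dimensions, but the arithmetic is identical.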
Why This Matters
Traditional filter-based product selection hits a wall at ~50 products. It requires manually curated tags, doesn't understand context, and can't handle catalog growth.
FILTER-BASED (V1-V2)
Matches exact tags like "rain" + "jacket"
Scales to: ~50 products
SEMANTIC (V3+)
Understands "waterproof outdoor gear"
Scales to: 10,000+ products
Implemented in Phase 9 of the project roadmap, semantic search is the foundation for production-scale deployment.
Architecture
Database Layer: pgvector
We use Supabase with the pgvector extension to store and query vector embeddings directly in PostgreSQL.
-- Migration: 004_add_product_embeddings.sql
CREATE EXTENSION IF NOT EXISTS vector;

ALTER TABLE product_catalog
  ADD COLUMN embedding vector(1536),           -- OpenAI embedding dimension
  ADD COLUMN embedding_model TEXT,             -- 'text-embedding-3-small'
  ADD COLUMN embedding_created_at TIMESTAMPTZ;

-- Vector similarity index (ivfflat for 1k-10k products)
CREATE INDEX idx_product_catalog_embedding
  ON product_catalog
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
Embedding Generation
Product embeddings are generated using OpenAI's text-embedding-3-small model. Each embedding is a 1536-dimension vector that captures the semantic meaning of the product.
// What gets embedded:
function buildEmbeddingText(product) {
return `
Name: ${product.name}
Description: ${product.description}
Category: ${product.category}
Department: ${product.department}
Tags: ${product.tags.join(', ')}
Specs: ${JSON.stringify(product.attributes)}
Highlights: ${product.spec_highlights.join(', ')}
Occasions: ${product.occasions.join(', ')}
Badges: ${product.badges.join(', ')}
`;
}
// Cost: ~$0.000004 per product (very cheap!)
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: embeddingText
});
Search Function: match_products()
The database includes a custom RPC function that performs vector similarity search with filtering.
CREATE FUNCTION match_products(
query_embedding vector(1536),
match_threshold float DEFAULT 0.7,
match_count int DEFAULT 20,
filter_brand_id TEXT DEFAULT NULL,
filter_department department DEFAULT NULL,
min_price_range price_range DEFAULT NULL,
max_price_range price_range DEFAULT NULL,
min_stock INTEGER DEFAULT NULL
)
RETURNS TABLE (
id TEXT, name TEXT, description TEXT,
price DECIMAL, similarity float,
persona_fit_hunter DECIMAL, persona_fit_gatherer DECIMAL,
persona_fit_researcher DECIMAL, persona_fit_gifter DECIMAL,
-- ...and more fields
)
AS $$
SELECT id, name, description, price,
       1 - (embedding <=> query_embedding) AS similarity,
       persona_fit_hunter, persona_fit_gatherer,
       persona_fit_researcher, persona_fit_gifter
FROM product_catalog
WHERE embedding IS NOT NULL
AND (1 - (embedding <=> query_embedding)) >= match_threshold
-- Apply filters...
ORDER BY embedding <=> query_embedding
LIMIT match_count;
$$ LANGUAGE sql;
How It Works
Build Query Context
Combine persona and shopping context into a semantic query:
"Weather needs: waterproof, indoor, cozy, warm
Urgency: urgent, fast shipping, in stock now"
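This step could be sketched as a small helper; the `Persona` alias and `ShoppingContext` shape below are illustrative assumptions (the real types live elsewhere in the codebase):

```typescript
// Hypothetical shapes; the real Persona and ShoppingContext types
// are defined elsewhere in the codebase.
type Persona = string;

interface ShoppingContext {
  weatherNeeds: string[];
  urgent: boolean;
}

// Flatten persona + shopping context into one text query to embed.
function buildQueryText(persona: Persona, context: ShoppingContext): string {
  const parts = [`Shopper persona: ${persona}`];
  if (context.weatherNeeds.length > 0) {
    parts.push(`Weather needs: ${context.weatherNeeds.join(', ')}`);
  }
  if (context.urgent) {
    parts.push('Urgency: urgent, fast shipping, in stock now');
  }
  return parts.join('\n');
}

console.log(
  buildQueryText('gatherer', {
    weatherNeeds: ['waterproof', 'indoor', 'cozy', 'warm'],
    urgent: true,
  }),
);
```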
Generate Query Embedding
OpenAI converts the query text into a 1536-dimension vector
Cost: ~$0.00001 per query
Vector Similarity Search
pgvector finds products with the most similar embeddings using cosine distance
Index: ivfflat for fast approximate search
Filter by Persona Fit
Results are filtered by minimum persona fit score (e.g., persona_fit_hunter >= 0.2)
Sort and Return
Products sorted by persona fit, with preferred brands boosted in mixed mode
Integration with Layout Generation
Semantic search is integrated into the generateLayout() function with automatic fallback to filter-based search if embeddings aren't available.
// src/lib/ai/generate-layout.ts
export async function generateLayout(
persona: Persona,
context: ShoppingContext,
options: GenerateOptions
): Promise<LayoutResponse> {
// Check if semantic search is available
const semanticAvailable = await isSemanticSearchAvailable();
let selectedProducts: Product[];
if (semanticAvailable) {
// Use semantic search with persona and context
selectedProducts = await semanticSearch(persona, context, {
limit: persona === 'goal-oriented' ? 50 : 30,
minStock: 1,
minPersonaFit: 0.2,
preferredBrands: context.preferredBrands,
brandMode: context.brandMode,
});
} else {
// Fallback to filter-based selection
selectedProducts = filterInventory(allProducts, persona, context);
}
// Continue with layout generation...
}
Performance & Caching
Query cache TTL: 2s (results cached for repeated queries)
Cost per query: ~$0.00001 (OpenAI embedding generation)
Slow query threshold: >2s (logged for optimization)
Query results are cached with a 2-second TTL to avoid redundant API calls for identical queries. Slow queries (over 2 seconds) are logged for performance analysis.
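Helpers like getCachedQuery and cacheQueryResult can be as simple as an in-memory map with expiry timestamps. The sketch below is one possible shape, not the project's actual implementation:

```typescript
// Minimal in-memory TTL cache; a sketch of what getCachedQuery /
// cacheQueryResult could look like (the real helpers may differ).
const queryCache = new Map<string, { value: unknown; expiresAt: number }>();

function getCachedQuery<T>(key: string): T | undefined {
  const entry = queryCache.get(key);
  if (!entry) return undefined;
  if (Date.now() > entry.expiresAt) {
    queryCache.delete(key); // expired: evict and report a miss
    return undefined;
  }
  return entry.value as T;
}

function cacheQueryResult<T>(key: string, value: T, ttlMs: number): void {
  queryCache.set(key, { value, expiresAt: Date.now() + ttlMs });
}

// 2-second TTL, matching the SEMANTIC_SEARCH setting described above
cacheQueryResult('demo', ['product-1'], 2000);
console.log(getCachedQuery<string[]>('demo')); // [ 'product-1' ]
```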
// Query caching implementation
const cacheKey = getSemanticSearchCacheKey(persona, context, options);
const cached = getCachedQuery<Product[]>(cacheKey);
if (cached) {
return cached; // Return cached results
}
// ... perform semantic search ...
// Cache results
cacheQueryResult(cacheKey, results, QUERY_CACHE_TTL.SEMANTIC_SEARCH);
Setup & Migration
To enable semantic search in your SIX instance, you need to:
1. Run Database Migrations
Apply migration 004_add_product_embeddings.sql to add the vector column and search function
2. Generate Embeddings
Run the embedding generation script to create vectors for all products
3. Set Environment Variables
Ensure OpenAI API key is configured for embedding generation
📖 Full migration guide: See /docs/MIGRATION_GUIDE.md for detailed setup instructions, troubleshooting, and verification steps.
Checking Availability
Use the isSemanticSearchAvailable() function to check if embeddings exist before attempting semantic search:
import { isSemanticSearchAvailable, semanticSearch } from '@/lib/inventory/rag';
// Check availability
const available = await isSemanticSearchAvailable();
if (available) {
// Use semantic search
const products = await semanticSearch(persona, context, options);
} else {
// Fallback to filter-based search
const products = filterInventory(allProducts, persona, context);
}
Version Availability
Semantic search is available in the following versions:
V1 & V2: Not Available (filter-based product selection only)
V3: Available (edge-native with semantic search)
V4: Available (trend-aware with semantic search)
V5: Available (latest version with semantic search)