GLOSSARY

Search Database

A search database stores and serves indexes built for fast text and filter queries. In online stores, it powers millisecond-scale search over large catalogs, with facets and sorting.

What is a Search Database?

A search database is a datastore optimized for information retrieval rather than online transaction processing (OLTP). It holds inverted indexes for text, doc values or columnar stores for filters and sorts, and often vector indexes for semantic retrieval, along with replicas and caches for low-latency reads.

How It Works (quick)

  • Ingest: Connectors/feeds → clean & normalize → map to fields.
  • Index structures:
    • Inverted index for tokens/phrases/positions.
    • Doc values for facets, ranges, and sorting (price, stock, rating).
    • Approximate nearest neighbor (ANN) vector index for embeddings.
  • Serve: Query planner selects structures → retrieve candidates → apply filters/ACL → score/re-rank → return hits.
  • Scale: Shard to spread data and load; replicate for high availability (HA) and read throughput; snapshot for backups.
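
The ingest, index, and serve steps above can be sketched with a toy in-memory index. This is a minimal illustration, not how any particular engine is implemented: the class name TinySearchIndex, its fields, and the sample products are all hypothetical, tokenization is a plain lowercase split, and there is no scoring, sharding, or vector index.

```python
from collections import defaultdict


class TinySearchIndex:
    """Toy index: an inverted index for text plus doc values for filters and sorts."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of doc ids
        self.doc_values = {}              # doc id -> typed attributes

    def add(self, doc_id, title, price, in_stock):
        # Ingest: normalize the text (lowercase, whitespace split) and map to fields.
        for token in title.lower().split():
            self.postings[token].add(doc_id)
        self.doc_values[doc_id] = {"price": price, "in_stock": in_stock}

    def search(self, query, in_stock_only=False, sort_by_price=False):
        tokens = query.lower().split()
        if not tokens:
            return []
        # Retrieve candidates: docs that contain every query token (AND semantics).
        candidates = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        # Apply filters using doc values.
        if in_stock_only:
            candidates = {d for d in candidates if self.doc_values[d]["in_stock"]}
        # Sort by price via doc values, otherwise by doc id for a stable order.
        if sort_by_price:
            return sorted(candidates, key=lambda d: (self.doc_values[d]["price"], d))
        return sorted(candidates)


index = TinySearchIndex()
index.add(1, "Red Running Shoes", 59.0, True)
index.add(2, "Blue Running Shoes", 49.0, False)
index.add(3, "Red Rain Jacket", 89.0, True)
index.add(4, "Trail Running Shoes", 39.0, True)

print(index.search("running shoes", in_stock_only=True, sort_by_price=True))  # [4, 1]
print(index.search("red"))  # [1, 3]
```

The point of the split is that text lookup never scans attributes and filtering never re-reads text: each structure answers the part of the query it is built for.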

Why It Matters in E-commerce

  • Speed: Sub-100 ms queries for large catalogs.
  • Quality: Fielded data enables strong ranking and facets.
  • Flexibility: Hybrid (BM25 + vectors) and merchandising rules without heavy joins.

Best Practices

  • Schema hygiene: Separate exact (SKU/MPN), text (title/desc), attributes (typed), vectors.
  • Locale analyzers: Per-market tokenization/lemmatization and stopwords.
  • Freshness: Event-driven deltas for price/stock; monitor indexing lag.
  • Ops: Right-size shards; tune merge/segment policies; autoscale read replicas.
  • Observability: Dashboards for QPS, p95/99 latency, index size, merge backlog, cache hit rate.
  • Resilience: Snapshots, restore drills, and versioned mappings.
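
As one way to make schema hygiene and versioned mappings concrete, the sketch below checks incoming documents against a declared mapping before they are indexed. The mapping name MAPPING_V2, its fields, and the helper find_mapping_drift are hypothetical, and the type check is deliberately strict (an integer price is reported as drift). Real engines define mappings in their own formats.

```python
# Hypothetical versioned mapping: field name -> expected Python type.
MAPPING_V2 = {
    "sku": str,         # exact-match identifier
    "title": str,       # analyzed text
    "price": float,     # typed attribute for ranges and sorting
    "in_stock": bool,   # typed attribute for filtering
    "embedding": list,  # vector field
}


def find_mapping_drift(doc, mapping):
    """Return a sorted list of problems; an empty list means the doc conforms."""
    problems = []
    for field, expected in mapping.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            actual = type(doc[field]).__name__
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}, got {actual}"
            )
    # Fields the mapping does not know about are a common source of drift.
    for field in doc:
        if field not in mapping:
            problems.append(f"unmapped field: {field}")
    return sorted(problems)


bad_doc = {
    "sku": "A-2",
    "title": "Blue Shoes",
    "price": "49.00",   # string instead of float
    "colour": "blue",   # not in the mapping
    "in_stock": True,
}
for problem in find_mapping_drift(bad_doc, MAPPING_V2):
    print(problem)
```

Running a check like this in the ingest pipeline surfaces drift at write time, before it shows up as broken facets or sorts.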

Challenges

  • Hot shards, mapping drift, segment bloat, uneven traffic, and embedding staleness.

Examples

  • Product grid with price/rating sorting and in-stock filter under 100 ms.
  • Hybrid retrieval: BM25 recall + vector recall → learning-to-rank (LTR) re-rank for long queries.
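
To illustrate the hybrid example, the sketch below merges a lexical and a vector candidate list into one ranking. It uses reciprocal rank fusion (RRF) as a simple stand-in for the learned re-ranker mentioned above; RRF is not LTR, it just combines rank positions. The function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions for illustration.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc ids into one list, best first."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); docs found by both lists add up.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score first; tie-break on doc id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["p1", "p2", "p3"]    # hypothetical lexical (BM25) results
vector_hits = ["p2", "p4", "p1"]  # hypothetical vector (ANN) results

print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# ['p2', 'p1', 'p4', 'p3']
```

Documents returned by both retrievers (p2 and p1 here) rise to the top, while documents found by only one retriever still stay in the candidate set for a later re-ranking stage.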

Summary

A search database is the backbone of fast, relevant product discovery. Design clean fields, use locale analyzers, keep deltas flowing, and monitor latency and index health.

FAQ

Search DB vs relational DB?

Relational DBs excel at transactions; search DBs are purpose-built for retrieval & ranking.

Do I need vectors?

Use vectors where lexical search leaves recall gaps; keep lexical retrieval for precision and filters.