GLOSSARY

Relevance Score

A relevance score is the numeric value used to order results. It blends text signals with business features so the best items surface first.

What is a Relevance Score?

A relevance score is a composite metric assigned to each candidate result. It may come from BM25/field weights, a learning-to-rank model (e.g., LambdaMART), or a neural re-ranker, often normalized to a common scale (e.g., 0–1 or z-scores).

How It Works (quick)

  • Inputs: Textual evidence (exact/phrase/proximity), semantic similarity, attribute matches, and business features (price, stock, rating, margin, recency, size-in-stock).
  • Modeling: Linear blend or ML model; calibration to stabilize scores across categories/locales.
  • Normalization: Min-max, logistic, or percentile scaling for consistent thresholds.
  • Post-rules: Diversity/brand caps, tie-breakers (price, rating), and compliance filters.
  • Explainability: Log top contributing features for “why this result.”
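
The steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the feature names, the weights in WEIGHTS, and the helper functions are all hypothetical, and it assumes every feature except raw BM25 already lies in the 0–1 range.

```python
from typing import Dict, List, Tuple

# Hypothetical feature weights, chosen only for illustration.
WEIGHTS: Dict[str, float] = {
    "bm25": 0.5,
    "phrase_match": 0.2,
    "in_stock": 0.2,
    "rating": 0.1,
}


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_with_contributions(
    features: Dict[str, float],
) -> Tuple[float, List[Tuple[str, float]]]:
    """Return the linear blend and the per-feature contributions, largest first."""
    contributions = [
        (name, WEIGHTS[name] * features.get(name, 0.0)) for name in WEIGHTS
    ]
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    total = sum(value for _, value in contributions)
    return total, contributions


def rank(candidates: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Normalize BM25 across the candidate set, blend, and sort by score."""
    ids = list(candidates)
    bm25_scaled = min_max([candidates[i]["bm25"] for i in ids])
    scored = []
    for item_id, bm25 in zip(ids, bm25_scaled):
        features = dict(candidates[item_id], bm25=bm25)
        total, _ = score_with_contributions(features)
        scored.append((item_id, total))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The contribution list returned by score_with_contributions is what you would log to answer “why this result.” Post-rules such as diversity caps would run on the sorted output and are left out here.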

Why It Matters in E-commerce

  • Consistent ordering across large catalogs and seasons.
  • Operational control: Score thresholds decide when to suppress weak results or trigger “did you mean” suggestions, and capped boosts keep campaigns safe.
  • Diagnostics: Score distributions reveal regressions or drift.
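
As one way to picture the diagnostics point, the sketch below compares summary statistics of two score samples and flags a shift. The function names and the 0.05 tolerance are assumptions made for illustration; a production check would more likely use a proper distribution test and per-category samples.

```python
import statistics
from typing import Dict, List


def summarize(scores: List[float]) -> Dict[str, float]:
    """Mean, median, and 90th percentile of a score sample (needs 2+ values)."""
    deciles = statistics.quantiles(scores, n=10)  # nine cut points
    return {
        "mean": statistics.fmean(scores),
        "p50": statistics.median(scores),
        "p90": deciles[8],
    }


def drifted(
    baseline: List[float], current: List[float], tolerance: float = 0.05
) -> bool:
    """Flag drift when any summary statistic moves by more than `tolerance`."""
    base = summarize(baseline)
    cur = summarize(current)
    return any(abs(cur[key] - base[key]) > tolerance for key in base)
```

Running drifted on yesterday's and today's logged scores for the same query set gives a cheap early warning before ranking metrics move.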

Best Practices

  • Keep hard filters (access control, region, out-of-stock) outside the score; apply them before ranking.
  • Calibrate per category and cap individual features so popularity cannot dominate the score.
  • Use golden sets and A/B tests; monitor NDCG, CTR, conversion rate, and tail latency.
  • Store score, features, and version for audits and rollback.
  • Match SKUs and brand names against exact-match fields so identifier queries are not diluted by fuzzy text matches and scores reflect real intent.
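
The first two practices can be illustrated with a short sketch: eligibility is checked before any scoring, and the popularity feature is clamped. The Item fields, the 0.7/0.3 weights, and POPULARITY_CAP are hypothetical values picked for the example, and both score inputs are assumed to be normalized to 0–1 already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    item_id: str
    in_stock: bool
    region_ok: bool
    text_score: float   # assumed already normalized to 0-1
    popularity: float   # assumed already normalized to 0-1


POPULARITY_CAP = 0.3  # hypothetical cap on the popularity feature


def eligible(item: Item) -> bool:
    """Hard filters: these never enter the score, they gate it."""
    return item.in_stock and item.region_ok


def score(item: Item) -> float:
    """Linear blend with popularity clamped so it cannot dominate."""
    capped_popularity = min(item.popularity, POPULARITY_CAP)
    return 0.7 * item.text_score + 0.3 * capped_popularity


def rank(items: List[Item]) -> List[Item]:
    """Filter first, then sort the survivors by score."""
    candidates = [item for item in items if eligible(item)]
    return sorted(candidates, key=score, reverse=True)
```

Because the filter runs first, an out-of-stock item with a very high text score never appears, and no boost can pull it back in.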

Challenges

  • Data leakage, position bias in labels, score drift across locales, and explainability for deep models.

Examples

  • An item scores 0.62 from BM25 plus a phrase boost, an in-stock signal, and a rating feature; it ranks above a 0.58 item with weaker availability.
  • Campaign adds a +0.05 capped boost to “new arrivals” only within footwear.
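
The campaign example could look roughly like this. The function name, the "new_arrival" tag, and the assumption that scores live on a 0–1 scale are illustrative choices, not part of any particular search platform.

```python
from typing import Set

MAX_BOOST = 0.05  # cap taken from the campaign example above


def apply_campaign_boost(
    base_score: float, category: str, tags: Set[str], requested_boost: float
) -> float:
    """Boost "new arrivals" in footwear only, clamped to MAX_BOOST and to 1.0."""
    if category != "footwear" or "new_arrival" not in tags:
        return base_score
    boost = max(0.0, min(requested_boost, MAX_BOOST))
    return min(1.0, base_score + boost)
```

Even if a campaign requests a larger boost, the clamp keeps the adjustment at 0.05, and items outside footwear are returned unchanged.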

Summary

A relevance score is your ordering signal. Build it from hybrid evidence, calibrate per category, enforce hard rules outside the score, and log contributions for trust and tuning.

FAQ

Relevance score vs ranking score?

Often the same. “Relevance score” emphasizes how well an item fits the query, while “ranking score” may also include business heuristics.

Global scale across categories?

Calibrate per category and locale. Raw scores from different categories follow different distributions, so don’t compare them directly.
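
One simple form of per-category calibration is a z-score computed within each category. The sketch below assumes rows of (item_id, category, raw_score) and uses the population standard deviation; both the row shape and the function name are hypothetical, and other calibrations (percentile, logistic) would fit the same slot.

```python
import statistics
from collections import defaultdict
from typing import Dict, List, Tuple


def calibrate_per_category(rows: List[Tuple[str, str, float]]) -> Dict[str, float]:
    """Map item_id to its z-score within its own category."""
    by_category: Dict[str, List[float]] = defaultdict(list)
    for _, category, raw in rows:
        by_category[category].append(raw)

    # Mean and population standard deviation per category.
    stats = {
        category: (statistics.fmean(scores), statistics.pstdev(scores))
        for category, scores in by_category.items()
    }

    calibrated: Dict[str, float] = {}
    for item_id, category, raw in rows:
        mean, std = stats[category]
        # A category with identical scores has no spread; map it to 0.
        calibrated[item_id] = 0.0 if std == 0 else (raw - mean) / std
    return calibrated
```

After calibration, the top item in a category with raw scores around 20 and the top item in a category with raw scores around 3 land on the same scale, so one threshold can serve both.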