GLOSSARY

Neural Re-ranking (Cross-encoder)

Neural re-ranking rescores the top candidates from recall with a deep model, which greatly improves their ordering. The model reads the query and document together, allowing fine-grained relevance judgments.

What is Neural Re-ranking?

Neural re-ranking uses a cross-encoder to jointly encode query and candidate text, producing high-quality relevance scores for the short list returned by recall.

How It Works (quick)

  • Pipeline: Recall (BM25/vectors) → top-k → cross-encoder scores → final order (see the sketch after this list).
  • Features: Handles phrases, context, and negation better than bi-encoders.
  • Latency: Apply to small k (e.g., 50–200).
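
A minimal sketch of the recall → top-k → cross-encoder → re-order step, assuming the sentence-transformers library and the public cross-encoder/ms-marco-MiniLM-L-6-v2 checkpoint; the query and candidate list are made-up placeholders for whatever the recall stage actually returns.

  from sentence_transformers import CrossEncoder

  query = "waterproof hiking boots women"
  candidates = [  # top-k documents from the recall stage (placeholder data)
      "Women's leather hiking boots, waterproof membrane",
      "Men's trail running shoes, breathable mesh",
      "Waterproof boot spray, 250 ml",
  ]

  model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

  # The cross-encoder reads each (query, document) pair jointly and
  # returns one relevance score per pair.
  scores = model.predict([(query, doc) for doc in candidates])

  # Re-order the short list by descending score.
  reranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
  for doc, score in reranked:
      print(f"{score:.3f}  {doc}")

Because the model scores every pair independently, latency grows linearly with k, which is why the short list is kept small (e.g., 50–200 candidates).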

Why It Matters in E-commerce

  • Lifts NDCG/CTR noticeably on ambiguous queries.
  • Improves snippets/highlighting consistency.

Best Practices

  • Distill to lighter models for speed (see the sketch after this list).
  • Use category-specific variants if needed.
  • Log explanations/attentions cautiously.
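
As a rough illustration of the distillation point above, the sketch below trains a small cross-encoder to regress onto a larger teacher's scores using Hugging Face transformers and PyTorch. The checkpoint names, example pairs, and single-batch loop are assumptions for illustration; real distillation would iterate over a large set of (query, document) pairs mined from logs.

  import torch
  from transformers import AutoTokenizer, AutoModelForSequenceClassification

  teacher_name = "cross-encoder/ms-marco-MiniLM-L-12-v2"   # larger, slower teacher
  student_name = "cross-encoder/ms-marco-TinyBERT-L-2-v2"  # smaller, faster student

  t_tok = AutoTokenizer.from_pretrained(teacher_name)
  s_tok = AutoTokenizer.from_pretrained(student_name)
  teacher = AutoModelForSequenceClassification.from_pretrained(teacher_name).eval()
  student = AutoModelForSequenceClassification.from_pretrained(student_name)
  optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)

  # Placeholder (query, document) pairs; in practice these come from search logs.
  pairs = [("usb c charger", "65W USB-C wall charger, fast charge"),
           ("usb c charger", "HDMI 2.1 cable, 2 m")]
  queries = [q for q, _ in pairs]
  docs = [d for _, d in pairs]

  t_batch = t_tok(queries, docs, padding=True, truncation=True, return_tensors="pt")
  s_batch = s_tok(queries, docs, padding=True, truncation=True, return_tensors="pt")

  # Teacher scores become soft regression targets for the student (MSE loss).
  with torch.no_grad():
      targets = teacher(**t_batch).logits.squeeze(-1)

  student.train()
  optimizer.zero_grad()
  logits = student(**s_batch).logits.squeeze(-1)
  loss = torch.nn.functional.mse_loss(logits, targets)
  loss.backward()
  optimizer.step()

The student keeps most of the teacher's ordering quality at a fraction of the per-pair inference cost, which is what makes re-ranking affordable at serving time.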

Summary

Cross-encoders fix ordering on hard cases. Keep k small and models optimized.