GLOSSARY

Query Terms

Query terms are the individual words and tokens extracted from what a user types. Stores use clean query terms—after normalization and stopword handling—to match the right products fast.

What are Query Terms?

Query terms are the tokens produced from a user’s input after analysis (lowercasing, accent folding, tokenization). They include single words, numbers, symbols, and sometimes phrases/bigrams or entities (e.g., “air”, “max”, “270”, “GORE-TEX”, “size 45”).

How It Works (quick)

  • Analyze: Detect language → tokenize → normalize case/diacritics → optional lemmatization.
  • Filter: Remove or down-weight stopwords, handle punctuation/hyphens, protect brand/SKU casing.
  • Augment: Create bigrams/phrases, extract entities (size, brand, material), and mark operators (≤, “under”).
  • Map to fields: Send terms to appropriate fields: title, attributes, description, exact (SKU/MPN).
  • Score: Score matches with BM25 or TF-IDF, apply field boosts and phrase/proximity logic, then re-rank.
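
The first three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production analyzer: the STOPWORDS set and the function names (fold, tokenize, analyze) are assumptions made for the example, the tokenizer only handles Latin-script text, and stopwords are simply dropped here to keep the sketch short.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stopword list; real lists are per language.
STOPWORDS = {"for", "the", "a", "an", "with", "in"}

def fold(text: str) -> str:
    """Lowercase and strip diacritics (accent folding)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def tokenize(text: str) -> list[str]:
    """Drop possessive 's, then keep runs of ASCII letters and digits."""
    text = re.sub(r"[’']s\b", "", text)
    return re.findall(r"[a-z0-9]+", text)

def analyze(query: str) -> dict:
    """Analyze -> filter -> augment, returning terms and adjacent bigrams."""
    tokens = tokenize(fold(query))
    terms = [t for t in tokens if t not in STOPWORDS]
    bigrams = [f"{a} {b}" for a, b in zip(terms, terms[1:])]
    return {"terms": terms, "bigrams": bigrams}

result = analyze("Men’s GORE-TEX Trail Shoes 45")
print(result["terms"])    # ['men', 'gore', 'tex', 'trail', 'shoes', '45']
print(result["bigrams"])
```

A real pipeline would add language detection, lemmatization, and entity extraction on top of this, and would usually keep only the bigrams that are frequent in the catalog.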

Why It Matters in E-commerce

  • Precision & recall: Clean terms prevent noise and catch variants.
  • Speed: Fewer, better tokens → faster lookups in the inverted index.
  • Explainability: Highlights/snippets show which terms matched and where.
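
To make the speed and explainability points concrete, here is a small sketch of a field-aware inverted index. The CATALOG data and the build_index and explain helpers are hypothetical and exist only for this example; a real search engine stores postings far more compactly.

```python
from collections import defaultdict

# Hypothetical mini-catalog: doc id -> field -> already-analyzed text.
CATALOG = {
    1: {"title": "nike air max 270", "description": "lightweight running shoe"},
    2: {"title": "trail running jacket", "description": "gore tex shell with air vents"},
}

def build_index(catalog):
    """Map each term to the set of (doc_id, field) pairs where it occurs."""
    index = defaultdict(set)
    for doc_id, fields in catalog.items():
        for field, text in fields.items():
            for term in text.split():
                index[term].add((doc_id, field))
    return index

def explain(index, terms):
    """For each matching doc, list which query terms matched in which field."""
    hits = defaultdict(list)
    for term in terms:
        for doc_id, field in sorted(index.get(term, ())):
            hits[doc_id].append((term, field))
    return dict(hits)

INDEX = build_index(CATALOG)
print(explain(INDEX, ["air", "max"]))
# {1: [('air', 'title'), ('max', 'title')], 2: [('air', 'description')]}
```

Each query term costs one dictionary lookup, so a shorter, cleaner term list means fewer postings to read, and the (term, field) pairs are exactly what a highlighter or debug view needs.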

Best Practices

  • Locale-aware analyzers per market; keep exact keyword fields for SKU/brand.
  • Normalize hyphen/space forms so variants match (e.g., “gore-tex” and “gore tex”).
  • Maintain stopword lists per language; don’t drop meaningful words (“pro”, “max”).
  • Generate bigrams for frequent units (e.g., “air max”, “trail running”).
  • Log parsed terms and field hits for debugging and A/Bs.
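
Two of the practices above, hyphen/space normalization and soft stopword handling, might look like the following sketch. The word lists, the 0.2 weight, and the function names are illustrative assumptions rather than recommended values.

```python
import re

# Hypothetical values: a small stopword list, a low weight for stopwords,
# and a protected list of short words that carry meaning in product names.
STOPWORDS = {"the", "for", "with", "a", "of"}
PROTECTED = {"pro", "max"}
STOPWORD_WEIGHT = 0.2

def hyphen_variants(token: str) -> list[str]:
    """Return the spaced, joined, and hyphenated forms of a compound."""
    parts = [p for p in re.split(r"[-\s]+", token.lower()) if p]
    if len(parts) < 2:
        return [token.lower()]
    return [" ".join(parts), "".join(parts), "-".join(parts)]

def term_weights(terms: list[str]) -> dict[str, float]:
    """Down-weight stopwords instead of dropping them; never touch protected words."""
    weights = {}
    for t in terms:
        is_soft = t in STOPWORDS and t not in PROTECTED
        weights[t] = STOPWORD_WEIGHT if is_soft else 1.0
    return weights

print(hyphen_variants("GORE-TEX"))  # ['gore tex', 'goretex', 'gore-tex']
print(term_weights(["shoes", "for", "the", "pro"]))
```

Indexing or querying all three variants lets “gore-tex”, “gore tex”, and “goretex” reach the same products, while the weights let a scorer treat stopwords as weak evidence rather than discarding them.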

Challenges

  • Short queries (“nike”), code-switching, ambiguous tokens (“apple”), trademark/brand casing, and noisy punctuation.

Examples

  • Input: “men’s gore tex trail shoes 45” → terms: men, gore, tex, trail, shoes, 45; phrase: gore tex; entity: size=45.
  • Input: “AB-1234” → routed to the exact SKU field, with fuzzy matching disabled.
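
The routing in these two examples can be approximated with a pattern check. This sketch assumes a SKU shape of 2 to 4 letters, an optional hyphen, and 3 to 6 digits, and treats a trailing number between 35 and 49 as an EU shoe size. Both rules, along with the field names sku_exact and text, are made up for illustration and would need to match your own catalog.

```python
import re

# Assumed SKU shape: 2-4 letters, optional hyphen, 3-6 digits (e.g. "AB-1234").
SKU_PATTERN = re.compile(r"^[A-Za-z]{2,4}-?\d{3,6}$")
# Assumed EU shoe-size range for a trailing number.
SIZE_RANGE = range(35, 50)

def route(query: str) -> dict:
    """Send SKU-like input to an exact field; analyze everything else as text."""
    q = query.strip()
    if SKU_PATTERN.match(q):
        return {"field": "sku_exact", "value": q.upper(), "fuzzy": False}
    # Drop possessive 's, then keep runs of ASCII letters and digits.
    tokens = re.findall(r"[a-z0-9]+", re.sub(r"[’']s\b", "", q.lower()))
    entities = {}
    if tokens and tokens[-1].isdigit() and int(tokens[-1]) in SIZE_RANGE:
        entities["size"] = int(tokens[-1])
    return {"field": "text", "terms": tokens, "entities": entities, "fuzzy": True}

print(route("AB-1234"))
# {'field': 'sku_exact', 'value': 'AB-1234', 'fuzzy': False}
print(route("men’s gore tex trail shoes 45")["entities"])
# {'size': 45}
```

Checking for a SKU before any tokenization keeps the hyphen and casing intact, which is why exact identifiers are usually routed first.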

Summary

Query terms are the analyzed building blocks of retrieval. Get tokenization, normalization, and field mapping right to keep results fast, precise, and explainable.

FAQ

Query terms vs keywords?

“Keywords” are SEO/strategy targets; query terms are the analyzed tokens used at retrieval time.

Should I always remove stopwords?

Usually down-weight them rather than remove them; some carry meaning in product names (for example, “the” in “The North Face”).

Do vectors replace terms?

No—keep lexical terms for precision, filters, and speed; vectors add meaning.