What is a Word Embedding?
A word embedding is a vector representation that maps words into a continuous, multi-dimensional space. Words with similar meanings (e.g., “king” and “queen”) sit close together, which enables semantic similarity in search and other NLP tasks.
How It Works (quick)
- Training: Models like Word2Vec, GloVe, or fastText learn embeddings from large corpora.
- Representation: Each word → dense vector of real numbers.
- Similarity: Compare vectors with cosine similarity (see the sketch after this list).
- Use cases: Synonym detection, query expansion, semantic ranking.
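To make these steps concrete, here is a minimal sketch that trains toy embeddings with gensim’s Word2Vec on a handful of tokenized product titles and compares words with cosine similarity. The corpus, hyperparameters, and seed are illustrative placeholders, not tuned values.

```python
# A minimal sketch, assuming gensim is installed; the tiny corpus,
# hyperparameters, and seed are illustrative placeholders.
from gensim.models import Word2Vec

# Toy corpus: each inner list is a tokenized product title.
corpus = [
    ["black", "hoodie", "cotton", "unisex"],
    ["grey", "sweatshirt", "cotton", "unisex"],
    ["black", "sweatshirt", "fleece"],
    ["grey", "hoodie", "fleece"],
    ["brown", "leather", "wallet"],
    ["brown", "leather", "belt"],
]

# Training: learn a dense vector for every word in the corpus.
model = Word2Vec(sentences=corpus, vector_size=32, window=3,
                 min_count=1, epochs=200, seed=42)

# Representation: each word is now a dense vector of real numbers.
print(model.wv["hoodie"].shape)                     # (32,)

# Similarity: cosine similarity between two word vectors.
# On this toy corpus the exact numbers are noisy; with a real corpus,
# "hoodie"/"sweatshirt" scores far higher than "hoodie"/"wallet".
print(model.wv.similarity("hoodie", "sweatshirt"))
print(model.wv.similarity("hoodie", "wallet"))

# Use case: nearest neighbours double as synonym candidates for query expansion.
print(model.wv.most_similar("hoodie", topn=3))
```

In practice you would train on millions of product titles, descriptions, and queries rather than a toy list, but the training/representation/similarity flow is the same.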
Why It Matters in E-commerce
- Helps match queries like “hoodie” with products labeled “sweatshirt” (see the ranking sketch after this list).
- Improves recall on long-tail queries that lack exact keyword matches.
- Powers recommendations and cross-language search.
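As a rough illustration of the “hoodie” / “sweatshirt” case, the sketch below mean-pools word vectors to embed product titles and ranks them against a query by cosine similarity. The `vectors` dict and its random values are a hypothetical stand-in for real trained embeddings.

```python
# A minimal sketch of semantic ranking with mean-pooled word vectors;
# `vectors` is a hypothetical placeholder for trained embeddings.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["hoodie", "sweatshirt", "grey", "cotton", "wallet", "leather", "brown"]
vectors = {w: rng.normal(size=32) for w in vocab}   # placeholder vectors
# Pretend "sweatshirt" was learned to be a near-synonym of "hoodie".
vectors["sweatshirt"] = vectors["hoodie"] + rng.normal(scale=0.1, size=32)

def embed(text: str) -> np.ndarray:
    """Mean-pool the vectors of the known tokens in `text`."""
    vecs = [vectors[t] for t in text.lower().split() if t in vectors]
    return np.mean(vecs, axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

products = ["grey cotton sweatshirt", "brown leather wallet"]
q = embed("hoodie")
ranked = sorted(products, key=lambda p: cosine(q, embed(p)), reverse=True)
# The sweatshirt ranks first even though "hoodie" never appears in its title.
print(ranked)
```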
Best Practices
- Use domain-specific embeddings (trained on product data).
- Regularly retrain to keep vocabulary fresh.
- Combine embeddings with lexical (keyword) search in a hybrid setup (see the sketch after this list).
- Monitor for biases in training data.
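One common way to set up hybrid retrieval is a weighted blend of a lexical score and an embedding-based score. The sketch below uses simple token overlap as the lexical signal (a real system would use something like BM25), a toy vector table for the semantic signal, and an assumed blend weight `alpha`; all names and values are illustrative.

```python
# A minimal sketch of hybrid scoring: a weighted blend of a lexical score
# (token overlap here; BM25 in practice) and an embedding cosine score.
# `vectors`, the toy documents, and alpha=0.5 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
vectors = {w: rng.normal(size=16) for w in ["hoodie", "sweatshirt", "zip", "grey", "mug"]}
vectors["sweatshirt"] = vectors["hoodie"] + rng.normal(scale=0.1, size=16)  # near-synonyms in this toy setup

def tokens(text):
    return text.lower().split()

def lexical_score(query, doc):
    """Fraction of query tokens that literally appear in the document."""
    q, d = set(tokens(query)), set(tokens(doc))
    return len(q & d) / max(len(q), 1)

def semantic_score(query, doc):
    """Cosine similarity of mean-pooled word vectors (0 if no tokens are known)."""
    def embed(text):
        vecs = [vectors[t] for t in tokens(text) if t in vectors]
        return np.mean(vecs, axis=0) if vecs else None
    a, b = embed(query), embed(doc)
    if a is None or b is None:
        return 0.0
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_score(query, doc, alpha=0.5):
    """Blend lexical and semantic evidence; alpha tunes the balance."""
    return alpha * lexical_score(query, doc) + (1 - alpha) * semantic_score(query, doc)

docs = ["grey zip hoodie", "grey sweatshirt", "grey mug"]
for doc in sorted(docs, key=lambda d: hybrid_score("grey hoodie", d), reverse=True):
    print(round(hybrid_score("grey hoodie", doc), 3), doc)
```

In production the lexical side usually comes from the search engine itself (e.g., BM25 scores), and the blend weight is typically tuned against click or conversion data rather than fixed by hand.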
Summary
Word embeddings capture meaning in numbers, enabling semantic search and smarter product discovery.