GLOSSARY

Truncation

Truncation cuts off text or tokens beyond a set length. In search, it limits query or document size for performance.

What is Truncation?

Truncation is the process of shortening text by dropping any characters, tokens, or fields that fall beyond a fixed limit. In search and NLP, it keeps overly long queries or documents from slowing down indexing and ranking.

How It Works (quick)

  • Query truncation: Limit tokens (e.g., first 50 terms).
  • Document truncation: Cap how much of each field is indexed, so very long fields do not slow indexing or search.
  • Vector models: Cap sequence length (e.g., 512 tokens).
  • UI truncation: Shorten displayed snippets with ellipses.
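
As a rough illustration of the query and UI cases above, here is a minimal Python sketch. The function names, the 50-term default, and the 160-character snippet limit are assumptions chosen for the example, not settings from any particular search engine. It also assumes terms are separated by whitespace and that max_chars is larger than the ellipsis itself.

```python
from __future__ import annotations


def truncate_query(query: str, max_terms: int = 50) -> tuple[str, bool]:
    """Keep only the first max_terms whitespace-separated terms.

    Returns the (possibly shortened) query and a flag that says
    whether anything was cut.
    """
    terms = query.split()
    if len(terms) <= max_terms:
        return " ".join(terms), False
    return " ".join(terms[:max_terms]), True


def truncate_snippet(text: str, max_chars: int = 160, ellipsis: str = "...") -> str:
    """Shorten display text to at most max_chars, ending with an ellipsis."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= max_chars:
        return text
    budget = max_chars - len(ellipsis)  # room left for the text itself
    cut = text[:budget]
    # If the cut lands mid-word, back up to the previous space
    # (unless there is no space, e.g. one very long token).
    if text[budget] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip() + ellipsis


if __name__ == "__main__":
    print(truncate_query("red running shoes for men size 10", max_terms=3))
    print(truncate_snippet("red running shoes for men", max_chars=15))
```

Cutting at a word boundary is a design choice for readability; a hard character cut is simpler but can leave half a word before the ellipsis.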

Why It Matters in E-commerce

  • Performance: Prevents slowdowns from very long queries.
  • Clarity: Keeps snippets readable on search results pages (SERPs).
  • Safety: Reduces the room for abuse through oversized or “injection”-style queries.

Best Practices

  • Set reasonable limits per field (title vs description).
  • Log truncated queries for review.
  • Provide “read more” for truncated text in UI.
  • Test truncation lengths for SEO snippets (meta descriptions).
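
The first two practices can be sketched together in Python as below. The field names, the limits in FIELD_LIMITS, the logger name, and both function names are hypothetical values picked for illustration. The sketch uses only the standard logging module and assumes documents are flat dictionaries of strings.

```python
from __future__ import annotations

import logging
from typing import Optional

logger = logging.getLogger("search.truncation")  # hypothetical logger name

# Hypothetical per-field character limits: titles are kept short,
# descriptions get more room. Fields without an entry are left alone.
FIELD_LIMITS = {"title": 120, "description": 2000}


def truncate_document(
    doc: dict[str, str], limits: Optional[dict[str, int]] = None
) -> dict[str, str]:
    """Return a copy of doc with each limited field cut to its limit."""
    limits = FIELD_LIMITS if limits is None else limits
    out: dict[str, str] = {}
    for field, value in doc.items():
        limit = limits.get(field)
        if limit is not None and len(value) > limit:
            logger.info(
                "truncated field=%s original_len=%d limit=%d",
                field, len(value), limit,
            )
            value = value[:limit]
        out[field] = value
    return out


def limit_query(query: str, max_terms: int = 50) -> str:
    """Cut a query to max_terms terms and log it when it was cut."""
    terms = query.split()
    if len(terms) > max_terms:
        # Log the original length so truncated queries can be reviewed later.
        logger.info(
            "truncated query original_terms=%d limit=%d", len(terms), max_terms
        )
        terms = terms[:max_terms]
    return " ".join(terms)
```

Logging only the lengths, rather than the full query text, is one way to review truncation rates without storing raw user input; whether that is enough detail depends on the review process.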

Summary

Truncation keeps search systems fast and results tidy by cutting overly long text. Done well, it balances performance with readability.