GLOSSARY

Controlled Vocabulary

A controlled vocabulary is a managed list of allowed terms and their synonyms. In online stores, it keeps product attributes consistent so that filters, search, and analytics work reliably.

What is a Controlled Vocabulary?

A controlled vocabulary (CV) is a curated set of preferred terms with synonyms, IDs, and definitions used to label content and products consistently. It underpins taxonomies, facets, schema markup, and analytics, reducing duplicates and confusion.

How It Works (quick)

  • Structure: Preferred label, alternate labels (synonyms), stable ID, description, and relationships (broader/narrower/related).
  • Localization: Store labels per locale; preserve brand casing and accents rather than translating or re-casing brand names.
  • Governance: Change requests, approvals, deprecations, and mappings (old → new).
  • Integration: Map inbound data to CV terms; validate at import; write canonical tags to the index.
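
The structure and import-time validation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the Term class, its field names, the IDs (mat-001, mat-002), and the helper functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                                       # stable ID, independent of any label
    pref_label: str                                    # preferred label
    alt_labels: list = field(default_factory=list)     # synonyms
    description: str = ""
    broader: list = field(default_factory=list)        # IDs of broader terms
    locale_labels: dict = field(default_factory=dict)  # locale code -> label


def _norm(label):
    """Casefold and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.casefold().split())


def build_index(terms):
    """Map every normalized label (preferred, alternate, localized) to a term ID."""
    index = {}
    for term in terms:
        for label in [term.pref_label, *term.alt_labels, *term.locale_labels.values()]:
            owner = index.setdefault(_norm(label), term.term_id)
            if owner != term.term_id:
                raise ValueError(f"label {label!r} maps to both {owner} and {term.term_id}")
    return index


def validate_rows(rows, attribute, index):
    """Split rows into (accepted, rejected); accepted rows gain a canonical term ID."""
    accepted, rejected = [], []
    for row in rows:
        term_id = index.get(_norm(row.get(attribute, "")))
        if term_id is None:
            rejected.append(row)
        else:
            accepted.append({**row, f"{attribute}_id": term_id})
    return accepted, rejected


# Hypothetical vocabulary and inbound rows, for illustration only.
TERMS = [
    Term("mat-001", "Wool"),
    Term("mat-002", "Merino", alt_labels=["merino wool"],
         broader=["mat-001"], locale_labels={"de-DE": "Merinowolle"}),
]
INDEX = build_index(TERMS)
accepted, rejected = validate_rows(
    [{"sku": "A1", "material": " Merino  Wool "},
     {"sku": "A2", "material": "mystery fibre"}],
    "material",
    INDEX,
)
```

Rejected rows are a natural input to the governance step: each one is either a data error or a candidate change request for a new term or synonym.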

Why It Matters in E-commerce

  • Consistent filters & SEO: Uniform attributes (brand, material, fit) power facets and clean URLs.
  • Better search recall: Synonyms map to the same canonical term (“tee shirt” → “T-shirt”), so a query for either form finds the same products.
  • Reliable analytics: Aggregations aren’t split across near-duplicates.
  • Faster ops: Less back-and-forth on naming and spelling.
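
To make the analytics point concrete, the sketch below counts the same hypothetical order lines twice, once on raw values and once after mapping to canonical terms. The synonym map and the sample data are invented for illustration.

```python
from collections import Counter

# Hypothetical synonym map: normalized raw value -> canonical label.
CANONICAL = {
    "tee shirt": "T-shirt",
    "t shirt": "T-shirt",
    "t-shirt": "T-shirt",
}


def canonicalize(raw):
    """Return the canonical label for a raw value, or the raw value if unmapped."""
    key = " ".join(raw.casefold().split())
    return CANONICAL.get(key, raw)


# Invented sample data: five order lines, four of them spelling the same product type.
order_lines = ["tee shirt", "T-Shirt", "Tee  Shirt", "t shirt", "Hoodie"]

raw_counts = Counter(order_lines)                                  # five buckets of one
canonical_counts = Counter(canonicalize(v) for v in order_lines)   # T-shirt: 4, Hoodie: 1
```

On raw values the T-shirt sales are split across four buckets; after canonicalization they aggregate into one.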

Best Practices

  • Assign stable IDs (don’t key on labels).
  • Keep a synonym table, including locale variations.
  • Separate taxonomy (hierarchy) from vocabulary (terms); link them.
  • Maintain a request log and deprecation map; publish release notes.
  • Expose a lookup API or mapping service to all pipelines.
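
One way a lookup service might honor the deprecation map is to resolve any retired ID to its current replacement before use. The sketch below assumes the map is a plain dictionary of old ID → new ID; the IDs and the function name are hypothetical.

```python
# Hypothetical deprecation map: retired term ID -> replacement term ID.
DEPRECATIONS = {
    "brand-0007": "brand-0042",
    "brand-0042": "brand-0099",
}


def resolve_term_id(term_id, deprecations=DEPRECATIONS):
    """Follow old -> new mappings until an ID with no replacement is reached."""
    seen = set()
    while term_id in deprecations:
        if term_id in seen:
            # A cycle means the map itself is broken; fail loudly rather than loop forever.
            raise ValueError(f"deprecation cycle at {term_id}")
        seen.add(term_id)
        term_id = deprecations[term_id]
    return term_id
```

Because IDs are stable and labels are not, data written against a retired ID still resolves after a rename or merge, which is the practical payoff of not keying on labels.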

Challenges

  • Drift & sprawl: Vendors invent new spellings; brands change lines.
  • Granularity fights: Too fine = bloat; too coarse = weak filters.
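  • Multilingual pitfalls: One term in a language may map to several in another (one-to-many), and false friends look alike across languages but mean different things.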

Examples

  • Normalize “gore tex”, “GORETEX” → GORE-TEX (brand).
  • Map “sneakers”, “trainers” → Sneakers (category term).
  • Attribute “merino wool” → Merino (material), with Wool as the broader class.
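
The three examples above could be expressed as a small lookup, sketched here under the assumption that spacing, hyphens, and case are the only differences between variants. The term IDs, the tuple layout, and the function names are illustrative, not part of any particular system.

```python
import re

# Illustrative entries: (term ID, preferred label, kind, alternate labels, broader ID).
VOCAB = [
    ("brand-001", "GORE-TEX", "brand", ["gore tex", "goretex"], None),
    ("cat-001", "Sneakers", "category", ["trainers"], None),
    ("mat-001", "Wool", "material", [], None),
    ("mat-002", "Merino", "material", ["merino wool"], "mat-001"),
]


def _key(label):
    """Casefold and drop spaces, hyphens, and other punctuation so spelling variants collide."""
    return re.sub(r"[\W_]+", "", label.casefold())


# Build the lookup once: every preferred and alternate label points at its term.
LOOKUP = {}
for term_id, pref, kind, alts, broader in VOCAB:
    for label in [pref, *alts]:
        LOOKUP[_key(label)] = (term_id, pref, kind, broader)


def normalize(raw):
    """Return (term ID, preferred label, kind, broader ID), or None if the value is unmapped."""
    return LOOKUP.get(_key(raw))
```

With these entries, normalize("GORETEX") and normalize("gore tex") both return the GORE-TEX brand term, and normalize("Merino Wool") returns the Merino term with mat-001 (Wool) as its broader ID. Stripping all punctuation is an aggressive choice that can merge labels that should stay distinct, so a real vocabulary may need a gentler key.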

Summary

A controlled vocabulary is the backbone of consistent labeling. Govern it carefully, localize it, and integrate it into every data pipeline to improve filters, SEO, and reporting.

FAQ

CV vs taxonomy vs ontology?

CV = term list with synonyms; taxonomy = hierarchical organization; ontology = typed relations and rules.

Who owns the vocabulary?

Usually merchandising/content ops with input from SEO and engineering; treat it like a product with versioning.

How often should it be updated?

Whenever new brands or lines launch, when seasonal terms appear, and when analytics show fragmentation across near-duplicate values.