GLOSSARY

Connector

A connector moves data from your source systems (CMS, PIM, ERP, feeds) into the search index. In online stores, it keeps products, prices, stock, and content in sync so results stay fresh and accurate.

What is a Connector?

A connector is an integration that extracts, transforms, and loads (ETL/ELT) data from a source system into your search platform. It maps fields, handles updates/deletes, and preserves permissions (ACLs) so the index mirrors the truth in upstream systems.
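
As a rough illustration of "preserving permissions", the sketch below copies a source record's access list onto the index document and filters results by the caller's groups. The field names (`acl`, `allowed_groups`) and the in-memory `search` function are assumptions made for this example, not any particular platform's API.

```python
from typing import Any


def to_index_doc(raw: dict[str, Any]) -> dict[str, Any]:
    """Copy the source's access list onto the index document (hypothetical fields)."""
    return {
        "id": str(raw["id"]),
        "title": raw["title"],
        "allowed_groups": sorted(set(raw.get("acl", []))),
    }


def search(docs: list[dict[str, Any]], query: str, user_groups: set[str]) -> list[str]:
    """Naive title match, trimmed to documents the user is allowed to see."""
    return [
        d["id"]
        for d in docs
        if query.lower() in d["title"].lower()
        and set(d["allowed_groups"]) & user_groups  # no shared group: hidden
    ]


source = [
    {"id": 1, "title": "Wholesale price list", "acl": ["b2b"]},
    {"id": 2, "title": "Retail price list", "acl": ["public", "b2b"]},
]
docs = [to_index_doc(r) for r in source]
public_hits = search(docs, "price", {"public"})
b2b_hits = search(docs, "price", {"b2b"})
```

In this sketch a record with an empty access list is visible to nobody, which is a deliberate deny-by-default choice.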

How it Works (quick)

  • Extract: Pull via API, webhooks, database dumps, feeds, or crawlers.
  • Transform & map: Normalize fields (title, brand, attributes, price, availability), fix encodings, enrich with IDs.
  • Load & sync: Initial full load → incremental updates (webhooks or change data capture, CDC) → reindex on schema changes.
  • Security: Carry ACLs/visibility flags into the index so results can be security-trimmed at query time.
  • Observability: Logs, retries, dead-letter queues, and metrics (lag, success rate).
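
The extract, transform, and load steps above can be sketched in a few lines. This is a minimal in-memory illustration, assuming a source that exposes an `updated_at` counter per row; the `SearchIndex` class and the source field names (`sku`, `name`, `vendor`, `stock`) are hypothetical stand-ins rather than a real platform's schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SearchIndex:
    """Stand-in for a search index: documents keyed by a stable ID."""
    docs: dict[str, dict[str, Any]] = field(default_factory=dict)

    def upsert(self, doc: dict[str, Any]) -> None:
        self.docs[doc["id"]] = doc


def transform(raw: dict[str, Any]) -> dict[str, Any]:
    """Map hypothetical source fields onto the index schema."""
    return {
        "id": str(raw["sku"]),
        "title": raw["name"].strip(),
        "brand": raw.get("vendor", "").strip(),
        "price": round(float(raw["price"]), 2),
        "available": int(raw.get("stock", 0)) > 0,
        "updated_at": raw["updated_at"],
    }


def sync(source_rows: list[dict[str, Any]], index: SearchIndex, cursor: int = 0) -> int:
    """Load rows changed after `cursor` and return the new cursor.

    With cursor=0 this behaves as a full load; later calls are incremental.
    """
    newest = cursor
    for raw in source_rows:
        if raw["updated_at"] <= cursor:
            continue  # unchanged since the last run
        index.upsert(transform(raw))
        newest = max(newest, raw["updated_at"])
    return newest


rows = [
    {"sku": "A1", "name": " Trail Shoe ", "vendor": "Acme", "price": "79.90", "stock": 4, "updated_at": 100},
    {"sku": "B2", "name": "Rain Jacket", "price": "120", "stock": 0, "updated_at": 105},
]
index = SearchIndex()
cursor = sync(rows, index)             # initial full load
rows[1].update(stock=7, updated_at=110)
cursor = sync(rows, index, cursor)     # incremental: only B2 is reloaded
```

A real connector would persist the cursor between runs and read changes from webhooks or a CDC stream instead of rescanning a list.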

Why it Matters in E-commerce

  • Freshness: Price/stock changes land in search within minutes.
  • Consistency: One schema across PDPs, collections, and autocomplete.
  • Speed to launch: Rapid marketplace/CMS integrations without custom pipelines.

Best Practices

  • Idempotent upserts; stable primary keys.
  • Change Data Capture or webhooks over blind polling.
  • Schema versioning + backfill playbooks.
  • Throttle/queue to respect source rate limits.
  • Deduplicate by SKU/GTIN; normalize variants.
  • Carry deletion flags (tombstones) to avoid ghost results.
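
A minimal sketch of idempotent upserts with tombstones, assuming each change event carries a stable key (`sku`), an increasing `version`, and an optional `deleted` flag. These names are illustrative assumptions; real sources expose their own identifiers and ordering fields.

```python
from typing import Any

Index = dict[str, dict[str, Any]]


def apply_event(index: Index, event: dict[str, Any]) -> bool:
    """Apply one change event; return True if the index changed."""
    key = event["sku"]
    current = index.get(key)
    if current is not None and current["version"] >= event["version"]:
        return False  # duplicate or stale event: safe to ignore
    if event.get("deleted"):
        # Keep the tombstone so a late, older update cannot resurrect the item.
        index[key] = {"version": event["version"], "deleted": True}
    else:
        index[key] = {"version": event["version"], "deleted": False, "doc": event["doc"]}
    return True


def visible(index: Index) -> list[str]:
    """Keys that should appear in results (tombstoned entries are hidden)."""
    return sorted(k for k, v in index.items() if not v["deleted"])


index: Index = {}
events = [
    {"sku": "A1", "version": 1, "doc": {"title": "Trail Shoe"}},
    {"sku": "A1", "version": 2, "deleted": True},
    {"sku": "A1", "version": 1, "doc": {"title": "Trail Shoe"}},  # late replay
    {"sku": "B2", "version": 1, "doc": {"title": "Rain Jacket"}},
]
results = [apply_event(index, e) for e in events]
```

Because stale and repeated events are ignored, replaying the same batch after a failure leaves the index unchanged, which is what makes retries safe.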

Challenges

  • Field drift between systems, partial updates, and pagination limits.
  • API throttling, network hiccups, and large backfills.
  • Permission mismatches (leaky ACLs).
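
One common way to absorb throttling and network hiccups is to retry with exponential backoff and park records that keep failing in a dead-letter list. The sketch below assumes a hypothetical `send` callable that raises `ConnectionError` on transient failures; the attempt limit and delays are placeholder values.

```python
import time
from typing import Any, Callable


def deliver(
    batch: list[dict[str, Any]],
    send: Callable[[dict[str, Any]], None],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict[str, Any]]:
    """Send each record, retrying with exponential backoff.

    Records that still fail after `max_attempts` are returned as the
    dead-letter list instead of blocking the rest of the batch.
    """
    dead_letters = []
    for record in batch:
        for attempt in range(1, max_attempts + 1):
            try:
                send(record)
                break
            except ConnectionError as exc:
                if attempt == max_attempts:
                    dead_letters.append({"record": record, "error": str(exc)})
                else:
                    sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, 1 s, ...
    return dead_letters


# Demo with a fake sender: "A1" fails once and then succeeds, "B2" always fails.
calls: dict[str, int] = {}


def flaky_send(record: dict[str, Any]) -> None:
    calls[record["id"]] = calls.get(record["id"], 0) + 1
    if record["id"] == "B2" or calls[record["id"]] == 1:
        raise ConnectionError("upstream unavailable")


delays: list[float] = []
dead = deliver([{"id": "A1"}, {"id": "B2"}], flaky_send, sleep=delays.append)
```

The `sleep` parameter is injected so the demo records delays instead of waiting; production code would typically add jitter and honor any retry hints the source API returns.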

Examples

  • Shopify/BigCommerce → Search index (products, inventory, collections).
  • PIM/ERP connector → attributes, price lists, availability windows.
  • Reviews/help center connector → UGC and support content.

Summary

Connectors are the backbone of fresh, permission-aware search. With reliable deltas, clear mappings, and good observability, your index stays current and trustworthy.

FAQ

Connector vs crawler? A connector reads structured sources and APIs; a crawler fetches HTML pages.

How fast should sync be? Aim for under 5 minutes for price and stock; nightly is usually enough for heavy content.

What about deletes? Use tombstones or hard deletes with audit logs.

Multi-locale? Carry locale fields and currency/units per market.
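
To make the multi-locale answer concrete, here is one possible shape: a single source product expanded into one index document per market, each with its own title, price, and currency. The `markets` structure and the `sku:locale` key format are assumptions for illustration only.

```python
from typing import Any


def expand_locales(product: dict[str, Any]) -> list[dict[str, Any]]:
    """Emit one index document per market from a single source product."""
    docs = []
    for locale, market in product["markets"].items():
        docs.append({
            "id": f'{product["sku"]}:{locale}',  # stable key per SKU and locale
            "sku": product["sku"],
            "locale": locale,
            "title": market["title"],
            "price": market["price"],
            "currency": market["currency"],
        })
    return docs


product = {
    "sku": "A1",
    "markets": {
        "en-US": {"title": "Trail Shoe", "price": 79.90, "currency": "USD"},
        "de-DE": {"title": "Trailschuh", "price": 74.90, "currency": "EUR"},
    },
}
docs = expand_locales(product)
```

Keeping the SKU as a separate field lets the search layer filter by locale while still grouping the per-market documents back to one product.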