GLOSSARY

Ingestion Rate

Ingestion rate is how fast new or updated records enter the search index. For online stores, a high, steady rate keeps prices, stock, and content fresh.

What is Ingestion Rate?

Ingestion rate measures the throughput of data flowing into your search and indexing pipeline, usually in records per second, records per minute, or MB/s. It reflects how quickly connectors or crawlers, processors, and indexers can accept updates.

How It Works (quick)

  • Sources → pipeline: Connectors, webhooks, change data capture (CDC), or crawlers produce deltas (only the records that changed).
  • Processing: Clean/normalize/enrich; compute embeddings/fields.
  • Indexing: Batch or streaming writes; segment creation; background merges.
  • Measurement: Track events/sec, records/sec, bytes/sec, and end-to-end lag (source timestamp → searchable).
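The measurement step above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `IngestEvent` type, its field names, and the `summarize` function are hypothetical, and it assumes each record carries both a source timestamp and the time it became searchable.

```python
import statistics
from dataclasses import dataclass


@dataclass
class IngestEvent:
    source_ts: float      # when the change happened at the source (epoch seconds)
    searchable_ts: float  # when the record became searchable (epoch seconds)
    size_bytes: int       # payload size of the record


def summarize(events, window_seconds):
    """Return throughput and end-to-end lag figures for one measurement window."""
    if not events or window_seconds <= 0:
        return {"records_per_sec": 0.0, "bytes_per_sec": 0.0,
                "p50_lag_s": None, "max_lag_s": None}
    # End-to-end lag is source timestamp -> searchable, per record.
    lags = sorted(e.searchable_ts - e.source_ts for e in events)
    return {
        "records_per_sec": len(events) / window_seconds,
        "bytes_per_sec": sum(e.size_bytes for e in events) / window_seconds,
        "p50_lag_s": statistics.median(lags),
        "max_lag_s": lags[-1],
    }
```

Reporting both rate and lag matters because a pipeline can show a healthy records/sec figure while lag keeps climbing, which usually means arrivals exceed capacity.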

Why It Matters in E-commerce

  • Freshness & trust: Price/stock changes must appear fast.
  • Campaign agility: New collections and content go live promptly.
  • Stability: Smooth throughput avoids backlog during peaks.

Best Practices

  • Delta-first: Prefer webhooks/CDC over polling; deduplicate by primary key.
  • Backpressure: Queue with retries and a dead-letter queue (DLQ); shed load gracefully; alert on lag.
  • Parallelism: Scale workers; batch small docs; compress over the wire.
  • Idempotency: Upserts keyed by SKU/ID; tombstones for deletes.
  • SLOs: Define ingest SLOs (e.g., stock ≤5 min, price ≤5 min, content ≤1 h) and monitor.
  • Observability: Dashboards for input rate, processing time, error rate, queue depth, end-to-end latency.
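As a rough sketch of the delta-first and idempotency practices above, the following deduplicates a batch by key and applies it as versioned upserts, with deletes kept as tombstones. It assumes each change carries a monotonically increasing `version` per SKU, which the text does not specify. The in-memory `dict` stands in for a real index, and `apply_batch` and `visible` are hypothetical names.

```python
def apply_batch(index, changes):
    """Apply changes to `index` (a dict keyed by SKU).

    Replaying the same batch, or delivering an older change late,
    leaves the index in the same state.
    """
    # Deduplicate by primary key: keep only the newest change per SKU.
    latest = {}
    for change in changes:
        key = change["sku"]
        if key not in latest or change["version"] >= latest[key]["version"]:
            latest[key] = change

    for key, change in latest.items():
        current = index.get(key)
        if current is not None and current["version"] >= change["version"]:
            continue  # stale or replayed change: ignore it
        # Deletes are stored as tombstones ({"deleted": True}) rather than
        # removed, so a late, older update cannot resurrect the record.
        index[key] = change
    return index


def visible(index):
    """Records a search would return: everything except tombstones."""
    return {k: v for k, v in index.items() if not v.get("deleted")}
```

Keeping the tombstone around is a design choice: it costs a little storage but makes out-of-order delivery safe. A real system would eventually purge old tombstones.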

Challenges

  • Vendor API throttling.
  • Bursty traffic (sales, feeds).
  • Large images and documents.
  • Embedding generation cost.
  • Reindex migrations.

Examples

  • Black Friday feed pushes 10k records/min (about 167 records/sec) with stable end-to-end lag under 3 minutes.
  • Price change webhook flow keeps ≤5 min freshness across locales.
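Freshness targets like these can be checked mechanically. The sketch below uses the example targets from Best Practices (stock and price within 5 minutes, content within 1 hour). The `slo_breaches` function and the shape of its input are assumptions for illustration.

```python
# Example freshness targets in seconds, taken from the SLO bullet above.
FRESHNESS_SLO_SECONDS = {"stock": 300, "price": 300, "content": 3600}


def slo_breaches(lags_by_type, slos=FRESHNESS_SLO_SECONDS):
    """Return {record_type: worst_lag_seconds} for every type over its target.

    `lags_by_type` maps a record type to the observed end-to-end lags
    (in seconds) for the window being checked.
    """
    breaches = {}
    for record_type, lags in lags_by_type.items():
        target = slos.get(record_type)
        if target is None or not lags:
            continue  # no target defined, or nothing observed this window
        worst = max(lags)
        if worst > target:
            breaches[record_type] = worst
    return breaches
```

For instance, `slo_breaches({"stock": [30, 120], "price": [200, 420], "content": [900]})` flags only `price`, since 420 seconds exceeds the 300-second target. Checking the worst lag is strict; a percentile may be a better fit if occasional outliers are acceptable.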

Summary

Ingestion rate is a leading indicator of freshness. Build delta-first pipelines with backpressure, idempotent upserts, and clear SLOs—then monitor rate and lag so storefront data stays current.

FAQ

Ingestion rate vs indexing throughput?

Ingestion rate covers the whole pipeline, from source to searchable; indexing throughput is only the rate at which the writer commits documents to the index.

How to raise the rate safely?

Increase parallel workers, batch writes, and tune segment merges, while watching tail latency and error spikes.
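Batching writes can be as simple as grouping records until a document-count or byte limit is reached. The following is a sketch under assumed limits: `max_docs` and `max_bytes` are illustrative defaults, not recommendations, and `batched` is a hypothetical helper.

```python
import json


def batched(records, max_docs=500, max_bytes=1_000_000):
    """Yield lists of records, each within the doc-count and byte limits.

    A single record larger than `max_bytes` is still emitted, alone in
    its own batch, rather than dropped.
    """
    batch, batch_bytes = [], 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        # Flush before adding if this record would push the batch over a limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch  # flush the final, possibly partial, batch
```

Each yielded batch would then go to the index in one bulk write. Larger batches cut per-request overhead but raise tail latency and the cost of a failed request, so the limits are worth tuning against both.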

What if the queue grows?

Scale workers or temporarily relax enrichment (e.g., defer embeddings) to drain backlog.
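To make that answer concrete, here is a small sketch of both levers: skipping the embedding step while the backlog is deep, and estimating how long a drain will take. The threshold, step names, and function names are all hypothetical, and the drain estimate assumes steady arrival and processing rates.

```python
def plan_enrichment(queue_depth, defer_above=10_000):
    """Choose enrichment steps; skip embeddings while the backlog is deep."""
    steps = ["normalize", "compute_fields"]
    if queue_depth <= defer_above:
        steps.append("embed")  # expensive step, run only when the queue is healthy
    return steps


def drain_seconds(backlog, arrival_per_sec, capacity_per_sec):
    """Estimate seconds to clear `backlog` records.

    Returns None when capacity does not exceed the arrival rate,
    because the queue will then never drain.
    """
    surplus = capacity_per_sec - arrival_per_sec
    if surplus <= 0:
        return None
    return backlog / surplus
```

With a backlog of 30,000 records, arrivals at 150 records/sec, and capacity of 250 records/sec, `drain_seconds(30_000, 150, 250)` gives 300 seconds. Records processed while embeddings are deferred would need a later backfill pass so they do not stay without embeddings.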