A crawler discovers and fetches pages so they can be indexed. For stores, it finds categories and products, respects robots rules, and helps keep search results up to date.
A crawler (spider/robot) is software that discovers URLs, fetches their content, and passes data to an indexer. Crawlers can be public (e.g., search engines) or private (your site crawler for internal search).
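To make that discover-fetch-index loop concrete, here is a minimal sketch of a polite single-host crawler. The seed URL, the `index_page` hand-off, and the regex-based link extraction are illustrative assumptions, not any particular product's API.

```python
import re
import time
import urllib.request
from collections import deque
from urllib.parse import urljoin, urlparse

SEED = "https://shop.example.com/"  # hypothetical store, assumed for illustration
LINK_RE = re.compile(r'href="([^"#]+)"')  # naive link extraction for the sketch

def index_page(url: str, html: str) -> None:
    """Stand-in for handing the fetched document to an indexer."""
    print(f"indexed {url} ({len(html)} bytes)")

def crawl(seed: str, max_pages: int = 50) -> None:
    seen, queue = {seed}, deque([seed])
    host = urlparse(seed).netloc
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip unreachable pages
        index_page(url, html)
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)
            # stay on the same host and avoid re-fetching known URLs
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
        time.sleep(1)  # crude politeness delay between requests

if __name__ == "__main__":
    crawl(SEED)
```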
Crawlers honor crawl-control signals such as robots.txt, nofollow, crawl-delay, and allow/deny patterns.

Crawlers keep your index complete and fresh. With clean linking, sitemaps, and a sensible rendering strategy, they discover the right pages without blowing up your crawl budget.
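As a sketch of honoring those signals, Python's standard-library robots.txt parser can gate fetches and surface a declared crawl-delay; the user agent and URLs below are hypothetical.

```python
import time
import urllib.robotparser

AGENT = "MyStoreBot"  # hypothetical crawler user agent
rp = urllib.robotparser.RobotFileParser("https://shop.example.com/robots.txt")
rp.read()  # fetch and parse the robots.txt file

url = "https://shop.example.com/category/shoes"
if rp.can_fetch(AGENT, url):
    # respect crawl-delay if the site declares one; fall back to 1s otherwise
    delay = rp.crawl_delay(AGENT) or 1.0
    time.sleep(delay)
    # ... fetch the page here ...
else:
    print(f"robots.txt disallows {url} for {AGENT}")
```

Note that nofollow is a link-level hint rather than a robots.txt rule, so it is handled during link extraction, not by the parser above.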
Crawler vs connector? A crawler reads web pages; a connector reads structured sources and APIs directly. Many stacks use both.
Do I need JS rendering? Yes, if critical content only appears after JavaScript runs. Prefer SSR or prerendering so crawlers get complete HTML without executing scripts.
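A quick way to check whether you need rendering is to fetch the raw HTML (no JavaScript execution) and look for a known piece of critical content. The URL and marker string below are assumptions for illustration.

```python
import urllib.request

URL = "https://shop.example.com/product/123"  # hypothetical product page
MARKER = "Add to cart"  # text expected in the server-rendered HTML

with urllib.request.urlopen(URL, timeout=10) as resp:
    raw_html = resp.read().decode("utf-8", errors="replace")

if MARKER in raw_html:
    print("Critical content is server-rendered; plain fetching is enough.")
else:
    print("Marker missing from raw HTML; consider SSR, prerendering, or a JS-rendering crawler.")
```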
How to handle filters? Use canonical tags for core filter combinations, block noisy parameters (sort order, tracking, session IDs) from crawling, and keep one canonical URL per theme, as in the sketch below.
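One common way to implement "one canonical URL per theme" is to normalize query strings: keep a whitelist of meaningful parameters, drop the rest, and sort what remains so equivalent filter pages collapse to a single URL. The whitelist here is an assumption; real sites tune it per catalog.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that define a distinct, index-worthy page (assumed for illustration)
KEEP_PARAMS = {"category", "brand", "color"}

def canonical_url(url: str) -> str:
    """Drop noisy params (sort, paging, tracking) and sort the rest for stability."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in KEEP_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Both variants collapse to the same canonical URL:
print(canonical_url("https://shop.example.com/c?color=red&sort=price&brand=acme"))
print(canonical_url("https://shop.example.com/c?brand=acme&color=red&utm_source=x"))
```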