
SEO Glossary

robots.txt

File telling crawlers which URLs they can access.

Technical SEO | 2 min read | Updated 2026-06-13

What is robots.txt?

The robots.txt file is a plain-text file served at the root of a domain (e.g., example.com/robots.txt) that tells crawlers which URLs they may and may not access. It follows the Robots Exclusion Protocol, standardized as RFC 9309.

How robots.txt works

robots.txt controls crawl access, not indexation. Disallowing a URL stops Google from crawling it, but if other pages link to that URL, Google may still index it based on those external signals, without ever having read the page content.

Common legitimate uses of robots.txt include blocking /admin/, /staging/, and /api/ paths, duplicate parameter-based URLs, and internal search results. A poorly configured robots.txt that blocks CSS or JavaScript files can prevent Google from rendering your pages correctly.
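As a rough illustration of the rules described above, the sketch below parses a small robots.txt with Python's standard-library urllib.robotparser and checks a few paths. The rule set, the example.com host, and the helper name is_allowed are assumptions made for this example, not a recommended configuration. Python's parser matches rules by simple path prefix, so the sketch leaves out the wildcard patterns often used for parameter-based URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; the paths mirror the ones named above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /staging/
Disallow: /api/
Disallow: /search
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `path` on the (assumed) example.com host."""
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ("/admin/users", "/search", "/blog/post", "/assets/site.css"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Running a check like this against the CSS and JavaScript paths your templates load is one way to catch the rendering problem mentioned above before it reaches production.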

Example

Grow With Gradient's robots.txt intentionally allows AI training bots (no Disallow for GPTBot, ClaudeBot, etc.) as part of an answer engine optimization (AEO) visibility strategy. Many sites choose to block these crawlers instead.
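To make the difference concrete, here is a minimal sketch comparing two policies with Python's urllib.robotparser. Both robots.txt bodies are hypothetical and are not the site's actual file; OPEN_POLICY, CLOSED_POLICY, ai_access, and the example.com URL are names invented for this example.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ("GPTBot", "ClaudeBot")

# Hypothetical "open" policy: no group names the AI crawlers,
# so they fall through to the default (*) group.
OPEN_POLICY = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical "closed" policy: each AI crawler gets its own blocking group.
CLOSED_POLICY = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def ai_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI crawler name to whether the given rules let it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


URL = "https://example.com/blog/post"
open_result = ai_access(OPEN_POLICY, URL)
closed_result = ai_access(CLOSED_POLICY, URL)
print("open:  ", open_result)
print("closed:", closed_result)
```

Under the closed policy, other crawlers such as Googlebot still match only the default group, so blocking AI crawlers this way does not change regular search crawling.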

Frequently asked questions

Does blocking a URL in robots.txt remove it from Google?

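No. It blocks crawling, not indexing. A blocked URL with external links can still be indexed, typically without a snippet. To remove a page from results, allow crawling and apply a noindex directive (a robots meta tag or an X-Robots-Tag HTTP header) so that Google can actually read it.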
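The interaction between Disallow and noindex can be checked mechanically. The sketch below, which assumes a hypothetical helper named noindex_can_be_seen and made-up rules and URLs, reports whether a crawler is allowed to fetch a page at all, since a noindex directive on a page it cannot fetch will never be read.

```python
from urllib.robotparser import RobotFileParser


def noindex_can_be_seen(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` and so could read a noindex on it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: /old-page is disallowed, everything else is open.
RULES = """\
User-agent: *
Disallow: /old-page
"""

blocked = noindex_can_be_seen(RULES, "https://example.com/old-page")
crawlable = noindex_can_be_seen(RULES, "https://example.com/retired-offer")
print("noindex readable on /old-page:", blocked)
print("noindex readable on /retired-offer:", crawlable)
```

If the check returns False for a page you want removed from results, the Disallow rule has to be lifted first so the noindex can take effect.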

Should I block AI crawlers in robots.txt?

It is a strategic choice. Blocking GPTBot or ClaudeBot asks those crawlers not to collect your content for training, but it can also reduce your presence in AI-generated answers. Brands competing on AI visibility increasingly allow them deliberately.
