Create correct robots.txt syntax to control how search engine crawlers access your site.


Robots.txt Directive Generator

How it Works

The tool assembles the directives from your inputs in the standard format: a 'User-agent: [agent]' line followed by one or more 'Disallow: [path]' or 'Allow: [path]' lines. It enforces proper syntax so that a malformed rule does not accidentally block crawlers from content you want indexed.
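
For illustration, here is a minimal Python sketch of that assembly step. The function name build_robots_txt and its parameters are hypothetical and are not the tool's actual code.

def build_robots_txt(user_agent="*", disallow=(), allow=(), sitemap=None):
    """Assemble one robots.txt group: a User-agent line, then the
    Disallow/Allow rules, then an optional Sitemap line."""
    lines = [f"User-agent: {user_agent}"]
    lines.extend(f"Disallow: {path}" for path in disallow)
    lines.extend(f"Allow: {path}" for path in allow)
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"

print(build_robots_txt(disallow=["/private/"]))
# User-agent: *
# Disallow: /private/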

What is Robots.txt Directive Generator?

Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website. It defines which parts of the site should be accessed and which should be ignored.
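
As a rough illustration of the crawler side, Python's standard urllib.robotparser module can fetch and obey such a file. The domain and the 'MyBot' agent name below are placeholders, and the snippet needs network access to run.

import urllib.robotparser

# A well-behaved crawler fetches robots.txt, parses it, and asks
# whether a given URL may be crawled before requesting it.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")  # placeholder site
rp.read()  # downloads and parses the live file
print(rp.can_fetch("MyBot", "https://www.example.com/private/page.html"))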

Step-by-Step Guide

  • Select Agent – Use '*' for all bots or name a specific crawler such as 'Googlebot'.
  • Define Rules – Input paths to disallow (e.g., /admin/).
  • Add Sitemap – Include the link to your XML sitemap.
  • Generate – Copy the formatted text; a sketch of the full workflow follows this list.
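
The same four steps, sketched in Python for illustration (the path and sitemap URL are placeholders):

# Step 1: choose the agent; Step 2: define the rules;
# Step 3: add the sitemap; Step 4: generate the text to copy.
user_agent = "*"                                  # or e.g. "Googlebot"
disallowed = ["/admin/"]
sitemap = "https://www.example.com/sitemap.xml"   # placeholder URL

lines = [f"User-agent: {user_agent}"]
lines.extend(f"Disallow: {path}" for path in disallowed)
lines.append(f"Sitemap: {sitemap}")
print("\n".join(lines))
# User-agent: *
# Disallow: /admin/
# Sitemap: https://www.example.com/sitemap.xml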

Example

Input: Block /private/

Result:
User-agent: *
Disallow: /private/
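
To confirm the generated rules behave as intended, the standard library's robots.txt parser can be pointed at the text directly; 'MyBot' is just a placeholder agent name.

import urllib.robotparser

robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())
print(rp.can_fetch("MyBot", "/private/page.html"))  # False: crawling is disallowed
print(rp.can_fetch("MyBot", "/public/page.html"))   # True: everything else stays crawlable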

FAQ

Does Disallow hide pages?

Not necessarily. Disallow only prevents crawling; a page can still appear in the index if other sites link to it. To remove a page, use a 'noindex' meta tag (or X-Robots-Tag header) and make sure the page is not also disallowed, since crawlers must be able to fetch it to see the tag.

What does User-agent: * mean?

It applies the rule group to all web crawlers that are not matched by a more specific User-agent line.

Is it case sensitive?

Yes, paths are case-sensitive. /Admin/ is different from /admin/.
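
A quick check with Python's urllib.robotparser illustrates this; the agent name is a placeholder.

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /admin/"])
print(rp.can_fetch("MyBot", "/admin/login"))  # False: path matches the rule
print(rp.can_fetch("MyBot", "/Admin/login"))  # True: '/Admin/' is a different path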

How to allow everything?

User-agent: *
Disallow:

An empty Disallow value blocks nothing, so all crawlers may access the entire site.

Can I block bad bots?

Yes. You can name a bad bot in its own 'User-agent' group and disallow it, but robots.txt is only advisory, and abusive bots often ignore it.

Conclusion

A well-configured robots.txt conserves your crawl budget and keeps private directories out of crawlers' reach. However, a single slip (such as 'Disallow: /') blocks crawlers from your entire site, which can eventually cause pages to drop from search results. Always validate the file before uploading it.
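
One possible pre-upload sanity check, sketched in Python with urllib.robotparser ('MyBot' is a placeholder agent):

import urllib.robotparser

generated = """\
User-agent: *
Disallow: /
"""  # a one-character slip that blocks the entire site

rp = urllib.robotparser.RobotFileParser()
rp.parse(generated.splitlines())
if not rp.can_fetch("MyBot", "/"):
    print("Warning: the homepage is blocked; review your Disallow rules before uploading.")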


References & Standards

This generator follows the Robots Exclusion Protocol as standardized in RFC 9309 and the robots.txt syntax documented by major search engines.
