Create a robots.txt file for your website. Control how search engines crawl your site.
```
# robots.txt generated by Clarity (getclarityseo.com/robots)
User-agent: *
Allow: /
```
A robots.txt file tells search engine crawlers which pages they can and can't access on your site. It lives at yourdomain.com/robots.txt.
Disallow blocks a path from being crawled. Allow explicitly permits crawling, which is useful for carving out exceptions inside a disallowed directory.
Sitemap tells crawlers where your XML sitemap lives for better indexing.
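A minimal file combining all three directives might look like this (the paths and sitemap URL are placeholders for your own):

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Sitemap: https://yourdomain.com/sitemap.xml
```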
⚠️ robots.txt doesn't prevent pages from being indexed if they're linked elsewhere. Use noindex meta tags for that.
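To keep a page out of search results, put a robots meta tag in that page's HTML head (the page must remain crawlable so the tag can be seen):

```
<meta name="robots" content="noindex">
```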
Many AI companies crawl websites to train their models. You can block them by adding specific user-agent rules:
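For example, the rules below block several widely used AI training crawlers. These user-agent names are current as of this writing; check each vendor's documentation, since names and policies change:

```
# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Block Common Crawl (its dataset is used to train many models)
User-agent: CCBot
Disallow: /

# Opt out of Google AI training (does not affect Google Search crawling)
User-agent: Google-Extended
Disallow: /
```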