February 6, 2006
Google Launches Robots.txt File Checker; Now We Need Robots.txt Standardization
Very nice. Wondering how a search engine will process your robots.txt file? Google now provides a way to check that through the Google Sitemaps program. The post "More stats and analysis of robots.txt files" on the official Inside Google Sitemaps blog explains more.
For Search Engine Watch members, the longer version of this article gives a real life example of how nice the checker is in action.
Overall, I'm thrilled with the new tool. I'd like to see the other search engines add similar ones. Even better, I'd like to see them all come together to create an enhanced, more consistent robots.txt standard. Consider:
- Google allows wildcards, but others don't.
- Ask, MSN & Yahoo allow crawl delays (but don't define minimum or maximum values). Google does not.
- Ask & Google have ALLOW commands that no others support.
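To make the wildcard point concrete, here is a minimal Python sketch of how the same Disallow rule could be read two ways. The function names, the sample rule and the sample paths are hypothetical, and the wildcard logic is only an assumption about what "wildcard support" means (a `*` standing for any run of characters). It is not any search engine's actual matching code.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Plain robots.txt convention: a rule applies when the path starts with it."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Assumed wildcard reading: '*' in the rule matches any run of characters.

    The rule is still anchored at the start of the path, as with prefix matching.
    """
    # Escape the literal pieces of the rule and join them with '.*'
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(pattern, path) is not None


# Hypothetical rule, as in "Disallow: /*.pdf"
rule = "/*.pdf"

for path in ("/docs/report.pdf", "/docs/report.html"):
    print(path, prefix_match(rule, path), wildcard_match(rule, path))
```

A crawler that only does prefix matching would treat `/*.pdf` as a literal string and block nothing here, while a wildcard-aware crawler would block the PDF. That gap is the kind of thing a shared standard would close.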
Postscript: Matt Cutts from Google has some good comments over here, pointing out that Google also has an allow command (I've updated my list above). Further, in comments to the post, he explains why Google doesn't support crawl-delay yet: there are concerns that some webmasters might set it too low by mistake.
Posted by Danny Sullivan on February 6, 2006 8:08 PM











