

robots.txt

There is a lot of confusion about this little text file.  Maybe you've seen requests for it in your log files and wondered what it is.  Maybe somebody told you it was a way to secure your site.  Maybe you were told that it helped prevent "bad bots" from visiting your site.  Which is it?

The robots.txt file is a way for you to tell web crawlers which parts of your site they may and may not crawl.  For example, if you do not want your /images/ folder indexed, you can list that folder as an exclusion in your robots.txt file.
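
As a rough sketch of that /images/ exclusion, the rules can be fed to Python's standard urllib.robotparser module to see how a rule-following crawler would read them. The crawler name SomeBot and the page URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content that excludes only the /images/ folder
rules = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "SomeBot" is a hypothetical crawler name used only for this example
images_allowed = parser.can_fetch("SomeBot", "http://www.yoursite.com/images/photo.jpg")
index_allowed = parser.can_fetch("SomeBot", "http://www.yoursite.com/index.html")

print(images_allowed)  # False
print(index_allowed)   # True
```

Everything under /images/ is off limits, while the rest of the site stays open to crawling.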

You do not need a robots.txt file if you want web crawlers to find every part of your site.  Crawlers will still request the file, though, so you can create an empty text file named robots.txt if you want to keep the resulting "file not found" errors out of your logs.

The simplest restrictive robots.txt file is one that disallows everything for every robot:

User-agent: *
Disallow: /

This tells every web crawler that visits that it should not request any pages on the site.
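
The two extremes covered so far, an empty file and a blanket Disallow, can be compared with a small sketch using Python's urllib.robotparser. The helper name allowed, the crawler name SomeBot, and the page URL are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, url, agent="SomeBot"):
    """Return True if a rule-following crawler may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

page = "http://www.yoursite.com/any/page.html"

# An empty robots.txt places no restrictions on anyone
empty_file = ""

# The two-line file shown above shuts every crawler out
block_everything = """\
User-agent: *
Disallow: /
"""

print(allowed(empty_file, page))        # True
print(allowed(block_everything, page))  # False
```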

In place of the asterisk you can also use a specific robot's name, so that the rules that follow apply only to that robot.
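
Here is a minimal sketch of per-robot rules, again checked with Python's urllib.robotparser. ExampleBot and OtherBot are hypothetical robot names, not real crawlers; a real file would use the name the robot announces in its user-agent string.

```python
from urllib.robotparser import RobotFileParser

# ExampleBot (a made-up name) is shut out completely;
# every other robot is only kept out of /images/
rules = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

page = "http://www.yoursite.com/index.html"
image = "http://www.yoursite.com/images/photo.jpg"

named_bot_page = parser.can_fetch("ExampleBot", page)   # False
other_bot_page = parser.can_fetch("OtherBot", page)     # True
other_bot_image = parser.can_fetch("OtherBot", image)   # False

print(named_bot_page, other_bot_page, other_bot_image)
```

A robot that finds a section with its own name follows that section; robots without one fall back to the asterisk section.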

A few notes about this file.  First, it must go in the web root.  That is, it must be at www.yoursite.com/robots.txt.  It cannot be at www.yoursite.com/whatever/robots.txt, because a file in a subfolder will not be found or used.  Second, browsers are unaffected by this file.  Even if you list the user-agent for IE or Firefox, they will still open the files that you disallow.  Third, robots are not required to follow these rules; the file is more of a suggestion.  Many robots, mostly those run by spammers, will ignore your robots.txt file.
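
To illustrate the web root rule, here is a small sketch of how a crawler might work out where to look for the file, whatever page it starts from. The function name robots_url is an assumption for this example, and a real crawler may do this differently.

```python
from urllib.parse import urljoin

def robots_url(page_url):
    """Return the one robots.txt location that applies to page_url."""
    # The leading slash makes urljoin drop the page's folder path,
    # so the result always points at the web root
    return urljoin(page_url, "/robots.txt")

print(robots_url("http://www.yoursite.com/whatever/page.html"))
# http://www.yoursite.com/robots.txt
```

Because the lookup always lands on the root, a robots.txt placed in /whatever/ is never requested. The lookup itself is voluntary, which is why the rules only bind robots that choose to check.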
