Webmaster Knowledge Base | "Robots"


What is robots.txt? Robots, or spiders, are the programs that
search engines use to index pages. Before a spider examines
anything on a site, it first looks for a file written especially
for it, which is called robots.txt. Don't disappoint them!

The robots.txt file does only one thing: it keeps information that has no business there, for example demo pages, out of the indices (= databases) of the search engines. Pages that are not linked from anywhere do not need to be blocked, because the robots cannot find them anyway.

It works as follows:

When a robot visits your website, it first looks for robots.txt and the information it contains. The robots.txt file must be in the root directory, and there may be only one per domain. Always write the name in lower case: "robots.txt", never "Robots.txt" or "robots.TXT".
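As a small illustration of the root-directory rule, the following Python sketch derives the address where a robot would look for the file, starting from any page on the site. The function name robots_url and the address www.example.com are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: the file always sits at the root,
    # under the lower-case name "robots.txt".
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/shop/item.html?id=7"))
# prints https://www.example.com/robots.txt
```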

In the "User-agent" line, the asterisk * acts as a wildcard character and means that the following lines apply to all robots.

With "Disallow", certain entries (files or directories) are closed to the robots. Each entry needs a line of its own.
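The one-entry-per-line rule can be sketched in Python as a small generator that turns a list of paths into the text of a robots.txt file. The helper name build_robots_txt and the paths are assumptions chosen for this example, not part of any standard tool.

```python
def build_robots_txt(blocked_paths, agent="*"):
    """Build robots.txt text with one Disallow line per blocked path."""
    lines = [f"User-agent: {agent}"]
    # Every file or directory gets a Disallow line of its own.
    lines += [f"Disallow: {path}" for path in blocked_paths]
    return "\n".join(lines) + "\n"


print(build_robots_txt(["/demo", "/private.html"]), end="")
# prints:
# User-agent: *
# Disallow: /demo
# Disallow: /private.html
```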

Excluding all robots:

User-agent: *
Disallow: /

Inviting all robots:

User-agent: *
Disallow:
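To check what these two cases mean for a robot, here is a minimal sketch using Python's standard urllib.robotparser module. Both files are written out in full, "Disallow: /" to exclude everything and an empty "Disallow:" to allow everything. The robot name "ExampleBot" and the helper allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

EXCLUDE_ALL = "User-agent: *\nDisallow: /\n"
INVITE_ALL = "User-agent: *\nDisallow:\n"


def allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Tell whether the given robots.txt text lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(allowed(EXCLUDE_ALL, "/index.html"))  # prints False
print(allowed(INVITE_ALL, "/index.html"))   # prints True
```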


Sample of a standard robots.txt:

(Just create a file called "robots.txt" with a text editor, write in the robot information and upload it to your web server's main directory.)

# my robots info (just a comment)

User-agent: *              # the following lines apply to all robots
Disallow: /cgi-bin         # block the directory "cgi-bin" with all its content
Disallow: /user            # block the directory "user" with all its content
Disallow: /missing.html    # block the page "missing.html"
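
The effect of the sample can be tried out before uploading. The sketch below feeds a cleaned-up copy of the sample rules (without the explanatory notes) to Python's standard urllib.robotparser module and asks about a few paths. The robot name "ExampleBot" and the paths are invented for the test.

```python
from urllib.robotparser import RobotFileParser

SAMPLE = """\
# my robots info
User-agent: *
Disallow: /cgi-bin
Disallow: /user
Disallow: /missing.html
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Ask the parser about one open page and three blocked ones.
for path in ("/index.html", "/cgi-bin/form.pl", "/user/profile.html", "/missing.html"):
    verdict = "allowed" if parser.can_fetch("ExampleBot", path) else "blocked"
    print(path, "->", verdict)
# prints:
# /index.html -> allowed
# /cgi-bin/form.pl -> blocked
# /user/profile.html -> blocked
# /missing.html -> blocked
```

Note that a Disallow entry is compared with the beginning of the path, so "Disallow: /user" also blocks a page such as "/users.html". If only the directory is meant, writing "/user/" with a trailing slash is the safer choice.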