Robots (also known as crawlers or spiders) are programs that traverse the Web automatically. Search engines such as Google use them to index web content, spammers use them to scan for email addresses, and they have many other uses. A robots.txt file tells well-behaved robots which parts of your site to stay out of.

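To create a robots.txt file, open Notepad, type in your instructions, then save the file as robots.txt (note the "s": a file named robot.txt will not be recognized). Upload it to the root directory of your site, so that it is reachable at /robots.txt.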

Here are some example instructions:

To exclude the cgi-bin, tmp and ~joe directories:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
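
If you want to check what rules like these actually block before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The is_allowed helper, the "ExampleBot" user agent and the sample paths are made up for illustration; only the rules come from the example above.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the rules let user_agent fetch path.

    "ExampleBot" is a hypothetical robot name; because the rules
    use "User-agent: *", any name gets the same answer.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())  # parse the rules from memory, no network access
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/tmp/cache.html", "/cgi-bin/form.pl"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Keep in mind this only shows how a robot that obeys robots.txt would read the rules. Badly behaved robots, such as the email-scanning ones, can simply ignore the file.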

 

To exclude the entire server:

User-agent: *
Disallow: /

You can also use HTML meta tags to tell robots what to do with an individual page:

<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW"> (means do not index the content of the page, but do follow the links on it)

<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW"> (means index the content of the page, but do not follow the links on it)

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> (means do not index the content of the page and do not follow the links on it)
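
To see which directives a page really carries, here is a rough sketch using Python's standard html.parser module. The RobotsMetaParser class and the robots_directives function are names I made up for this example, and the sample page is just a stand-in. Note that the tag needs plain straight quotes (") around the values; curly quotes pasted from a word processor will not be read correctly.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches too.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW"></head></html>'
    print(sorted(robots_directives(page)))
```
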
Hope this helps.