I am new to robots.txt files and would like opinions on my file:
# All robots will spider the domain
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: leftpage.html
Disallow: header.html
Disallow: footer.html
I recently moved from mostly framed pages to mostly non-framed pages.
Your comments are appreciated.
Regards,
Tom
[edited by: Woz at 1:04 am (utc) on Oct. 2, 2005]
[edit reason] No URLs please, see Tos#13 [/edit]
Sorry about the URL - it was an honest mistake - also, it looks like the eraser has already taken it out.
#1 - So my robots.txt file should look like this?
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
#2 - Can you elaborate further on:
Every single URL has to start with / otherwise you should not count on it being matched.
TIA,
Tom
Every single URL has to start with / otherwise you should not count on it being matched.
Values in Disallow statements should start with / because the robots.txt standard has the robot check whether the URL path it is about to request starts with that value. Every URL path starts with /, so a value without the leading slash (such as leftpage.html) will never match, and the page won't be disallowed. Technically it will be all your fault.
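
To make the prefix check concrete, here is a rough sketch in Python of how a simple robot might apply the rules. The function name is_disallowed and the list names are made up for illustration, and the "fixed" list assumes the three HTML files sit in the site root. Real crawlers differ in the details, so treat this as an illustration of the matching idea only.

```python
from typing import Iterable


def is_disallowed(url_path: str, disallow_values: Iterable[str]) -> bool:
    """Return True if url_path starts with any non-empty Disallow value.

    Hypothetical helper: it models only the plain prefix check described
    above, not wildcards or Allow lines that some crawlers support.
    """
    return any(value and url_path.startswith(value) for value in disallow_values)


# Rules as posted in the original file (the last three lack the leading slash).
original_rules = ["/cgi-bin/", "/images/", "leftpage.html", "header.html", "footer.html"]

# Assumed fix: same rules with a leading slash, for files in the site root.
fixed_rules = ["/cgi-bin/", "/images/", "/leftpage.html", "/header.html", "/footer.html"]

# A robot always compares against a path that begins with "/".
print(is_disallowed("/leftpage.html", original_rules))    # False: never matches
print(is_disallowed("/leftpage.html", fixed_rules))       # True: now blocked
print(is_disallowed("/images/logo.gif", original_rules))  # True: directory prefix matches
```

So if the frame pages should stay out of the index, the lines would presumably need to read Disallow: /leftpage.html and so on, adjusted to wherever those files actually live.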