Google & Robots.txt

Basic Robots.txt question

     
10:42 pm on Jun 24, 2005 (gmt 0)

New User

10+ Year Member

joined:Feb 1, 2005
posts:24
votes: 0


In your robots.txt file, can you specify certain directories that you do not want the spider to crawl? Also, if someone has linked to a page in a directory that you have told the spider not to visit, will the spider check robots.txt first, or will it follow the link and try to index the page anyway?

Thanks

Rob.

4:17 am on June 25, 2005 (gmt 0)

New User

10+ Year Member

joined:Feb 1, 2005
posts:24
votes: 0


Anyone?

9:29 am on June 25, 2005 (gmt 0)

Full Member

joined:Jan 12, 2004
posts:334
votes: 0


This is covered in many other posts in the robots.txt forum, which is where this question belongs. The "My threads" area of the control panel here isn't working correctly: I posted this information recently, but the thread isn't showing up. (What's up with that, mods?) It was in the robots.txt forum.

OK, after some searching I finally found it: message #2.
[webmasterworld.com...]

I can't answer your second question.

10:42 am on June 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 3, 2003
posts:1633
votes: 0


Also if someone has linked to a page in a directory that you have specified the spider not to see, will the spider check for the robots.txt first or will it follow the link and try to index the page anyway?

Any self-respecting robot will obey robots.txt regardless of how the link was found. Googlebot certainly does, and I'm sure every other mainstream search engine crawler does too. There would be little point in robots.txt if this were not the case.
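
For anyone who wants to see both points in action, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" user agent, the example.com URLs, and the is_allowed helper are all made up for illustration; this shows how a well-behaved crawler could check the rules before fetching a linked page, not how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler runs this check for every URL, however it found the link.
    for url in (
        "http://www.example.com/private/page.html",
        "http://www.example.com/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The Disallow line covers everything under that directory, so a link from another site to a page inside /private/ makes no difference to a crawler that performs this check.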