
Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Would crawlers crawl this?
Using the correct robots.txt technique
ulysee

10+ Year Member



 
Msg#: 198 posted 6:40 pm on Nov 14, 2003 (gmt 0)

Let's say I have a robots.txt file like this:

User-agent: anybot
Disallow: /redir.php

Would bots be able to follow URLs like this:
[domain.com...]
or would they not follow the URL above?

 

dmorison

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 198 posted 7:31 pm on Nov 14, 2003 (gmt 0)

A robot following robots.txt to the letter would not reach "http://www.anotherdomain.com/" by virtue of requesting redir.php.
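For anyone who wants to check the strict reading, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is the one from the question. The full redirect URL, including the `example.com` host and the `url` query parameter, is an assumption made for illustration, since the link in the question is truncated.

```python
from urllib.robotparser import RobotFileParser

# robots.txt exactly as posted in the question.
ROBOTS_TXT = """\
User-agent: anybot
Disallow: /redir.php
"""

# Hypothetical redirect link: host and "url" parameter are assumptions.
REDIRECT_URL = "http://example.com/redir.php?url=http://www.anotherdomain.com/"


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow user_agent to request url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The Disallow rule is a path prefix, so the query string does not matter.
    print("anybot:", may_fetch("anybot", REDIRECT_URL))
    # A bot with a different name is not covered by this record at all.
    print("otherbot:", may_fetch("otherbot", REDIRECT_URL))
```

Note that the record only names `anybot`, so a compliant bot with another name is not restricted by it.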

However, what the robot might do (and it is believed that Googlebot does this now) is simply add anything that looks like a valid URL to its crawl list, so "http://www.anotherdomain.com/" would be spotted as a potential crawl target and added to the list.
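To illustrate how that could happen without the robot ever requesting redir.php, here is a rough sketch of harvesting URL-looking values from a link's query string. This is not how any particular search engine is known to work; the function name `embedded_urls`, the `example.com` host, and the `url` parameter name are all hypothetical.

```python
from typing import List
from urllib.parse import parse_qsl, urlsplit


def embedded_urls(link: str) -> List[str]:
    """Return query-string values in link that look like absolute http(s) URLs."""
    found = []
    for _name, value in parse_qsl(urlsplit(link).query):
        target = urlsplit(value)
        # Treat the value as a crawl candidate only if it has a scheme and a host.
        if target.scheme in ("http", "https") and target.netloc:
            found.append(value)
    return found


if __name__ == "__main__":
    # Hypothetical redirect link seen on a page; redir.php itself is never requested.
    link = "http://example.com/redir.php?url=http://www.anotherdomain.com/"
    print(embedded_urls(link))
```

The point is that the target address is visible in the link text itself, so blocking the redirect script in robots.txt does not hide it.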

If you want to make sure that the links are found, then don't do what you're proposing; find another way.

ulysee

10+ Year Member



 
Msg#: 198 posted 7:51 pm on Nov 14, 2003 (gmt 0)

I want to make sure that the links are not found by any crawler. Any suggestions?

dmorison

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 198 posted 3:02 pm on Nov 15, 2003 (gmt 0)

Gonna be tricky. You could do something clever with JavaScript, but again, the search engines are starting to parse JavaScript and would be able to uncover anything that you are making available as a clickable link to a human.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved