Not on my site!
Seen freshly today:
Didn't get anything from my site ... except a 403.
Looks like a potential source for abuse, especially with the 'bulk upload' feature to upload tons of pages to track.
Block block block...
Um, what IP did it crawl from? 22.214.171.124?
Sorry Bill, can't give you a definitive IP, as my logs (on a shared server) only showed the following:
followthatpage.com - - [08/Nov/2011:xx:49:59 +xx00] "GET / HTTP/1.0" 302 219 "-" "www.followthatpage.com"
followthatpage.com - - [08/Nov/2011:xx:50:00 +xx00] "GET /403.htm HTTP/1.0" 200 181 "-" "www.followthatpage.com"
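For anyone who wants to pull this visitor out of their own logs, here is a minimal sketch in Python. It assumes the log is in Apache "combined" format like the two lines above; `parse_line` and `requests_by_agent` are hypothetical helper names, not part of any tool mentioned in this thread. It lists every path the user agent asked for, which also shows whether robots.txt was ever requested.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def requests_by_agent(lines, needle):
    """Collect (path, status) for every request whose user agent contains needle."""
    needle = needle.lower()
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and needle in entry["agent"].lower():
            hits.append((entry["path"], int(entry["status"])))
    return hits

# The two log lines quoted above, with the masked "xx" fields left as posted.
sample = [
    'followthatpage.com - - [08/Nov/2011:xx:49:59 +xx00] "GET / HTTP/1.0" 302 219 "-" "www.followthatpage.com"',
    'followthatpage.com - - [08/Nov/2011:xx:50:00 +xx00] "GET /403.htm HTTP/1.0" 200 181 "-" "www.followthatpage.com"',
]

hits = requests_by_agent(sample, "followthatpage")
print(hits)
print("asked for robots.txt:", any(path == "/robots.txt" for path, _ in hits))
```

On a shared server the first field may be a reverse-DNS host name rather than an IP, as it is here, so the sketch matches on the user-agent string instead.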
Robtex gives 126.96.36.199 as the IP, 188.8.131.52/15 (XS4ALL Internet) as the range:
He says he obeys robots.txt. Is this true?
Since I whitelist in robots.txt, it would be blocked by default, but I haven't seen this bot yet, so I don't know whether it actually obeys.
|whitelist in robots.txt then it would by default be blocked |
I hope that's not your only line of defense (...because robots.txt blocks only those bots that read AND heed it -- a minority on my sites nowadays).
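To make the "whitelist in robots.txt" idea concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content is an assumption for illustration (nobody in this thread posted theirs), and `Googlebot` is just a stand-in for whichever bots you choose to allow. It shows what a well-behaved client would conclude; it does nothing about bots that never read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist-style robots.txt: allow one named bot, disallow the rest.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A whitelisted agent may fetch the page.
allowed_for_listed = parser.can_fetch("Googlebot", "/")

# Any agent not named falls through to the "*" rule and is disallowed.
allowed_for_unlisted = parser.can_fetch("www.followthatpage.com", "/")

print(allowed_for_listed, allowed_for_unlisted)
```

The rules only bind a client that fetches robots.txt and chooses to follow it, so a server-side block is still needed for everything else.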
|he says he obeys robots.txt is this true? |
It didn't request robots.txt.
It's not actually a crawler; technically, only crawlers need to obey robots.txt.
This is more of a link-checker type of thing: one page requested, one page checked, not a crawl.
>>I hope that's not your only line of defense
Absolutely not. However, some bots, as you know, do obey it, so I consider it worth using.
>>It's not actually a crawler, technically only crawlers need to obey robots.txt
He says on his site that he obeys robots.txt; I was just wondering whether he actually does.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved