
Forum Moderators: Ocean10000 & incrediBILL


followthatpage

Not on my site!

     

Mokita

11:08 am on Nov 8, 2011 (gmt 0)

5+ Year Member



Seen freshly today:

Agent: www.followthatpage.com
Host: followthatpage.com

Didn't get anything from my site ... except a 403.

incrediBILL

11:20 am on Nov 8, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Looks like a potential source for abuse, especially with the 'bulk upload' feature to upload tons of pages to track.

Block block block...

Um, what IP did it crawl from? 82.161.140.128?

Mokita

11:40 am on Nov 8, 2011 (gmt 0)

5+ Year Member



Sorry Bill, can't give you a definitive IP, as my logs (on a shared server) only showed the following:

followthatpage.com - - [08/Nov/2011:xx:49:59 +xx00] "GET / HTTP/1.0" 302 219 "-" "www.followthatpage.com"
followthatpage.com - - [08/Nov/2011:xx:50:00 +xx00] "GET /403.htm HTTP/1.0" 200 181 "-" "www.followthatpage.com"

Robtex gives 80.126.0.111 as the IP, 80.126.0.0/15 (XS4ALL Internet) as the range:

[robtex.com...]
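For anyone wanting to pull these hits out of their own logs, here is a minimal sketch in Python. It assumes the combined log format shown above (with the resolved hostname in the first field), and the names `LOG_PATTERN`, `parse_line` and `in_range` are made up for illustration. The range check only tests an address against the 80.126.0.0/15 block reported by Robtex; it does not confirm where the request actually came from.

```python
import ipaddress
import re

# Combined log format as it appears in the two lines quoted above.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def in_range(ip, cidr):
    """True if the address falls inside the given CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


SAMPLE = ('followthatpage.com - - [08/Nov/2011:xx:49:59 +xx00] '
          '"GET / HTTP/1.0" 302 219 "-" "www.followthatpage.com"')

record = parse_line(SAMPLE)
print(record["host"], record["status"], record["agent"])

# The address Robtex reports sits inside the XS4ALL range;
# the address asked about earlier in the thread does not.
print(in_range("80.126.0.111", "80.126.0.0/15"))    # True
print(in_range("82.161.140.128", "80.126.0.0/15"))  # False
```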

topr8

1:21 pm on Nov 8, 2011 (gmt 0)

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



he says he obeys robots.txt; is this true?

as i whitelist in robots.txt, it would be blocked by default, but i haven't seen this bot yet, so i don't know whether it actually obeys it.
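To illustrate what a whitelist-style robots.txt means for an unlisted agent, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content is a hypothetical example, not anyone's actual file, and the result only applies to a bot that fetches and honours robots.txt in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist-style robots.txt: the named bot is allowed,
# everything else falls through to the catch-all record and is disallowed.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("Googlebot", "/"))               # True
print(parser.can_fetch("www.followthatpage.com", "/"))  # False
```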

Pfui

2:21 pm on Nov 8, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



>>whitelist in robots.txt then it would by default be blocked


I hope that's not your only line of defense (...because robots.txt blocks only those bots that read AND heed it -- a minority on my sites nowadays).
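As a sketch of a second line of defense that does not depend on the bot reading robots.txt, the following Python WSGI middleware refuses requests by User-Agent substring. It assumes a Python WSGI application, which nobody in this thread says they run; the blocklist and the names `is_blocked` and `block_agents` are hypothetical. User-Agent strings are trivially spoofed, so this only stops bots that identify themselves.

```python
# Hypothetical blocklist of lowercase User-Agent substrings.
BLOCKED_AGENT_SUBSTRINGS = ("followthatpage",)


def is_blocked(user_agent):
    """True if the User-Agent header contains a blocked substring."""
    agent = (user_agent or "").lower()
    return any(part in agent for part in BLOCKED_AGENT_SUBSTRINGS)


def block_agents(app):
    """Wrap a WSGI app so that blocked agents get a 403 instead of content."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

Wrapping is a single call, e.g. `application = block_agents(application)`, where `application` stands for whatever WSGI callable the site already exposes.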

Mokita

7:35 pm on Nov 8, 2011 (gmt 0)

5+ Year Member



>>he says he obeys robots.txt is this true?


It didn't request robots.txt.

incrediBILL

8:18 pm on Nov 8, 2011 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



It's not actually a crawler; technically, only crawlers need to obey robots.txt.

This is more of a link-checker type of thing: one page requested, one page checked, not a crawl.

topr8

7:27 am on Nov 9, 2011 (gmt 0)

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



>>I hope that's not your only line of defense

absolutely not; however, some bots, as you know, do obey it, so i consider it worth using.

>>It's not actually a crawler, technically only crawlers need to obey robots.txt

he says on his site that he obeys robots.txt; i was just wondering whether he actually does.
 
