atraxbot/0.2

         

keyplyr

8:35 am on Jul 28, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Did a full site crawl. Requested robots.txt often. No abuse, but no info page either.

174.46.170.### - - [27/Jul/2009:09:57:42 -0700] "GET www.example.com/robots.txt HTTP/1.1" 200 3993 "-" "atraxbot/0.2"

twtelecom dot com in Littleton, CO

Pfui

1:00 am on Jul 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



(CUE TWILIGHT ZONE THEME)

Same day, less than an hour later (10:50:30 -0700) --

174-46-170-nnn.static.twtelecom.net
atraxbot/0.2
robots.txt? YES

GaryK

3:30 am on Jul 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why TZ theme? I'm sure the answer will once again prove why I should be sleeping now instead of posting.

Pfui

6:59 am on Jul 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



With 4,294,967,296 potentially available IPv4 addresses, it seems eerily coincidental that a newly noted bot hit keyplyr's site and mine within 53 minutes of each other. What are the odds?

keyplyr

8:03 am on Jul 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Definitely a Rod Serling moment.

GaryK

2:33 pm on Jul 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Pfui, based on my experiences over the years, I'd have to say the odds are quite good. I have many non-contiguous IP Addresses spread across several servers from multiple hosts, and it's not at all unusual to have the same bot hit several of them within hours of each other.

GaryK

3:34 pm on Aug 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same day, a few hours later:

07/27/2009 16:09:29gmt example.com 200 GET /robots.txt atraxbot/0.2

I'm still not sure this is anything more than a coincidence, but following my discussion the other day with wilderness I'll give everyone the benefit of the doubt. :)

keyplyr

8:42 am on Aug 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



dnsstuff.com says:

Reverse DNS authenticity: [Could be forged: hostname 174-46-170-186.static.twtelecom.net. does not exist]

Pfui

8:30 pm on Aug 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm not a network geek but could the lack of rDNS be ISP-related? Since atraxbot/0.2 has hit me from at least three 174-46-170-nnn.static.twtelecom.net Hosts, might those rDNS maybe-forgeries actually point to the ISP's, erm, lack of pointers?

GaryK

8:45 pm on Aug 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



When I do an nslookup on the host name, I get an IP address that doesn't appear to match the one in the host name. I'm not sure how dnsstuff decides whether something is forged, but I suppose this could be one way.
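For anyone who wants to repeat that check by script, here is a minimal sketch of a forward-confirmed reverse DNS test in Python. It is not necessarily how dnsstuff decides "could be forged"; it only looks up the PTR name for an IP and then checks whether that name resolves back to the same IP. The function name `forward_confirmed` and its injectable `reverse`/`forward` parameters are my own, added so the logic can be exercised without a live DNS query.

```python
import socket


def forward_confirmed(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return (hostname, confirmed) for an IPv4 address.

    hostname is None when there is no reverse (PTR) record.
    confirmed is True only when the hostname resolves back to the same IP.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        hostname = reverse(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return (None, False)
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        addresses = forward(hostname)[2]
    except OSError:
        # The PTR name exists but has no forward record.
        return (hostname, False)
    return (hostname, ip in addresses)
```

A result of `(hostname, False)` covers both cases raised in this thread: the host name does not exist in forward DNS, or it resolves to a different address. Neither proves bad intent; it can also just mean the ISP has not set up matching records.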

keyplyr

5:38 am on Aug 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well, something's fishy here, and since there's no bot info page or anything else to go on, I'm keeping it banned. I also added it to robots.txt and will watch to see how it behaves.

jdMorgan

3:33 pm on Aug 9, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I "know" the sites of many of the contributors to this forum, and one thing they/we all have in common is that these sites are DMOZ/ODP worthy. So if all of these sites are listed in the ODP, then that makes for "a much smaller world" in IP-address-range terms, so it's quite feasible that this agent could get from any one of our sites to any other within a few hours.

Jim

keyplyr

8:41 am on Aug 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just as I suspected, atraxbot came back today, read robots.txt where it is disallowed, then proceeded to crawl, where it was 403'd.

It repeated this a second time, then quit.
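To spot this pattern without reading the raw logs by eye, something like the following Python sketch could help. It assumes Apache combined log format (the regular expression may need adjusting for a different LogFormat), and the function name `requests_after_robots` is made up for illustration. It lists every request a given agent made after its first robots.txt fetch, along with the status code the server returned.

```python
import re

# Assumed: Apache "combined" log format. Adjust for your own LogFormat.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def requests_after_robots(lines, agent_substring):
    """Return (path, status) pairs requested after the agent's first robots.txt fetch."""
    seen_robots = False
    later = []
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if agent_substring.lower() not in match.group("agent").lower():
            continue  # a different user agent
        path = match.group("path")
        if path.endswith("/robots.txt"):
            seen_robots = True
        elif seen_robots:
            later.append((path, int(match.group("status"))))
    return later
```

Called as `requests_after_robots(open("access.log"), "atraxbot")` (the file name is a placeholder), a non-empty result for a bot that is disallowed in robots.txt means it kept requesting pages after reading the file. Entries with status 403 are the requests the ban caught.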

GaryK

4:18 pm on Aug 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"read robots.txt where it is disallowed"

Not sure what you mean by this. Explain please.

wilderness

4:34 pm on Aug 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"read robots.txt where it is disallowed"

"Not sure what you mean by this. Explain please."

User-agent: atraxbot
Disallow: /
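As a quick sanity check that those two lines say what is intended, Python's standard `urllib.robotparser` can evaluate them. This is only a sketch: the helper name `is_allowed` and the `ROBOTS_LINES` constant are mine, and the result shows what a compliant parser would conclude, not what atraxbot itself does with the file.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, as they would appear in robots.txt.
ROBOTS_LINES = [
    "User-agent: atraxbot",
    "Disallow: /",
]


def is_allowed(user_agent, url, robots_lines=ROBOTS_LINES):
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines; no network access
    return parser.can_fetch(user_agent, url)
```

With these rules, `is_allowed("atraxbot/0.2", "http://www.example.com/")` comes back `False`, while an agent not named in the file is allowed because there is no `User-agent: *` record. A bot that fetches pages anyway is ignoring the file, which is why the server-side 403 still matters.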

GaryK

6:46 pm on Aug 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oh OK. I misunderstood. Thought you meant it read robots.txt from some location that it wasn't supposed to read it from. Which of course wouldn't make any sense since there's only one location for it.