I got hit by another WebNX IP today, bringing my total set of ranges to...
220.127.116.11 - 18.104.22.168
22.214.171.124 - 126.96.36.199
188.8.131.52 - 184.108.40.206
220.127.116.11 - 18.104.22.168
22.214.171.124 - 126.96.36.199
Thanks, I didn't have those last 3 ranges.
Here's some more
188.8.131.52 - 184.108.40.206
220.127.116.11 - 18.104.22.168
I have the following range as dedicated hosting
NetRange: 22.214.171.124 - 126.96.36.199
Those last two lines came from a WHOIS on the NetName.
ARIN doesn't work for much of anything these days; however, it still works on NetNames (WEBNX).
Good catch Staffa. I would not have known these WebNX ranges were inside AWKNET. Much easier to manage.
:: bump ::
Never mind the rest of 'em. I want to know exactly who these people are:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
Yesterday-- just a bit over 24 hours ago-- I uploaded a cluster of very large e-books. I have just rechecked the timestamp to make sure I haven't misplaced a day or week. (Been known to happen.) First the e-books, then their shared index file, and last of all a revised /ebooks/ index file to incorporate the new links.
After some final spot-checking, I fed one file to google and a different one to bing in order to jump-start them:
188.8.131.52 - - [06/Sep/2012:19:53:37 -0700] "GET /ebooks/paston/paston2.html HTTP/1.1" 200 118260 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
184.108.40.206 - - [06/Sep/2012:20:20:52 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304166 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
(There's some technicalia in the background that I don't understand. The actual filesizes are each around 900K; apparently google took a partial chomp and flagged it for later, while bing got the whole thing in some kind of gzip mode.)
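On the size mismatch: on many server setups the byte count in the access log is what went over the wire, so a compressed response logs far fewer bytes than the file occupies on disk. Whether gzip was actually in play for these requests is only a guess from the numbers. A minimal Python sketch of the effect, using made-up repetitive HTML (the `wire_size` helper and the sample text are illustrative, not from the logs above):

```python
import gzip

def wire_size(body: bytes) -> int:
    """Bytes a gzip-compressed copy of body would occupy."""
    return len(gzip.compress(body))

# Made-up, highly repetitive markup standing in for a long e-book page.
html = ("<p>" + "the quick brown fox " * 20 + "</p>\n") * 2000
raw = html.encode("utf-8")

print("on disk:", len(raw), "bytes")
print("gzipped:", wire_size(raw), "bytes")
```

Real prose compresses less dramatically than this sample, but a 900K page logging at roughly 300K would be in the plausible range for gzipped text.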
Haven't heard any more from bing, but google came back an hour later for the whole file:
220.127.116.11 - - [06/Sep/2012:20:46:18 -0700] "GET /ebooks/paston/paston2.html HTTP/1.1" 200 318875 "-" <etc.>
and then, after another hour (numbering is wonky, so they got everything that's available),
18.104.22.168 - - [06/Sep/2012:21:37:35 -0700] "GET /ebooks/paston/ HTTP/1.1" 200 4664 "-" <etc.>
22.214.171.124 - - [06/Sep/2012:21:43:54 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304166 "-" <etc.>
126.96.36.199 - - [06/Sep/2012:21:48:41 -0700] "GET /ebooks/paston/sample.pdf HTTP/1.1" 200 223867 "-" <etc.>
188.8.131.52 - - [06/Sep/2012:21:51:14 -0700] "GET /ebooks/paston/paston4.html HTTP/1.1" 200 298260 "-" <etc.>
And the point is...
184.108.40.206 - - [07/Sep/2012:08:23:09 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304110 "http://www.example.com/ebooks/paston/paston3.html" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
As of this instant, g has indexed vol. 2, the cover page and (!) the pdf, while b has indexed vol. 3 (the page I fed them). Of course I have no idea whether they'd already done so by 8AM.
If it weren't for the auto-referer, I would assume I'd stumbled across yet another plainclothes bingbot.
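For anyone who wants to hunt for these in bulk, here is a rough Python sketch that parses a combined-format log line and flags requests whose referer is the very page being requested. The regex assumes the standard combined layout shown in the lines above, and the function names are my own:

```python
import re
from urllib.parse import urlsplit

# Assumes the usual "combined" access-log layout.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

def is_self_referer(entry):
    """True when the referer's path equals the requested path."""
    ref = entry["referer"]
    if ref in ("", "-"):
        return False
    return urlsplit(ref).path == urlsplit(entry["path"]).path

sample = ('184.108.40.206 - - [07/Sep/2012:08:23:09 -0700] '
          '"GET /ebooks/paston/paston3.html HTTP/1.1" 200 304110 '
          '"http://www.example.com/ebooks/paston/paston3.html" '
          '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"')

entry = parse_line(sample)
print(entry["ip"], entry["agent"], is_self_referer(entry))
```

This compares paths only and ignores the hostname, so a referer from a different site with the same path would also be flagged; tighten that if it matters for your logs.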
I'd be inclined to block such a simple UA. It's obviously fake.
220.127.116.11 has rDNS of gridlesshosting.com, registered in 2009 by Cameron Brunner using a gmail address (which is immediately suspicious). Address is Queensland, Australia according to DNS.
Check out DNS for domain and IP. Also check intodns.com for full setup incl. A and MX records.
Use the "proxy" view on ixquick to view the web page - it seems to be a support service for hosting customers.
Robtex shows the domain linkjuicefactory dot com on the same IP - now I wonder... :)
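The rDNS lookup can be scripted, and the same idea is the usual way to tell a real search-engine crawler from an impostor: reverse-resolve the IP, check the hostname suffix, then forward-resolve the hostname and confirm it maps back to the same IP. A sketch in Python; the function name is hypothetical, and the resolver arguments are injectable only so the logic can be tried without touching the network. The suffixes in the usage note are the ones Google and Bing publish for their crawlers as far as I know, so verify them before relying on this.

```python
import socket

def forward_confirmed_rdns(ip, allowed_suffixes,
                           reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """True if ip reverse-resolves to an allowed hostname that resolves back to ip."""
    try:
        host = reverse(ip)[0]            # (hostname, aliases, addresses)
    except OSError:                      # covers socket.herror / socket.gaierror
        return False
    host = host.lower().rstrip(".")
    if not host.endswith(tuple(allowed_suffixes)):
        return False
    try:
        addrs = forward(host)[2]         # list of IPv4 addresses for the hostname
    except OSError:
        return False
    return ip in addrs
```

Usage would be something like `forward_confirmed_rdns(ip, [".googlebot.com", ".google.com"])` or `forward_confirmed_rdns(ip, [".search.msn.com"])`. A visitor that fails this check is not proven hostile; it just is not the crawler it might appear to be.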
For info: I blocked the IP range 18.104.22.168 - 22.214.171.124 in Apr 2011.
Yah, but you can't run around blocking every /18 you meet. You'd never see the end of it :)
Anyway, I am more concerned with how the phhzzt this previously unknown robot homed right in on a brand spankin new file. Unless bing told them.
I block a LOT of /24 to /16 and all increments in between. It's either that or block "real" users within larger ranges.
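If you keep ranges as start/end pairs, as in the lists earlier in this thread, Python's standard `ipaddress` module can turn them into the smallest set of CIDR blocks. A minimal sketch; the addresses below are placeholder documentation and private ranges, not real WebNX ranges, and the helper names are my own:

```python
import ipaddress

def range_to_cidrs(first, last):
    """Smallest list of CIDR blocks covering first..last inclusive."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]

def is_blocked(ip, cidrs):
    """True if ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c) for c in cidrs)

# Placeholder ranges for illustration only.
print(range_to_cidrs("198.51.100.0", "198.51.100.255"))  # one aligned block
print(range_to_cidrs("10.0.64.0", "10.0.127.255"))       # one /18
print(range_to_cidrs("192.0.2.0", "192.0.2.130"))        # unaligned: several blocks
```

A range that does not sit on a power-of-two boundary comes out as several smaller blocks, which is one reason a block list grows faster than the number of ranges you started with.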
Could there have been a "page update" checker in operation, either within that IP range or on a DSL range used by the server owner? Since there seems to be a "link juice factory" on the IP I would say that's a possibility, perhaps checking for their customers?
I would not expect bing or any other reputable SE to tell-tale directly, although the offender may be periodically scraping SEs for such things.
Timing may be coincidental? Perhaps, say, a 4-hour cycle might do the job and a page-checker visit (or SE scrape) coincided with your update?