Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
WebNX Hosting Source of PHP Probe/Hack Attempts
keyplyr




msg:4447637
 7:07 pm on Apr 30, 2012 (gmt 0)



UA: Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)

Probe/hack attempts to gain access to php - all blocked.


WebNX
colo and dedi hosting
50.115.32.0 - 50.115.47.255
50.115.32.0/20

A couple of mentions in previous threads:

[webmasterworld.com...]

[webmasterworld.com...]

 

dstiles




msg:4458430
 6:17 pm on May 27, 2012 (gmt 0)

I got hit by another webnx IP today, bringing my total set of ranges to...

50.115.32.0 - 50.115.47.255
67.220.192.0 - 67.220.223.255
108.171.192.0 - 108.171.223.255
173.231.0.0 - 173.231.63.255
216.18.192.0 - 216.18.223.255
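
For anyone who keeps block lists in CIDR form rather than start/end pairs, here is a minimal sketch of the conversion using Python's standard `ipaddress` module. The ranges are copied from the post above; the helper name `range_to_cidrs` is made up for illustration.

```python
import ipaddress

# Start/end pairs copied from the post above.
RANGES = [
    ("50.115.32.0", "50.115.47.255"),
    ("67.220.192.0", "67.220.223.255"),
    ("108.171.192.0", "108.171.223.255"),
    ("173.231.0.0", "173.231.63.255"),
    ("216.18.192.0", "216.18.223.255"),
]


def range_to_cidrs(first, last):
    """Return the CIDR blocks that exactly cover first..last (inclusive)."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(block) for block in blocks]


for first, last in RANGES:
    print(first, "-", last, "=>", ", ".join(range_to_cidrs(first, last)))
```

A range that is not aligned on a power-of-two boundary comes back as more than one block, which is why the helper returns a list.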

keyplyr




msg:4458436
 6:29 pm on May 27, 2012 (gmt 0)



Thanks, I didn't have those last 3 ranges.

wilderness




msg:4458450
 7:51 pm on May 27, 2012 (gmt 0)

Here's some more
69.42.208.128 - 69.42.208.143
69.42.209.16 - 69.42.209.47

keyplyr




msg:4458471
 12:00 am on May 28, 2012 (gmt 0)


Thanks Don

Staffa




msg:4458485
 12:53 am on May 28, 2012 (gmt 0)

I have the following range as dedicated hosting

NetRange: 69.42.208.0 - 69.42.223.255
CIDR: 69.42.208.0/20
NetName: AWKNET

wilderness




msg:4458489
 1:19 am on May 28, 2012 (gmt 0)

Staffa,
Those last two lines came from a WHOIS on the NetName.

ARIN doesn't work for much of anything these days; however, it still works on NetNames (WEBNX).

keyplyr




msg:4458515
 4:10 am on May 28, 2012 (gmt 0)


Good catch Staffa. I would not have known these WebNX ranges were inside AWKNET. Much easier to manage.
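
A quick way to confirm that the two small ranges really do sit inside the AWKNET /20 is sketched below, again with the standard `ipaddress` module. The addresses are taken from the posts above; the function name `covered_by` is hypothetical.

```python
import ipaddress

# AWKNET block as quoted from WHOIS in the thread.
AWKNET = ipaddress.ip_network("69.42.208.0/20")

# The two smaller ranges posted earlier.
SMALL_RANGES = [
    ("69.42.208.128", "69.42.208.143"),
    ("69.42.209.16", "69.42.209.47"),
]


def covered_by(first, last, parent):
    """True if every address in first..last falls inside the parent network."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return all(block.subnet_of(parent) for block in blocks)


for first, last in SMALL_RANGES:
    print(first, "-", last, "inside", AWKNET, ":", covered_by(first, last, AWKNET))
```

If both checks pass, a single rule for the /20 replaces the two narrower entries.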

lucy24




msg:4492591
 4:01 am on Sep 8, 2012 (gmt 0)

:: bump ::

Never mind the rest of 'em. I want to know exactly who these people are:

173.231.28.242
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)

Here's why.

Yesterday-- just a bit over 24 hours ago-- I uploaded a cluster of very large e-books. I have just rechecked the timestamp to make sure I haven't misplaced a day or week. (Been known to happen.) First the e-books, then their shared index file, and last of all a revised /ebooks/ index file to incorporate the new links.

After some final spot-checking, I fed one file to google and a different one to bing in order to jump-start them:

66.249.68.104 - - [06/Sep/2012:19:53:37 -0700] "GET /ebooks/paston/paston2.html HTTP/1.1" 200 118260 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
and
157.55.35.99 - - [06/Sep/2012:20:20:52 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304166 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

(There's some technicalia in the background that I don't understand. The actual filesizes are each around 900K; apparently google took a partial chomp and flagged it for later, while bing got the whole thing in some kind of gzip mode.)
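
On the size mismatch: one plausible reading, not confirmed anywhere in this thread, is that the server logs the number of bytes actually sent, so a gzip-compressed response shows up much smaller than the file on disk. The sketch below only illustrates that effect with Python's `gzip` module. The page content is a made-up stand-in, and because it is highly repetitive it shrinks far more than real prose would.

```python
import gzip


def sizes(body):
    """Return (uncompressed length, gzip-compressed length) for a response body."""
    return len(body), len(gzip.compress(body))


# Made-up stand-in for a large HTML e-book; NOT the real file.
page = ("<p>Some paragraph of e-book text.</p>\n" * 20000).encode("utf-8")

raw, packed = sizes(page)
print("on disk:", raw, "bytes; sent with gzip:", packed, "bytes")
```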

Haven't heard any more from bing, but google came back an hour later for the whole file:

66.249.68.104 - - [06/Sep/2012:20:46:18 -0700] "GET /ebooks/paston/paston2.html HTTP/1.1" 200 318875 "-" <etc.>

and then, after another hour (numbering is wonky, so they got everything that's available),

66.249.68.104 - - [06/Sep/2012:21:37:35 -0700] "GET /ebooks/paston/ HTTP/1.1" 200 4664 "-" <etc.>
66.249.68.104 - - [06/Sep/2012:21:43:54 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304166 "-" <etc.>
66.249.68.104 - - [06/Sep/2012:21:48:41 -0700] "GET /ebooks/paston/sample.pdf HTTP/1.1" 200 223867 "-" <etc.>
66.249.68.104 - - [06/Sep/2012:21:51:14 -0700] "GET /ebooks/paston/paston4.html HTTP/1.1" 200 298260 "-" <etc.>

And the point is...

173.231.28.242 - - [07/Sep/2012:08:23:09 -0700] "GET /ebooks/paston/paston3.html HTTP/1.1" 200 304110 "http://www.example.com/ebooks/paston/paston3.html" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

wtf?

As of this instant, g has indexed vol. 2, the cover page and (!) the pdf, while b has indexed vol. 3 (the page I fed them). Of course I have no idea whether they'd already done so by 8AM.

If it weren't for the auto-referer, I would assume I'd stumbled across yet another plainclothes bingbot.
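
For spotting this kind of auto-referer in bulk, here is a rough sketch that parses a combined-format log line and flags requests whose referer path equals the requested path. The regular expression and the name `is_self_referer` are assumptions for illustration. Only paths are compared, since the log line does not record which host was requested.

```python
import re
from urllib.parse import urlsplit

# Combined log format: ip ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_self_referer(line):
    """True if the referer points at the very page being requested."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return False  # not a combined-format line
    referer = match.group("referer")
    if referer in ("", "-"):
        return False  # no referer sent
    # Compare paths only; query strings and host names are ignored.
    return urlsplit(referer).path == urlsplit(match.group("path")).path
```

Feeding it the 173.231.28.242 line quoted above should flag it, while the Googlebot and bingbot lines, which send "-" as referer, should not.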

dstiles




msg:4492737
 6:52 pm on Sep 8, 2012 (gmt 0)

I'd be inclined to block such a simple UA. It's obviously fake.

173.231.28.242 has rDNS of gridlesshosting.com registered 2009 by Cameron Brunner using a gmail address (which is immediately suspicious). Address is Queensland, Australia according to DNS.
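
The rDNS check can be scripted with Python's standard `socket` module. This is a minimal sketch with hypothetical function names; it needs network access when called with the real resolvers, and whatever DNS returns today may differ from what was seen in 2012.

```python
import socket


def reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None
    return hostname


def forward_confirms(ip, hostname, lookup=socket.gethostbyname_ex):
    """True if hostname resolves back to ip (forward-confirmed reverse DNS)."""
    try:
        _name, _aliases, addresses = lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

Calling `reverse_dns("173.231.28.242")` and then passing the result to `forward_confirms` shows whether the reverse and forward records agree. The `lookup` parameters exist only so the functions can be exercised without touching the network.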

Check out DNS for domain and IP. Also check intodns.com for full setup incl. A and MX records.

Use the "proxy" view on ixquick to view the web page - it seems to be a support service for hosting customers.

Robtex shows the domain linkjuicefactory dot com on the same IP - now I wonder... :)

For info: I blocked the IP range 173.231.0.0 - 173.231.63.255 in Apr 2011.

lucy24




msg:4492903
 8:12 am on Sep 9, 2012 (gmt 0)

Yah, but you can't run around blocking every /18 you meet. You'd never see the end of it :)

Anyway, I am more concerned with how the phhzzt this previously unknown robot homed right in on a brand spankin new file. Unless bing told them.

dstiles




msg:4493080
 9:12 pm on Sep 9, 2012 (gmt 0)

I block a LOT of /24 to /16 and all increments in between. It's either that or block "real" users within larger ranges.
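
To put numbers on those block sizes, and to show how a list of mixed-size blocks can be checked against a visitor IP, here is a small sketch using the standard `ipaddress` module. The two networks in `BLOCKED` are ranges mentioned in this thread, written in CIDR form; the list and the function name `is_blocked` are illustrative only, not anyone's actual configuration.

```python
import ipaddress

# Two WebNX ranges from this thread, written as CIDR blocks.
BLOCKED = [
    ipaddress.ip_network("50.115.32.0/20"),
    ipaddress.ip_network("173.231.0.0/18"),
]


def is_blocked(ip, networks=BLOCKED):
    """True if ip falls inside any of the blocked networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# How many IPv4 addresses each prefix length from /16 to /24 covers.
for prefix in range(16, 25):
    print(f"/{prefix}: {2 ** (32 - prefix)} addresses")

print(is_blocked("173.231.28.242"))
```

A linear scan like this is fine for a few hundred entries; a much longer list would want a sorted or tree-based lookup.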

Could there have been a "page update" checker in operation, either within that IP range or on a DSL range used by the server owner? Since there seems to be a "link juice factory" on the IP I would say that's a possibility, perhaps checking for their customers?

I would not expect bing or any other reputable SE to tell-tale directly, although the offender may be periodically scraping SEs for such things.

Timing may be coincidental? Perhaps, say, a 4-hour cycle might do the job and a page-checker visit (or SE scrape) coincided with your update?
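
To test the timing idea against the logs, the gap between two entries can be worked out from their timestamps. The sketch below uses the bingbot fetch and the 173.231.28.242 fetch quoted earlier in the thread. The 4-hour cycle is only the guess made above, and a gap between two log lines proves nothing by itself. Note that `%b` expects English month abbreviations, so this assumes an English or C locale.

```python
from datetime import datetime, timedelta

# Timestamp layout used inside the [...] field of the access log.
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_log_time(stamp):
    """Parse an access-log timestamp such as '06/Sep/2012:20:20:52 -0700'."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


bing_fetch = parse_log_time("06/Sep/2012:20:20:52 -0700")
mystery_fetch = parse_log_time("07/Sep/2012:08:23:09 -0700")

gap = mystery_fetch - bing_fetch
cycle = timedelta(hours=4)  # hypothetical checker interval from the post above

print("gap:", gap)
print("gap in 4-hour cycles:", gap / cycle)
```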

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved