
Search Engine Spider and User Agent Identification Forum

    
another one for the profilers
lucy24




msg:4617565
 1:57 am on Oct 18, 2013 (gmt 0)

This is a "just wondering..." question. The botnet involved has always been blocked; I only found it in logs while looking for something else. Looks like it's been visiting sporadically since August or so.

Look at this pattern:
174.37.87.58 - - [17/Oct/2013:02:43:52 -0700] "GET /dir-one/dir-two/admin/categories.php/login.php HTTP/1.1" 403 2963 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
174.37.87.58 - - [17/Oct/2013:02:43:53 -0700] "GET /dir-one/dir-two/admin/file_manager.php/login.php HTTP/1.1" 403 {et cetera}
174.37.87.58 - - [17/Oct/2013:02:43:53 -0700] "GET /dir-one/dir-two/admin/banner_manager.php/login.php HTTP/1.1" 403 {et cetera}
174.37.87.58 - - [17/Oct/2013:02:43:53 -0700] "GET /admin/categories.php/login.php HTTP/1.1" 403 {et cetera}
174.37.87.58 - - [17/Oct/2013:02:43:54 -0700] "GET /admin/file_manager.php/login.php HTTP/1.1" 403 {et cetera}
174.37.87.58 - - [17/Oct/2013:02:43:54 -0700] "GET /admin/banner_manager.php/login.php HTTP/1.1" 403 {et cetera}


Pause for a moment of hilarity at the UA. It takes a very special kind of robot to think that masquerading as a Chinese search engine will increase its chances of getting in the door. (The IP is Softlayer, so this particular request would have been blocked at least two ways.)

The first set of three have been there all along; the second set seems to have been added last month, coincidentally after I started tracking. At least I hope it's coincidence ;) The filenames initially scared me out of my wits because-- pay close attention now-- /dir-one/dir-two/ in real life is a page that talks about an outside site, dir-two dot com. And, while I don't happen to have pages called
/admin/categories.php
/admin/file_manager.php
/admin/banner_manager.php
they are completely plausible filenames for dir-two dot com. Except for the .php extension, which I belatedly remembered the site doesn't use; it's all .jsp.

QUESTION: Does this set of three named files point to some particular CMS that conventionally uses these names? Just curious.

 

thetrasher




msg:4617714
 5:36 pm on Oct 18, 2013 (gmt 0)

osCommerce

174.37.87.nn - - [15/Oct/2013:04:24:59 -0700] "GET /admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/4.0 (compatible; MSIE 7.0b; Win32)"
174.37.87.nn - - [15/Oct/2013:04:24:59 -0700] "GET /logs/admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/4.0 (compatible; MSIE 7.0b; Win32)"
174.37.87.nn - - [15/Oct/2013:04:25:00 -0700] "GET /admin/banner_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/4.0 (compatible; MSIE 7.0b; Win32)"
174.37.87.nn - - [15/Oct/2013:04:25:07 -0700] "GET /admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:25:08 -0700] "GET /admin/file_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /admin/banner_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /admin/file_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /logs/admin/banner_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /logs/admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
174.37.87.nn - - [15/Oct/2013:04:42:39 -0700] "GET /logs/admin/file_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"
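A rough sketch of how log lines like these could be boiled down to the scripts being probed. It assumes Apache combined log format; the regular expressions and the `summarize` helper are illustrative names, not anything used by the posters. The sample lines are copied from the excerpt above.

```python
import re

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

# The probe signature seen in this thread: <script>.php followed by /login.php
PROBE = re.compile(r"/(?P<script>[^/]+\.php)/login\.php$")

SAMPLE = [
    '174.37.87.nn - - [15/Oct/2013:04:24:59 -0700] "GET /logs/admin/categories.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/4.0 (compatible; MSIE 7.0b; Win32)"',
    '174.37.87.nn - - [15/Oct/2013:04:25:00 -0700] "GET /admin/banner_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/4.0 (compatible; MSIE 7.0b; Win32)"',
    '174.37.87.nn - - [15/Oct/2013:04:25:08 -0700] "GET /admin/file_manager.php/login.php HTTP/1.1" 404 89 "-" "Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.7.12) Gecko/20050929"',
]

def summarize(lines):
    """Return (sorted probed script names, sorted user agents that probed)."""
    scripts, agents = set(), set()
    for line in lines:
        entry = LOG_LINE.match(line)
        if entry is None:
            continue  # not a combined-format line; skip it
        probe = PROBE.search(entry.group("path"))
        if probe:
            scripts.add(probe.group("script"))
            agents.add(entry.group("ua"))
    return sorted(scripts), sorted(agents)

scripts, agents = summarize(SAMPLE)
print(scripts)      # the three admin scripts being tried
print(len(agents))  # distinct user agents used for the same probe
```

Grouping by script name rather than full path makes it easier to see that the same three files are requested under different directory prefixes and user agents.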

lucy24




msg:4617766
 9:36 pm on Oct 18, 2013 (gmt 0)

Oh, lordy, it's even the same IP :) Czech-speaking Linux user, Chinese search engine, same difference.

btw, it's
174.36.0.0/15
There's a pretty recent list in the Server Farms thread.

Do the real-life pages use that wonky
blahblah.php/login.php
naming format, or is that just the robot being weird?

wilderness




msg:4617782
 11:13 pm on Oct 18, 2013 (gmt 0)

FWIW

RewriteCond %{REMOTE_ADDR} ^174\.(3[67]|12[0-3]|13[23]|142)\. [OR]
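For anyone who wants to check what that pattern actually covers, here is a small sketch that tests it against every possible second octet and compares the result with hand-derived CIDR blocks. The CIDR list is my own translation of the alternation, so treat it as an assumption; the function names are illustrative.

```python
import ipaddress
import re

# The pattern from the RewriteCond above, unchanged.
PATTERN = re.compile(r"^174\.(3[67]|12[0-3]|13[23]|142)\.")

# Assumed CIDR equivalents, worked out by hand from the alternation.
CIDR_BLOCKS = [
    ipaddress.ip_network("174.36.0.0/15"),   # 3[67]
    ipaddress.ip_network("174.120.0.0/14"),  # 12[0-3]
    ipaddress.ip_network("174.132.0.0/15"),  # 13[23]
    ipaddress.ip_network("174.142.0.0/16"),  # 142
]

def regex_blocks(addr):
    """True if the RewriteCond pattern matches the address string."""
    return PATTERN.match(addr) is not None

def cidr_blocks(addr):
    """True if the address falls inside any of the assumed CIDR blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CIDR_BLOCKS)

def second_octets_matched():
    """List every second octet in 174.x that the pattern matches."""
    return [n for n in range(256) if regex_blocks(f"174.{n}.0.1")]

print(second_octets_matched())
# [36, 37, 120, 121, 122, 123, 132, 133, 142]
```

The regex and the CIDR list agree for every second octet, so the choice between them is mostly about which module is doing the blocking.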

lucy24




msg:4617815
 3:09 am on Oct 19, 2013 (gmt 0)

There's a time and a place for Regular Expressions ;) And, for that matter, for mod_rewrite. I've currently got

:: shuffling papers ::

Deny from 174.34.128.0/18 174.34.224.0/19 174.36.0.0/15 174.120.0.0/14 174.132.0.0/15 174.139

:: detour to look up ::

I've got 174.143 flagged as Rackspace, so you could easily go 14[23]. Or, ahem, 174.142.0.0/15. The 174.34. neighborhood has a human ISP tucked in between two server farms so I can't just go 128-blahblah/17. I track server farms but don't generally lock them out until they become offensive. I don't have anything scrapeworthy, so why put the server to the extra work?
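The CIDR arithmetic in this post can be checked with a short sketch using Python's standard `ipaddress` module. The `is_denied` and `uncovered` helpers are illustrative names of mine, and writing the partial address `174.139` as `174.139.0.0/16` is my reading of what a two-octet prefix covers.

```python
import ipaddress

# The Deny list from the post; "174.139" is written out as 174.139.0.0/16.
DENY = [ipaddress.ip_network(n) for n in (
    "174.34.128.0/18", "174.34.224.0/19", "174.36.0.0/15",
    "174.120.0.0/14", "174.132.0.0/15", "174.139.0.0/16",
)]

def is_denied(addr):
    """True if the address falls inside any block on the Deny list."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DENY)

def uncovered(wide, blocked):
    """Return the parts of `wide` that no network in `blocked` covers."""
    remaining = [wide]
    for b in blocked:
        nxt = []
        for r in remaining:
            if b.subnet_of(r):
                nxt.extend(r.address_exclude(b))  # carve b out of r
            elif not r.overlaps(b):
                nxt.append(r)                     # untouched by b
            # otherwise r lies wholly inside b and is dropped
        remaining = nxt
    return sorted(remaining)

# 174.142.0.0/15 spans both 174.142 and 174.143.
proposed = ipaddress.ip_network("174.142.0.0/15")
print(proposed[0], proposed[-1])  # 174.142.0.0 174.143.255.255

# A single /17 would also swallow the gap between the two 174.34 blocks.
gap = uncovered(ipaddress.ip_network("174.34.128.0/17"), DENY[:2])
print(gap)  # [IPv4Network('174.34.192.0/19')]
```

The leftover 174.34.192.0/19 is the stretch a single /17 would block along with the two server farms, which is why the list keeps the /18 and /19 separate.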

This particular botnet is pre-blocked because anyone who asks for anything in php-- except a handful of named pages-- gets an automatic 403 rather than defaulting to 404. It's the principle of the thing.
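The actual rule is not shown in the thread, so the following is only a sketch of the policy as described: any request for a `.php` resource gets a 403 unless it is on a short allowlist, and everything else falls through to the normal 200 or 404. The allowlist entry and the `status_for` helper are hypothetical.

```python
import re

# Hypothetical allowlist; the real "handful of named pages" is not given.
ALLOWED_PHP = {"/contact.php"}

# Matches .php at the end of the path or followed by extra path info,
# which also catches the /something.php/login.php probes.
PHP_REQUEST = re.compile(r"\.php(/|$)", re.IGNORECASE)

def status_for(request_uri, exists):
    """Return the HTTP status this policy would give a request."""
    path = request_uri.split("?", 1)[0]  # ignore any query string
    if PHP_REQUEST.search(path) and path not in ALLOWED_PHP:
        return 403  # blanket refusal for unlisted php requests
    return 200 if exists else 404

print(status_for("/admin/categories.php/login.php", exists=False))  # 403
print(status_for("/contact.php", exists=True))                      # 200
print(status_for("/no-such-page.html", exists=False))               # 404
```

The point of the ordering is that the php check runs before the existence check, so a probe never learns whether the file is there.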

:: glancing at incoming mail and doing a double-take as I notice that son's iPhone has me down as "Mom Lastname" ::

jmccormac




msg:4617831
 6:52 am on Oct 19, 2013 (gmt 0)

It is a subnet (174.37.87.56 - 174.37.87.63) for New Legendmedia according to the whois, but it is on a Softlayer /18. From the requested URLs it looks like a vulnerability probe.
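As a quick check of the range arithmetic, this sketch converts the whois range to CIDR with Python's standard `ipaddress` module and computes the /18 that contains it. The variable names are illustrative, and the /18 shown is simply the one that mathematically encloses the range, not something taken from whois.

```python
import ipaddress

first = ipaddress.ip_address("174.37.87.56")
last = ipaddress.ip_address("174.37.87.63")

# Collapse the first-last range into the smallest set of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))
subnet = blocks[0]
print(blocks)  # [IPv4Network('174.37.87.56/29')]

# The /18 that contains this subnet (derived, not looked up).
enclosing = subnet.supernet(new_prefix=18)
print(enclosing)  # 174.37.64.0/18

# The address from the first post sits inside the small subnet.
visitor = ipaddress.ip_address("174.37.87.58")
print(visitor in subnet)  # True
```

So the eight-address range is a single /29, and both it and its enclosing /18 fall inside the 174.36.0.0/15 block mentioned earlier in the thread.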

Regards...jmcc

