Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Dot Exe in User Agent String
lucy24

WebmasterWorld Senior Member; Top Contributor of All Time; Top Contributor of the Month



 
Msg#: 4608564 posted 10:05 pm on Sep 9, 2013 (gmt 0)

Quick question: Does any legitimate human UA string ever contain the element ".exe"?

Had a slightly droll visit from a Ukrainian robot* making four requests for the same file (what is it with robots and large html files anyway?). Three had assorted humanoid UAs but got blocked on other grounds. The fourth was a blatantly robotic "xpymep.exe" and it got through.

Hence the question. Can I do unanchored
\.exe
or do I have to stick with a more narrowly constrained
^\S+\.exe$
or even
^\w+\.exe$
?
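
For comparison, here is a minimal sketch of how the three candidate patterns behave. It uses Python's `re` module rather than Apache's regex engine, so take it as an illustration of the matching logic only. The third sample UA is invented for the example and was not seen in any log.

```python
import re

# The three candidate patterns from the question.
PATTERNS = {
    "unanchored": re.compile(r"\.exe"),
    "non_space": re.compile(r"^\S+\.exe$"),
    "word_chars": re.compile(r"^\w+\.exe$"),
}

def matching_patterns(user_agent):
    """Return the names of the patterns that match this UA string."""
    return [name for name, rx in PATTERNS.items() if rx.search(user_agent)]

samples = [
    "xpymep.exe",  # the robotic UA from the visit described here
    "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0",
    "Mozilla/4.0 (compatible; loader.exe 1.0)",  # hypothetical UA, made up for illustration
]

for ua in samples:
    print(ua, "->", matching_patterns(ua))
```

Only the unanchored form would catch ".exe" buried inside a longer string. The two anchored forms differ only in whether characters such as hyphens are allowed before the dot.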


* IP appears to be a mixed range including humans, but I may block it anyway.

 

incrediBILL

WebmasterWorld Administrator; Top Contributor of All Time; 5+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 10:27 pm on Sep 9, 2013 (gmt 0)

Not that I'm aware of, and if one does, tough nuts to them.

wilderness

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 11:29 pm on Sep 9, 2013 (gmt 0)

[26/Feb/2012:02:23:32 +0000] "GET /downloads/setup_akl.exe

[14/Jun/2004:18:12:14 -0700] "POST /_vti_bin/shtml.exe/_vti_rpc HTTP/1.1" 404 - "-" "MSFrontPage/4.0"

[14/Jan/2006:20:09:59 -0800] "GET / HTTP/1.1" 403 - "-" "iexplore.exe"

incrediBILL

WebmasterWorld Administrator; Top Contributor of All Time; 5+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 11:48 pm on Sep 9, 2013 (gmt 0)

But those examples were file requests, not user agents, which is the topic per the OP's first line in her post.

I would block it in any server string, because it's not doing anything useful on my Linux server in the first place: there are no .exe files there.

lucy24

WebmasterWorld Senior Member; Top Contributor of All Time; Top Contributor of the Month



 
Msg#: 4608564 posted 12:03 am on Sep 10, 2013 (gmt 0)

incrediBILL wrote: "But those examples were file requests, not user agents"

The third example had a strongly robotic
^\w+\.exe$
as the UA.

Requests for .exe are no skin off my nose because I have nothing with this extension so they may as well get a 404. Unless the act of checking for a file's existence consumes significantly more server resources than reading one or two more lines in htaccess? I do block requests for .php --except a few named files that really use it-- so I guess I'm not entirely consistent here :(


Edit after detour to raw logs, searching for .exe (thank you, Spotlight):

Oh, now that's interesting. From the robot-profiling POV, I mean. The most recent visit--the one that prompted the post--looked like this:

93.79.72.210 ... /ebooks/paston/paston3.html HTTP/1.0" 403 2893 "http://www.example.com/ebooks/paston/paston3.html" "Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25"
93.79.72.210 ... /ebooks/paston/paston3.html HTTP/1.1" 200 906909 "-" "xpymep.exe"
93.79.72.210 ... /ebooks/paston/paston3.html HTTP/1.0" 403 2893 "http://example.com/ebooks/paston/paston3.html" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4"
93.79.72.210 ... /ebooks/paston/paston3.htmlindex.php HTTP/1.0" 403 2893 "http://www.example.com/ebooks/paston/paston3.htmlindex.php" "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0"

Two lockouts for auto-referer (this is done manually in htaccess for a few very large files, mostly ebooks), the third for ".php" at the end of the request.
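
The two lockout reasons can be sketched in Python roughly as follows. This is not the actual htaccess rule set, which is not shown in this thread. The function names are made up for the example, and two things are assumptions: the referer is compared by path only (ignoring the hostname), and the ".php" check runs first.

```python
from urllib.parse import urlsplit

def is_auto_referer(request_path, referer):
    """True when the referer's path is the same as the path being requested.

    Assumption: only the path is compared, so www and non-www referers
    are treated alike.
    """
    if not referer or referer == "-":
        return False
    return urlsplit(referer).path == request_path

def ends_in_php(request_path):
    """True when the requested path ends in .php (query string ignored)."""
    return request_path.split("?", 1)[0].lower().endswith(".php")

def lockout_reason(request_path, referer):
    """Return "php", "auto-referer", or None for a single request."""
    # Assumption: the .php rule is checked first, since the last request in
    # each group also refers to itself but is described as a .php lockout.
    if ends_in_php(request_path):
        return "php"
    if is_auto_referer(request_path, referer):
        return "auto-referer"
    return None

page = "/ebooks/paston/paston3.html"
print(lockout_reason(page, "http://www.example.com" + page))
print(lockout_reason(page, "-"))
print(lockout_reason(page + "index.php", "http://www.example.com" + page + "index.php"))
```

Applied to the four requests above, this gives a reason for the three that were blocked and none for the referer-less "xpymep.exe" request, which matches the 403/200 split in the log.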

Now here's the previous occurrence of .exe, which I didn't pick up on at the time:

89.70.25.224 ... /ebooks/paston/paston3.html HTTP/1.1" 403 1497 "http://www.example.com/" "Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.02"
89.70.25.224 ... /boilerplate/contact.html HTTP/1.1" 200 1838 "http://www.example.com/" "Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.02"

89.70.25.224 ... /ebooks/paston/paston3.html HTTP/1.0" 403 2651 "http://www.example.com/ebooks/paston/paston3.html" "Opera/9.80 (Windows NT 5.1; U; en) Presto/2.10.289 Version/12.00"
89.70.25.224 ... /ebooks/paston/paston3.html HTTP/1.1" 200 906909 "-" "bpgrupy.exe"
89.70.25.224 ... /ebooks/paston/paston3.html HTTP/1.0" 403 2651 "http://example.com/ebooks/paston/paston3.html" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2"
89.70.25.224 ... /ebooks/paston/paston3.htmlindex.php HTTP/1.0" 403 2651 "http://www.example.com/ebooks/paston/paston3.htmlindex.php" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.46 Safari/535.11 MRCHROME"

Notice how it's exactly the same pattern? The first two came about an hour earlier and have a pattern of their own which I call the "contact.html botnet": a blocked request for some large inner page, followed by /contact.html.
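
A rough Python sketch of spotting that "contact.html botnet" pattern in parsed log data, assuming each request has already been reduced to an (ip, path, status) tuple in log order. The function name and the tuple layout are assumptions for the example; the default contact path is the one that appears in the log excerpt.

```python
def contact_pattern_ips(requests, contact_path="/boilerplate/contact.html"):
    """Return IPs whose blocked (403) request was directly followed,
    from the same IP, by a request for the contact page."""
    last_was_blocked = {}  # ip -> True if that IP's previous request got a 403
    flagged = []
    for ip, path, status in requests:
        if path == contact_path and last_was_blocked.get(ip) and ip not in flagged:
            flagged.append(ip)
        # Remember whether this request was a blocked hit on some other page.
        last_was_blocked[ip] = (status == 403 and path != contact_path)
    return flagged

visit = [
    ("89.70.25.224", "/ebooks/paston/paston3.html", 403),
    ("89.70.25.224", "/boilerplate/contact.html", 200),
]
print(contact_pattern_ips(visit))
```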

Still earlier (this is, I think, a blocked IP):

217.195.202.9 ... /wp-admin HTTP/1.0" 403 2651 "http://www.example.com/wp-admin" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.79 Safari/537.4"
217.195.202.9 ... /wp-admin HTTP/1.1" 403 2702 "-" "xrumerguestbook1.exe"
217.195.202.9 ... /wp-admin HTTP/1.0" 403 2651 "http://example.com/wp-admin" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11"
217.195.202.9 ... /wp-adminindex.php HTTP/1.0" 403 2651 "http://www.example.com/wp-adminindex.php" "Opera/9.80 (Windows NT 6.2; WOW64; MRA 8.0 (build 5784)) Presto/2.12.388 Version/12.10"

Earlier still: another of the 2+4 pattern.

Earlier still:

68.235.38.7 ... /ebooks/alida/Alida.html HTTP/1.0" 403 1442 "http://www.example.com/ebooks/alida/Alida.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"
68.235.38.7 ... /ebooks/alida/Alida.html HTTP/1.1" 200 715483 "-" "start.exe"
68.235.38.7 ... /ebooks/alida/Alida.html HTTP/1.0" 301 541 "http://example.com/ebooks/alida/Alida.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"
68.235.38.7 ... /ebooks/alida/Alida.htmlindex.php HTTP/1.0" 403 1442 "http://www.example.com/ebooks/alida/Alida.htmlindex.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"
68.235.38.7 ... /ebooks/alida/Alida.html HTTP/1.0" 403 1442 "http://www.example.com/ebooks/alida/Alida.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"

The 301 is due to MSIE 6 in the UA; they get redirected (not rewritten) to a custom page because it's still remotely possible I've got humans going back that far. Other than that it's the identical pattern-- and that's going back over at least a year. Infrequent but steady.

Notice the alternation between 1.0 and 1.1? I don't normally see that in a single visit. And it's the humanoid UAs that use 1.0.
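
One way to look for that alternation across a whole log, sketched in Python. It assumes the log has already been parsed into (ip, protocol, user_agent) tuples, and the function names are invented for the example. The sample data is three of the requests from the most recent visit quoted above.

```python
import re
from collections import defaultdict

EXE_UA = re.compile(r"^\w+\.exe$")

def protocols_by_ua_kind(requests):
    """Map each IP to the HTTP versions used by its .exe UAs and by its other UAs."""
    result = defaultdict(lambda: {"exe": set(), "other": set()})
    for ip, protocol, user_agent in requests:
        kind = "exe" if EXE_UA.match(user_agent) else "other"
        result[ip][kind].add(protocol)
    return dict(result)

def mixed_protocol_ips(requests):
    """Return, sorted, the IPs that used more than one HTTP version."""
    mixed = []
    for ip, kinds in protocols_by_ua_kind(requests).items():
        if len(kinds["exe"] | kinds["other"]) > 1:
            mixed.append(ip)
    return sorted(mixed)

visit = [
    ("93.79.72.210", "HTTP/1.0",
     "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4"),
    ("93.79.72.210", "HTTP/1.1", "xpymep.exe"),
    ("93.79.72.210", "HTTP/1.0",
     "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0"),
]

print(mixed_protocol_ips(visit))
print(protocols_by_ua_kind(visit)["93.79.72.210"])
```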

wilderness

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 12:42 am on Sep 10, 2013 (gmt 0)

wilderness wrote:

[26/Feb/2012:02:23:32 +0000] "GET /downloads/setup_akl.exe

[14/Jun/2004:18:12:14 -0700] "POST /_vti_bin/shtml.exe/_vti_rpc HTTP/1.1" 404 - "-" "MSFrontPage/4.0"

[14/Jan/2006:20:09:59 -0800] "GET / HTTP/1.1" 403 - "-" "iexplore.exe"

incrediBILL wrote: "But those examples were file requests, not user agents, which is the topic per the OP's first line in her post."


I'm beginning to believe you just like to argue!

Considering the quantity of saved lines (whether UAs, IPs or logs) at my access, and the span of years those three meager references cover (2004-2012, or 2013 if you count the current year's zilch), they answer lucy's question without a long-winded explanation such as this ;)

wilderness

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 12:56 am on Sep 10, 2013 (gmt 0)

lucy,
I had approximately 30 file requests in my references for an exe file (a self-extracting ZIP) that I have on one of my websites.
These came up in my data search (bad boys directory) for ".exe"; however, I excluded them.

keyplyr

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4608564 posted 2:55 am on Sep 10, 2013 (gmt 0)

I do not offer user-side .exe files on my site. Any request for them is erroneous or part of a malicious script injection. I've never seen a valid reason iexplore.exe (or any other reference to an .exe file) should be present in the UA of a normal web site visitor, so I've always blocked .exe in any use.

thetrasher

5+ Year Member



 
Msg#: 4608564 posted 11:13 am on Sep 10, 2013 (gmt 0)

"xpymep.exe"
= XRumer

"bpgrupy.exe"
Probably XRumer

"xrumerguestbook1.exe"
...

"start.exe"
= XRumer

lucy24

WebmasterWorld Senior Member; Top Contributor of All Time; Top Contributor of the Month



 
Msg#: 4608564 posted 9:55 pm on Sep 10, 2013 (gmt 0)

:: detour to search engine ::

Oh.

Question that often arises in similar situations: What's it doing on sites that haven't invited it? In the present case, logs make it pretty obvious that it's just one element of a ua-spoofing package. But why would they* assume that they get a free ride?

keyplyr wrote: "I do not offer user-side .exe files on my site."

... and I've got a Mac, so that goes double ;)


* "They" = assorted SEO-related entities, whether as UA or IP.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved