Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
UP.Browser
lucy24




msg:4601150
 12:01 am on Aug 12, 2013 (gmt 0)

Trying to figure out if I've got a time-traveling robot, or a human with an unbelievably old cell phone. Look at this (complete) series of log entries:

209.59.13.dd - - 13:38:46 "GET /ebooks/ HTTP/1.1" 200 7980 "{legitimate referer for this page}" "r451[TF268435460205204939000000018645696643] UP.Browser/6.2.3.8 (GUI) MMP/2.0"
209.59.13.dd - - 13:38:48 "GET /ebooks/ebookstyles.css HTTP/1.1" 200 4778 "/" "{same UA}"
209.59.13.dd - - 13:40:02 "GET /silence/ HTTP/1.1" 200 2388 "../ebooks/" "{same UA}"
209.59.13.dd - - 13:40:03 "GET /silence/silentstyles.css HTTP/1.1" 200 2286 "/" "{same UA}"
209.59.13.dd - - 13:40:51 "GET /paintings/tundra/home.html HTTP/1.1" 200 1493 "../../silence/" "{same UA}"
209.59.13.dd - - 13:40:52 "GET /paintings/paintingstyles.css HTTP/1.1" 200 11861 "tundra/home.html" "{same UA}"

The UP.Browser was apparently state of the art for cell phones in 2006. (I looked it up.) If the subject line of this post seemed familiar, it may be because it is still one of Google's three mobile UAs:
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
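To compare the two UA strings quoted above, a small sketch like the following may help. It only pulls out the UP.Browser version token and notes whether the string claims to be Googlebot-Mobile; the regular expression and the `describe_ua` helper are illustrative assumptions, not an established parser, and a claim in the UA string proves nothing about who actually sent it.

```python
import re

# Hypothetical helper: matches the "UP.Browser/<version>" token seen in both UAs.
UP_BROWSER_RE = re.compile(r"UP\.Browser/([\w.]+)")


def describe_ua(ua: str) -> dict:
    """Return the UP.Browser version (if any) and whether the UA claims Googlebot-Mobile."""
    match = UP_BROWSER_RE.search(ua)
    return {
        "up_browser_version": match.group(1) if match else None,
        "claims_googlebot_mobile": "Googlebot-Mobile" in ua,
    }


visitor_ua = (
    "r451[TF268435460205204939000000018645696643] "
    "UP.Browser/6.2.3.8 (GUI) MMP/2.0"
)
google_ua = (
    "SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 "
    "UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 "
    "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
)

print(describe_ua(visitor_ua))
print(describe_ua(google_ua))
```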

Today's IP seems to belong to a hosting place (Windstream, 209.59.0-31) but the only other hit I find from the same range is, what do you know, another UP.Browser. Very similar, but not identical. That request was for an image file from Google search. Same goes for all other visits with this UA-- until this one.
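For anyone who wants the range in CIDR form, here is a minimal sketch using Python's standard `ipaddress` module. It assumes "209.59.0-31" means 209.59.0.0 through 209.59.31.255; the final octet of the visitor's address is elided in the logs, so only its /24 is tested.

```python
import ipaddress

# Assumption: "209.59.0-31" covers 209.59.0.0 through 209.59.31.255.
first = ipaddress.ip_address("209.59.0.0")
last = ipaddress.ip_address("209.59.31.255")

# Collapse the start/end pair into the smallest list of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)

# The visitor's last octet is not shown, so test the enclosing /24 instead.
visitor_subnet = ipaddress.ip_network("209.59.13.0/24")
print(visitor_subnet.subnet_of(blocks[0]))
```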

Look again at the referers.
GET /ebooks/ <snip> {normal external referer here}
GET /ebooks/ebookstyles.css <snip> /
GET /silence/ <snip> ../ebooks/
GET /silence/silentstyles.css <snip> /
GET /paintings/tundra/home.html <snip> ../../silence/
GET /paintings/paintingstyles.css <snip> tundra/home.html

All those ../ elements are literal text from the logs. Look carefully and you'll see that each one is the relative path of the referring page as viewed from the requested file. The exception is a stylesheet that sits in the same directory as its page: the browser seems confused about how to reference /dirname/ from /dirname/dirstyles.css (unless a leading slash has no special meaning in this browser?).
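The pattern can be checked mechanically. The sketch below is an assumption about what the browser might be doing, not documented UP.Browser behavior: it computes the referring page's path relative to the directory of the requested file, using `posixpath.relpath`. The `relative_referer` helper is hypothetical. It reproduces the three cross-directory referers from the logs, and for the same-directory stylesheet it yields "./" where the log shows "/", which is exactly the odd case.

```python
import posixpath


def relative_referer(referring_page: str, requested_url: str) -> str:
    """Path of referring_page as seen from the directory holding requested_url."""
    # A URL ending in "/" is its own directory; otherwise use the parent directory.
    if requested_url.endswith("/"):
        base_dir = requested_url
    else:
        base_dir = posixpath.dirname(requested_url)
    rel = posixpath.relpath(referring_page, start=base_dir)
    # relpath drops a trailing slash; restore it for directory-style pages.
    if referring_page.endswith("/") and not rel.endswith("/"):
        rel += "/"
    return rel


# (referring page, requested URL, referer string as logged)
observed = [
    ("/ebooks/", "/ebooks/ebookstyles.css", "/"),
    ("/ebooks/", "/silence/", "../ebooks/"),
    ("/silence/", "/silence/silentstyles.css", "/"),
    ("/silence/", "/paintings/tundra/home.html", "../../silence/"),
    ("/paintings/tundra/home.html", "/paintings/paintingstyles.css", "tundra/home.html"),
]

for page, request, logged in observed:
    computed = relative_referer(page, request)
    print(f"{request}: computed {computed!r}, logged {logged!r}")
```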

Someone with a long memory: Is that how the UP.Browser normally sent referer strings? And was html + css (only) its normal pattern of requests? All requested pages have at least one image file-- and in one case an embedded font-- along with .js (for analytics). The noscript version of analytics is hidden in an <img> tag, so it's understandable that this line is missing. No icon request of any kind.

I Am Stumped.

 

wilderness




msg:4601304
 3:19 pm on Aug 12, 2013 (gmt 0)

lucy,
I've just under 20 UP.Browser items saved from 2002-03.
NONE contain a referer.

The majority of these were Google's WAP Proxy.

lucy24




msg:4601326
 4:36 pm on Aug 12, 2013 (gmt 0)

saved from 2002-03

Typo for 2012-13, or is the UA even older than I thought?

When I looked up the UA in past logs, almost all-- excluding googlebot-mobile-- were requests for images alone. The referer is "http://www.google.com/imgres" et cetera. How they achieve image search when this most recent request didn't even ask for images must remain a mystery.

The referer for the first page is my profile on a closed (that is, non-indexed) forum. All subsequent linking is valid: that is, page A has a link to page B which in turn links to page C, and each has its own stylesheet as shown.

... and the paintingstyles.css file is WAY too big in proportion to its body pages-- mainly due to a slew of colors that are determined by overall body background color. I should do something about this.

wilderness




msg:4601356
 7:24 pm on Aug 12, 2013 (gmt 0)

saved from 2002-03

saved from 2002-03

saved from 2002-03

saved from 2002-03

saved from 2002-03

No typo.

wilderness




msg:4601361
 7:51 pm on Aug 12, 2013 (gmt 0)

lucy,
I've explained this previously (and before some noob asks again: there's no way to share the data, nor am I inclined to spend time putting it in a database; it's easily searchable in its present format using Copernic). I make notations of visitors and log lines, and have since 1996 or 97.

Some notations might include a deny or other type of solution. Some might simply be defined as a "probe" and saved for notation.

Each and every notation contains a log line and path references to pages and files within my sites, thus any sharing of data would require removal of the path references, which I'm not about to spend time doing.

On the other hand, if somebody laid a BIG pile of cash in front of me. . . .

lucy24




msg:4601386
 9:06 pm on Aug 12, 2013 (gmt 0)

Wow. Didn't realize the UA even went back that far :) My own data collection only goes back about two and a half years (Feb/March 2011). And only the last year-and-a-bit is available for immediate searching.

Within that time period, the first appearance of the UP.Browser (in any form) was near the end of 2011.

I don't know what the story is on Googlebot-Mobile. UP.Browser seems like it would be an older UA, but I never met it before the end of August 2012; before that I only got the SmartPhone.

:: detour to check ::

Huh. The third mobile UA, DoCoMo, seems to have shown up at exactly the same time. Wonder if the SmartPhone version found something that triggered closer inspection? Maybe that's when I started making the CSS more responsive.

So when you say 2002/03 do you mean only from that time period, or dating back to that time period?

wilderness




msg:4601391
 9:27 pm on Aug 12, 2013 (gmt 0)

So when you say 2002/03 do you mean only from that time period, or dating back to that time period?


My god girl!
Do you enjoy giving me a difficult time or are you just having a "blonde moment" ;)

The files are dated by the date I saved them on my computer.
The log lines in them might not correspond to the saved date, depending on when I reviewed the logs.

yaimapitu




msg:4601449
 5:51 am on Aug 13, 2013 (gmt 0)

Both UP.Browser and DoCoMo are still showing up frequently in UAs in Japan as of 2013, in various configurations. Given the IP address, I wonder whether you may have encountered a visitor who brought a Japanese mobile phone along some time ago. Also, does "{legitimate referer for this page}" possibly give a hint as to what the person was looking for?
If the IP address is not a dynamic address (one assigned to that ISP's customers), I'd think someone is playing tricks, though...

lucy24




msg:4601459
 7:19 am on Aug 13, 2013 (gmt 0)

Dear Don, it has been many many years since I was blonde :P

Really, seriously, I was double-checking "from 2002-03" since the word "from" could legitimately be taken either way.

does "{legitimate referer for this page}" possibly give a hint as to what the person was looking for?

I mentioned this a little ways down the thread:

The referer is my profile on a closed (non-indexed) forum. All subsequent linking is valid: that is, page A has a link to page B which in turn links to page C

So by behavior alone, it would be utterly human-- except, #1, who reads message boards on a cell phone? Composing a reply would take so long, you might as well wait until you get home to your keyboard. And, #2, the link from B to C-- I just remembered this-- is an image with a one-word alt. Maybe the browser puts it in a different color? Can't think why else they would have clicked.

The only other hit I can find from the same range is an almost-but-not-quite identical UA, in the same /29 subsector. This belongs to Comverse, which I've never heard of but appears to be a mobile/telecommunications place. So, yeah, a sublet to a cell phone provider. Wish they wouldn't hide in the middle of a hosting range, though; that's the kind of thing that can get you locked out by mistake.
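Checking whether two hits fall in the same /29 takes only a few lines. The sketch below uses documentation addresses (192.0.2.x) as stand-ins, because the real final octets are not shown in the logs; `same_block` is a hypothetical helper built on the standard `ipaddress` module.

```python
import ipaddress


def same_block(ip_a: str, ip_b: str, prefix: int = 29) -> bool:
    """True if both addresses fall inside the same /prefix network."""
    # strict=False lets a host address stand in for its enclosing network.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Placeholder addresses: a /29 holds 8 addresses, here 192.0.2.8 through 192.0.2.15.
print(same_block("192.0.2.10", "192.0.2.14"))
print(same_block("192.0.2.10", "192.0.2.16"))
```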

Still think the format of the referers is weird. Why can't the browser simply send the full URL of the page it's on, instead of doing that extra backward-calculation stuff?

bhukkel




msg:4601464
 7:42 am on Aug 13, 2013 (gmt 0)

I am more into subnet rating and this subnet is very clean:

Number of domains hosted: 894
Number of adult domains hosted: 16
Number of nameservers hosted: 35
Number of SPAM hosts hosted: 1
Number of open proxies hosted: 0
Number of malicious threats hosted: 0

Also the history of this subnet is good.

dstiles




msg:4601603
 7:46 pm on Aug 13, 2013 (gmt 0)

bhukkel - Not sure if you are attempting to validate the server hosting company but the point is: servers should be blocked regardless of "cleanliness".

Currently I'm seeing masses of junk from server farms that may or may not be basically "clean". The problem is that servers can become infected the same as "home" broadband-based computers - current infections seem biased towards Joomla - but in any case servers have no need to "read" another server (eg my web server) unless they carry an approved and valid bot.

So: if it's a server it gets blocked. :)
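The rule above, together with the earlier worry about a phone provider sublet inside a hosting range, could be sketched as follows. Everything here is hypothetical: the ranges are documentation addresses, `should_block` is an illustrative name, and how a bot gets "approved and valid" (for example by reverse DNS) is left outside the sketch.

```python
import ipaddress

# Hypothetical data: a hosting range, and a carve-out sublet to a mobile carrier.
HOSTING_RANGES = [ipaddress.ip_network("198.51.100.0/24")]
CARRIER_SUBLETS = [ipaddress.ip_network("198.51.100.64/29")]


def should_block(ip: str, is_approved_bot: bool) -> bool:
    """Block server-farm addresses unless they carry an approved bot or sit in a known sublet."""
    addr = ipaddress.ip_address(ip)
    in_hosting = any(addr in net for net in HOSTING_RANGES)
    in_sublet = any(addr in net for net in CARRIER_SUBLETS)
    return in_hosting and not in_sublet and not is_approved_bot


print(should_block("198.51.100.7", is_approved_bot=False))   # plain server address
print(should_block("198.51.100.7", is_approved_bot=True))    # approved bot on a server
print(should_block("198.51.100.66", is_approved_bot=False))  # carrier sublet inside the range
```

Without the carve-out list, the third address would be blocked along with the rest of the hosting range, which is the lockout risk raised earlier in the thread.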

yaimapitu




msg:4601696
 2:51 am on Aug 14, 2013 (gmt 0)

So by behavior alone, it would be utterly human-- except, #1, who reads message boards on a cell phone?


If it's a fake UA it's not a cell phone, and for all we know the interested party could be the NSA (LOL)

lucy24




msg:4601749
 7:57 am on Aug 14, 2013 (gmt 0)

Now we're getting circular :) Sure, some humans send fake UAs. Although I suspect 99% of those humans are WebmasterWorld readers. But they wouldn't pretend to be an outdated cell phone; that's a way to get blocked. Humans trying to be discreet will either send something hopelessly generic like the current Chrome, or a string of complete gibberish that says "I'm wearing a mask and I don't care who knows it because you still can't see who I really am".

In this case, IP and UA do seem to fit together.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved