Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Yahoo phone home

 9:06 am on Nov 9, 2013 (gmt 0)


Another in the long list of things I found while looking up something else:

- - [08/Nov/2013:08:47:44 -0800] "GET / HTTP/1.1" 200 6386 "-" "Mozilla/5.0 (SymbianOS/9.3; Series60/3.2 NokiaE5-00/101.003; Profile/MIDP-2.1 Configuration/CLDC-1.1 ) AppleWebKit/533.4 (KHTML, like Gecko) NokiaBrowser/ Mobile Safari/533.4 3gpp-gba"
- - [08/Nov/2013:08:47:44 -0800] "GET /favicon.ico HTTP/1.1" 200 606 "-" <snip>

- - [31/Oct/2013:04:19:02 -0700] "GET / HTTP/1.1" 200 6386 "-" "Mozilla/5.0 (Series40; NokiaC3-00/so6.96; Profile/MIDP-2.1 Configuration/CLDC-1.1) Gecko/20100401 S40OviBrowser/"
- - [31/Oct/2013:04:19:02 -0700] "GET /favicon.ico HTTP/1.1" 200 606 "-" <snip>

- - [23/Sep/2013:09:05:19 -0700] "GET / HTTP/1.1" 200 6282 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_1_4 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10B350 Safari/8536.25"
- - [23/Sep/2013:09:05:19 -0700] "GET /favicon.ico HTTP/1.1" 200 606 "-" <snip>

Those are the complete requests. Front page, favicon and that's all.

Q.: What does a cell phone want with a favicon?
A.: It isn't a cell phone. It's Yahoo.
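
A rough way to go hunting for the same pattern in your own logs. This is only a sketch: it assumes Apache combined log format, the IPs and shortened UA strings in SAMPLE are made-up placeholders (the real ones are not shown above), and the list of "claims to be a phone" hints is a guess rather than anything Yahoo documents.

```python
import re

# Apache "combined" format: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)

# Hypothetical hints that a UA claims to be a phone (an assumption, not a standard list).
MOBILE_HINTS = re.compile(r'Mobile|Nokia|Symbian|Series40|iPhone', re.I)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def mobile_favicon_clients(lines):
    """Return the sorted IPs whose phone-looking UA also asked for /favicon.ico."""
    flagged = set()
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue
        if hit["path"] == "/favicon.ico" and MOBILE_HINTS.search(hit["ua"]):
            flagged.add(hit["ip"])
    return sorted(flagged)


# Placeholder data: documentation-range IPs and abbreviated UA strings.
SAMPLE = [
    '192.0.2.1 - - [08/Nov/2013:08:47:44 -0800] "GET / HTTP/1.1" 200 6386 "-" '
    '"Mozilla/5.0 (Series40; NokiaC3-00) Gecko/20100401"',
    '192.0.2.1 - - [08/Nov/2013:08:47:44 -0800] "GET /favicon.ico HTTP/1.1" 200 606 "-" '
    '"Mozilla/5.0 (Series40; NokiaC3-00) Gecko/20100401"',
    '198.51.100.7 - - [08/Nov/2013:09:00:00 -0800] "GET /favicon.ico HTTP/1.1" 200 606 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Gecko/20100101 Firefox/25.0"',
]

print(mobile_favicon_clients(SAMPLE))
```

A hit is only a reason to look closer, not proof of a proxy or a bot.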

Wait, the plot thickens.

Request from 23 September (and, incidentally, I hope it was IncrediBill that once posted the log-headers code, because if not, someone out there is getting cheated of some extravagant gratitude):

31 October:
X-Forwarded-For:,, (yes, three of 'em)


I have never met 66.196.66 in my life, but it's Yahoo. is Saudi Arabia. is Nigeria. is from a subsector of Amazon AWS that identifies itself as Nokia. is Albania.

Q.: Under what circumstances would a single request carry more than one X-forwarded-for address?

Before this cluster, the only other I can find is from spring of 2011.

Addendum: In the course of looking this up, I found three occasions in October-- none before or since-- where Yahoo! Slurp (by that name) asked for /SlurpConfirm404/any-old-random-garbage-here.htm. Officially this is its version of a 404 confirmation (the same thing Google does when you've redirected too many pages at once and it gets suspicious). Wouldn't you expect a major search engine to know that a 403 trumps a 404? Possibly it thinks every [F] is preceded by a -f test.


It would also appear to be another in the long list of things I learn all about and then promptly forget. Here are some earlier threads poking at the question from various sides:

August 2009 [webmasterworld.com]
October 2010 [webmasterworld.com]
March 2012 [webmasterworld.com]
April 2012 [webmasterworld.com]

My impression is of a proxy rather than a bot.

--dstiles, October 2010, and that's without even mentioning those X-forwarded-for headers


Final Q.: wtf?



 12:51 am on Nov 14, 2013 (gmt 0)

Some things are better left alone not-understood ;)

I've some very, very old uniform browser lines that stop this thing and quite a few others dead in their tracks.

There are times (especially with the many variations of MSN browsers) that the old lines also stop legitimate visitors; however, attempting to keep abreast of all the MSN browser UAs is nearly as impossible as trying to keep abreast of the web-access-Yahoo-email variations. (IMO both tasks are a complete waste of time.)


 8:15 pm on Nov 14, 2013 (gmt 0)


I have - down as proxy IPs - ie anyone can use them. Specifically, rDNS has a PTR of mobile,bf1 name. Which accounts for the mobile UA. is an Albanian satellite service so that ties in (I have - noted as mobile proxy IPs, though there may be a few holes in that range).

I suspect your second, triple-FWD example is someone in Nigeria trying to get through an AWS proxy in some way (I block all amazon proxy attempts with a warning not to be stupid). I seldom see triples but several FWD IPs is feasible. Sometimes that can be legitimate: some countries have to finesse routing in order to avoid gov blocking.
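
For anyone wanting to pick a multi-address header apart, here is a minimal sketch. It assumes the usual convention that each forwarding proxy appends the address it received the request from, so the leftmost entry is the claimed original client. The header is client-supplied and trivially faked, so treat the result as a hint only. The addresses and the blocked range in the usage lines are documentation placeholders, not the ones from the logs above.

```python
import ipaddress


def parse_forwarded_for(header_value):
    """Split an X-Forwarded-For value into a list of ip_address objects.

    Empty entries and entries that do not parse as an IP address
    (for example "unknown") are skipped.
    """
    hops = []
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        try:
            hops.append(ipaddress.ip_address(part))
        except ValueError:
            continue
    return hops


def touches_blocked_range(hops, blocked_networks):
    """True if any hop falls inside any of the given CIDR ranges."""
    nets = [ipaddress.ip_network(n) for n in blocked_networks]
    return any(hop in net for hop in hops for net in nets)


# Placeholder triple, standing in for a three-address header.
hops = parse_forwarded_for("192.0.2.10, 198.51.100.20, 203.0.113.30")
print([str(h) for h in hops])
print(touches_blocked_range(hops, ["198.51.100.0/24"]))
```

Checking every hop rather than only the first matches the "blocked if one of the IPs is a known rogue" approach, at the cost of trusting a header anyone can forge.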

I often see servers trying to get access via proxies, either legit ones or compromised DSL ones. Many are blocked at first sight, especially if one of the IPs is a known rogue.


Next April should see the end of ALL MS UA's before and including IE6 since XPs are no longer supported by MS from that date (except that firefox and G-chrome will support it, which is dumb but doesn't matter in this context). I would also say that from that date anything labelled Windows 5.1 or earlier can be rejected and, from spring 2015, Windows 5.2 (2003 / XP x64). Of course, there will be idiots still using XP far past that date but one has to draw the line somewhere, and it's likely that such machines will be compromised soon after April 2014/2015.

At present I'm thinking of including a polite version of the text "Upgrade, dummies" at the top of pages with those detections - and perhaps some earlier firefox, safari, chrome etc browsers.
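
A minimal sketch of that kind of detection, assuming the UA carries the usual "Windows NT x.y" token (5.0 = Windows 2000, 5.1 = XP, 5.2 = 2003 / XP x64). The function names and the default cutoff are illustrative only, and pre-NT strings such as "Windows 98" are deliberately not handled.

```python
import re

WINDOWS_NT_RE = re.compile(r"Windows NT (\d+)\.(\d+)")


def windows_nt_version(user_agent):
    """Return (major, minor) from a 'Windows NT x.y' token, or None if absent."""
    m = WINDOWS_NT_RE.search(user_agent)
    if m is None:
        return None
    return (int(m.group(1)), int(m.group(2)))


def needs_upgrade_notice(user_agent, cutoff=(5, 1)):
    """True if the UA claims Windows NT at or below the cutoff version."""
    version = windows_nt_version(user_agent)
    return version is not None and version <= cutoff


print(needs_upgrade_notice("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"))
print(needs_upgrade_notice("Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"))
```

Raising the cutoff to (5, 2) would cover the spring 2015 case as well.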


 10:31 pm on Nov 14, 2013 (gmt 0)

Of course, there will be idiots still using

Also people of normal-or-above intelligence in areas with predominantly indigenous populations. In the US, this particularly means reservation schools; there are equivalents in other countries. This sector includes a significant part of my target audience,* so I am not prepared to risk locking out even one human.

:: shuffling papers ::

Thought so. Most recent human visitor matching the string "MSIE 7.0; Windows NT 5." was only about a month ago. Iqaluit, probably a governmental office.

Albania, otoh, can be fully human for all I care. I think I recently blocked a slab of Moldova too.

* Where "target" = the ones I value most, not necessarily the most numerically significant.


 9:30 pm on Nov 15, 2013 (gmt 0)

IE6 was dumped officially by MS several years ago. Yes, I still allow it but I'm having serious thoughts about adding a warning to pages at the very least.

I'm fairly sure that IE6 is the latest version that Windows 2000 (NT 5.0) can run: if it says it can, I think it's lying. On the other hand you do not give the decimal value, so you could be talking about XP or 2003.

I have two Windows 2000 machines here - 3 until a few months ago - and they are stuck with IE6 and Firefox 12 as the last possible browsers. Which is why only one of those machines ever goes online and that will end as soon as my wife can be persuaded to use linux. It's an uphill battle because of other software she uses. :(

If schools or whatever are still using machines/browsers online that old after next April they deserve viruses; they should have updated years ago anyway. Or better, switched to linux, which should do most things they would want to do and would prepare them for the outside world :(

Am I coming across bitter? :)


 10:59 pm on Nov 15, 2013 (gmt 0)

Am I coming across bitter?

No, you're coming across as someone who can afford a new computer and is in a position to install his own software.

MSIE <= 4 and FF <= 2 get an unconditional block, along with most of MSIE [56] if UA claims to be a Mac. MSIE [56] and Firefox (3\.[0-5]|[567]) in most circumstances get redirected to a special "let me know and I'll poke a hole for you" page. Some time in 2012 Iqaluit finally upgraded their MSIE 6 machines; that's when I decided it was safe to block. Or at least redirect. FF 3.6 includes Camino; FF 4 is rare but does occur.
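
Roughly, those rules could be written out like this. It is only a sketch in Python, not the actual htaccess: the regexes are my reading of the description, "most of MSIE [56] on a Mac" is simplified to "all", "in most circumstances" is ignored, and the sample UA strings are illustrative.

```python
import re

# Unconditional block: MSIE 4 and earlier, Firefox 2 and earlier.
BLOCK_RE = re.compile(r"MSIE [1-4]\.|Firefox/[12]\.")
# MSIE 5 or 6 claiming to be a Mac (simplified: the post says "most", not all).
MAC_OLD_MSIE_RE = re.compile(r"MSIE [56]\..*Mac")
# Redirect to the "let me know" page: MSIE 5-6, Firefox 3.0-3.5 and 5-7.
REDIRECT_RE = re.compile(r"MSIE [56]\.|Firefox/(3\.[0-5]|[567]\.)")


def classify(user_agent):
    """Return 'block', 'redirect' or 'allow' for a UA string."""
    if BLOCK_RE.search(user_agent) or MAC_OLD_MSIE_RE.search(user_agent):
        return "block"
    if REDIRECT_RE.search(user_agent):
        return "redirect"
    return "allow"


print(classify("Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"))
print(classify("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
print(classify("Gecko/20120308 Camino/2.1.2 (like Firefox/3.6.28)"))
```

The trailing dot in each pattern keeps "MSIE 1" from matching MSIE 10 and "Firefox/2" from matching Firefox 25. Firefox 3.6 and 4 fall through to "allow" on purpose.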


 8:54 pm on Nov 16, 2013 (gmt 0)

The last computer I bought, about three months ago, was a rather old second-hand XP for 70; immediately converted to Linux Mint to become a backup postfix mail server. I'm very "careful" with money. :)

I still see FF 3.6 from linux but there is no real excuse for linux to run old software: updates are available as soon as they are "fixed" (for bugs) or upgraded. In any case FF updates are easily notified and updated no matter what platform, so a lot of versions are blocked here, some silently and some with a brief reason.

I don't know much about Macs but I assume similar for those. Come to that, there is no real reason why MS software isn't upgraded: there are new issues every month and not all of them are bug fixes.


 10:12 pm on Nov 16, 2013 (gmt 0)

Come to that, there is no real reason why MS software isn't upgraded:

That isn't exactly so.

On two computers I've XP 32 which cannot be upgraded to SP3 (it presents conflicts with crucial software and hardware).
IE 8 or higher requires SP3 on XP 32.

On a third computer I've XP 64 and there have been some conflicts there as well (64 wasn't exactly mainstream). 64 also only offers updates to SP2, which most software that requires SP3 will not recognize.
There are some rare Hot Fixes under Windows 7 that repair restrictions in XP 64, however they are difficult to locate and even chancy to use.

I've a 4th computer (laptop) that has XP 32 as well.

Changing four computers to Linux or newer versions of Windoze would be a real chore. In addition, my crucial software and hardware may not function properly (as they don't with XP32 & SP3), so why risk it?


 11:27 pm on Nov 16, 2013 (gmt 0)

The UA string for Camino unfortunately includes the element "like Firefox 3.6".

:: shuffling papers ::

Gecko/20120308 Camino/2.1.2 (like Firefox/3.6.28)

I think the rest of the UA string pertains to the OS. Or at least the OS as the browser understands it. The UA string sent by MSIE 5 says PPC, presumably because it doesn't know from Intel.

I try to express most UA blocks as quick-and-simple strings in mod_setenvif, as it's less work for the server than evaluating piles of Rewrite Conditions. Opera gets a free ride unless it claims to be the "Bork edition", and so far I haven't bothered to check on Chrome at all. I do filter out some garbage strings involving AOL and, uh, I forget the rest. But over time, non-IP-based blocks tend to become redundant. I see this in the test site's logs, where requests from MSIE 3 with .su referer still get blocked, even though all they've seen is the shared htaccess file with IP lockouts.

a rather old second-hand XP for 70

Does that include shipping? I find a recurring problem in communicating with Europeans is that they simply have no concept of physical distance. If something is an hour away by road or rail, it's considered "isolated". I once tried and failed to make someone from the Netherlands understand that voting by satellite phone is sometimes the only option.

Anyway, if you work in a government office you're stuck with what they give you. You may not even be allowed to bring in your own machine. (This actually happened at my son's high school when the physics teacher wanted to upgrade some computers at his own expense.)


 8:11 pm on Nov 17, 2013 (gmt 0)

Wilderness - Wasn't quite what I meant: ALMOST everyone can update monthly (by which time several new zero-day exploits have ravaged many machines). If there is a good reason for not updating windows regularly (as in your case, software incompatibility) then keep those machines off the internet and use only a safe machine online - easily done via firewalls if part of a larger network. Very few people actually have real compatibility problems, surely.

Having said which, my brother and one of my clients have software compatibility issues that preclude them from either updating to a later Windows or from using linux. My proposal to both was: use linux for all online work and keep the XP machines offline to handle the special software. This will improve security enormously.

Switching from windows to linux is not actually that much of a problem. There are enough desktop setups that are very similar to windows (at least, to XP) and to Mac and most people only run web, mail, word and graphics anyway. And it's easy enough to run linux (eg Mint) from a DVD drive without upsetting windows - or install as dual-boot. Join the linux forum hereabouts for details.

Lucy - Camino is Mac? Never used it myself. I seldom see it in my security logs so either it gets past or no one uses it on my server. Actually I have a hole drilled for 3.6 solely because some linux users failed to update years ago. It's unsafe as a browser (full of security holes and incompatible) but that's their problem, not mine.

If a UA says MSIE 5 it's either an intruder or there is something wrong with the OS. The only thing I've seen that's remotely legit with an MSIE 5 UA is an old mobile - and I mean old.

I block several older opera UAs as well as bork; like you I haven't bothered with chrome: what self-respecting hacker would use that? :)

Non-IP-based blocks (UA and other headers) are not redundant in my view: they regularly introduce new server farms in addition to botnet activity from compromised DSL machines.

No shipping. I walked a mile down the road, bought the second-cheapest machine in the shop (a repair and second-hand sales shop I've had dealings with before) and, since I no longer own a car, asked my daughter to fetch it for me.

As to government etc establishments: security is often very lax in such places, which is why they are always becoming compromised. ANY gov or company still using obsolete and non-updated machines deserves everything that comes their way - and may actually get it, in-office and on their web sites.


 10:53 pm on Nov 17, 2013 (gmt 0)

camino is Mac? Never used it myself.

It's extremely rare, and will become rarer as it's no longer being developed. (There's a post about it somewhere in these forums.) I adopted it years ago because it had a popup blocker override when Safari had only just introduced the blocker itself. And it's got a fabulous ad blocker; it's always the first difference I notice when I use another browser. Excluding myself I've only ever seen it in logs 3 or 4 times-- and two of those were the same country on the same day visiting the same page, so they may well have been two people who knew each other :)

If a UA says MSIE 5 it's either an intruder or there is something wrong with the OS.

Once in a blue moon I use MSIE 5 to check something. It runs via Rosetta, so if I'm ever forced to upgrade OS I'll no longer be able to use it. The last Mac version is 5.2.something. Think of it as a midway point between a regular browser and Lynx, which I also use for testing.

Non-IP-based blocks (UA and other headers) are not redundant in my view: they regularly introduce new server farms in addition to botnet activity from compromised DSL machines.

I didn't mean that I've eliminated them. Only that when I find something in logs that would be blocked on UA and/or referer grounds, 19 times out of 20 it's from an IP that's blocked in its own right. I have two layers of htaccess. First a shared one using mod_setenvif and mod_authz-thingummy for unconditional blocks, as well as a few FilesMatch unconditional allows for things like robots.txt. (Also favicon and stylesheet, as it alerts me to accidentally blocked humans.) Then individual ones for each site; that's where all the RewriteRules go. Most of those are site-specific.
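
To put a number on the "19 times out of 20" overlap, something like the sketch below would do. It models the shared layer very loosely in Python (it is not how Apache actually evaluates FilesMatch and authz directives), and the blocked range, the UA pattern and the "any root-level .css" rule are placeholders I made up for the example.

```python
import ipaddress
import re

# Files that are always served, so an accidentally blocked human still leaves a trace.
ALWAYS_ALLOWED = re.compile(r"^/(robots\.txt|favicon\.ico|[^/]+\.css)$")
# Placeholder lockouts: a documentation-range network and an MSIE 3 pattern.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_UA = re.compile(r"MSIE 3\.")


def block_reasons(ip, user_agent):
    """Return the set of reasons ('ip', 'ua') for which a request would be blocked."""
    reasons = set()
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        reasons.add("ip")
    if BLOCKED_UA.search(user_agent):
        reasons.add("ua")
    return reasons


def decide(ip, path, user_agent):
    """Unconditional allows win; otherwise any block reason means 'block'."""
    if ALWAYS_ALLOWED.match(path):
        return "allow"
    return "block" if block_reasons(ip, user_agent) else "allow"


def ua_block_overlap(requests):
    """Fraction of UA-blocked (ip, ua) pairs whose IP is blocked as well, or None."""
    ua_hits = [r for r in (block_reasons(ip, ua) for ip, ua in requests) if "ua" in r]
    if not ua_hits:
        return None
    return sum("ip" in r for r in ua_hits) / len(ua_hits)
```

Feeding ua_block_overlap the (IP, UA) pairs from a month of logs would show how much work the UA rules do beyond the IP lockouts.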

No shipping. I walked a mile down the road

That's what I meant. If the store had, instead, been a thousand miles away,* it would probably have cost a little more.

* I made up this number, but when I went to the globe and measured from Montreal to Iqaluit I found I was darn close. That's the shortest as-the-crow-flies distance; most would be considerably further, with fewer flights per week. Never realized that Montreal-Ottawa-Toronto are all in a line-- and all surprisingly close together-- so it was educational anyway.


 8:38 pm on Nov 18, 2013 (gmt 0)

Thanks for the camino info.

If Mac is still using MSIE 5.anything I would suggest it's dodgy. In fact I didn't think anyone used MSIE on a Mac any more. Firefox is much better/safer and even safari is better than MSIE 5.

Shipping - surely everyone buys at a local store? :)


 10:20 pm on Nov 18, 2013 (gmt 0)

I didn't think anyone used MSIE on a Mac any more

Nobody does, unless the Mac is VERY old. MSIE 5.whatsit was last produced in, I think, 2001. I don't know what Safari version you'd use if you're on OS <= 10.4. There are at least two parallel tracks: 5 for older OS (I'm on 10.6), 6 or possibly 7 for newer ones. It auto-updates periodically, so you should never see older versions of Safari unless it's a similarly ancient OS.

surely everyone buys at a local store?

Sure, if you define "local" as "less than a thousand miles away". And remember, things have to get to the store before anyone can buy them. That's why the cost of living in Hawaii is so high. Can't imagine consumer goods are all that cheap in Perth, either.


 9:15 pm on Nov 19, 2013 (gmt 0)

Thanks for the browser info, Lucy. Very useful to know what's out there.

Don't think I'll move to Hawaii, thanks. I'm in a good-size city in England - good-size by UK standards, that is - with plenty of useful shops. :)


 7:47 am on Nov 22, 2013 (gmt 0)


One is using FF 3.6 and Safari 4.1.3 quite contentedly on perfect G4 Macs.

I sample the latest Macs weekly, and am not at all impressed or tempted. Only the SSD drive feature is desirable.

I almost asked at the Apple forum if 2013 Macs still run OSX 10.4 but I would be mistaken for a troll.


 8:28 am on Nov 22, 2013 (gmt 0)

Hey, don't look at me, I was forced to get my current mac when the old one died. The last few before that were hand-me-downs (er, hand-me-ups?) from my son when he was in high school and bought a new computer every year. It was horrible to discover I can no longer use my mountains of lovely old Classic apps dating from 1991.* I don't know what I'll do when I am next compelled to upgrade and can no longer run Rosetta.

* I got Word 6, hated it so violently that I promptly reinstalled 5.1 ... and have never got another word processor since. I now do everything in html.

 10:42 am on Nov 22, 2013 (gmt 0)

Gulp! That is impressively humbling Lucy!


 10:57 am on Nov 22, 2013 (gmt 0)

:: staring unhappily at post and realizing that the cat appears to have typed [ code ] by mistake for [ small ] ::


 8:07 pm on Nov 23, 2013 (gmt 0)

Perhaps the cat should look after your computer security, Lucy. :)

If you need a word processor look at the OpenOffice (now LibreOffice) suite. Up to date, bug-fixed as soon as found, etc.

Since Apple launched the new Mac they have stopped updating Lion. This could prompt another threat scenario such as next April's XP-type one.

If you use the internet for anything, including web and mail, then keeping your software up to date is imperative.

Angonasec - get rid of the mac and install linux! :)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved