Now, I wonder whether a request for a file called "ga.js" (at any URL on my site) could be redirected to the real path, which is:
[google-analytics.com...]
Would that work, given that Google Analytics runs on the client side and the referrer comes from the page that was loaded on the client side?
I'm thinking that even with a redirect, which would resolve the 404 problem, we may still lose the GA tracking for those visits.
What do you think?
Thanks
That's all I can comment on, as the how/why of your clients attempting to load the analytics script from your site doesn't mean anything to me -- I don't know why they'd be doing that, or what the effects on the analytics data would be. While another member reading in this forum might know, I suspect that that is a question for another forum.
Jim
Thanks very much for your posts. They both sound just like what I see on my side; I'm just not savvy enough to tell real users from bots every time. I did see, though, that some of these 404s were happening on page after page with no pause, which could not be a live user.
What do you do about this?
BTW, in .htaccess, how do I write the redirect pattern so that it matches ga.js in "any folder", including the root?
Is ga\.js$ enough? Will that match any path? Or would the full line be:
RewriteRule ga\.js$ [site.com...] [R=301,L]
Thanks
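A quick way to see what a pattern like ga\.js$ would catch is to try it against some sample paths. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine (for a pattern this simple they should behave the same, but that is an assumption, not a test against Apache). The sample paths and the stricter (^|/)ga\.js$ variant are illustrative, not from the thread. In .htaccess, the pattern is compared against the path without its leading slash, so the samples are written that way.

```python
import re

# Pattern from the question: unanchored at the start, anchored at the end.
LOOSE = re.compile(r"ga\.js$")
# Hypothetical stricter variant: "ga.js" must be a whole path segment.
STRICT = re.compile(r"(^|/)ga\.js$")

# Sample paths (illustrative), written without the leading slash.
paths = [
    "ga.js",
    "google-analytics.com/ga.js",
    "dirA/google-analytics.com/ga.js",
    "js/mega.js",
    "ga.js.bak",
]

def matches(pattern, path):
    """Return True if the pattern is found anywhere in the path."""
    return pattern.search(path) is not None

for p in paths:
    print(f"{p:35} loose={matches(LOOSE, p)!s:5} strict={matches(STRICT, p)}")
```

So the short pattern does pick up ga.js in any folder, including the root, but it also picks up unrelated files whose names merely end in "ga.js". Whether that matters depends on what else lives on the site.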
Then there are the legitimate amateurs who want to build a search engine, but who lack the technical knowledge to fully implement the HTTP/1.1 protocol and all of its requirements. Let's face it: hardly anyone reads the specifications [w3.org]... :(
Jim
/google-analytics.com/ga.js
~OR~
/dirA/google-analytics.com/ga.js
/dirB/google-analytics.com/ga.js
~AND/OR~
/__utm.gif
(that's a double-underscore)
Now for the apparently not-just-coincidental part. Those specific files are always and only requested by users of a single ISP:
direcpc.com
Some months back, I e-mailed the legit poster asking if they knew what might be going on -- if they had some sort of plug-in or add-on (UAs: MSIE; FF) or always came from a specific link-to or a direcpc.com bookmark or whatever. I tried sparing them too much geek-speak but knew I'd failed when they pretty much replied, "Huh-wha?"
: )
If this isn't old news... Anyone else see something similar only with direcpc.com?
FWIW #2: An occasional 'plain' ga.js (/ga.js) URI typically comes from, oh, Afrinic hosts and the like, or chello.nl (a chronic source of all manner of troublemakers).
You may be on to something there. The common factor you are likely seeing is that these requests come from satellite-link ISP users.
DirecPC, Hughes, Starband/EchoStar/Gilat, etc. use proxy clients at their on-the-ground network operations centers to parse HTML pages and pre-fetch all the included objects (e.g. images and external JavaScripts). Then they send the page and all its objects to their client user via the satellite link.
This is done because otherwise the page would be sent to the user's browser, and the browser would then have to send subsequent requests for all the images and scripts on that page back over the satellite link. Considering that each round trip (the request going up to the satellite and down to the ISP's ground station, then the response coming back the same way) takes roughly .480 seconds, it's easy to see why these satellite ISPs want to pre-fetch stuff for their clients, and so use client proxies.
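The .480-second figure can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes a geostationary satellite directly overhead and counts only signal travel time; the altitude and speed-of-light constants are assumptions brought in for the calculation, not numbers from this thread, and real links add processing delay on top. The result comes out close to .480 seconds once the request's trip to the ground station and the reply's trip back are both counted.

```python
# Assumed constants (not from the thread): geostationary altitude and speed of light.
GEO_ALTITUDE_KM = 35_786
LIGHT_SPEED_KM_S = 299_792.458

def hop_seconds():
    """One trip up to the satellite and back down to the ground."""
    return 2 * GEO_ALTITUDE_KM / LIGHT_SPEED_KM_S

def round_trip_seconds():
    """Request to the ground station plus the response back to the user."""
    return 2 * hop_seconds()

def sequential_fetch_seconds(n_objects):
    """Extra link delay if the browser asks for n objects one after another."""
    return n_objects * round_trip_seconds()

if __name__ == "__main__":
    print(f"one hop:    {hop_seconds():.3f} s")
    print(f"round trip: {round_trip_seconds():.3f} s")
    print(f"20 objects fetched one by one: {sequential_fetch_seconds(20):.1f} s")
```

The one-by-one case is a worst-case simplification, since browsers open several connections in parallel, but it shows why paying that delay once per page instead of once per object is attractive.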
However, these client proxies are often buggy, and you'll sometimes see them fake-up referrers or just generally screw up. For example, if you redirect from non-www to www, you may still see them try to fetch images using the non-www as a referrer, because their client proxy hasn't actually followed the redirect before trying to fetch the images.
It's all very complicated, and it makes my brain hurt. But the bottom line is that satellite ISPs need a bit of 'forgiveness' in access-control filters.
Jim
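As one illustration of that kind of forgiveness, here is a minimal sketch of a referrer check that treats the www and non-www forms of a site as the same host, so a proxy that has not followed the non-www to www redirect is not locked out. The function names and the example.com host are hypothetical, and letting blank referrers through is an assumption of this sketch rather than something recommended in the thread.

```python
from urllib.parse import urlsplit

def bare_host(host):
    """Lower-case a hostname and drop a leading 'www.' label."""
    host = (host or "").lower()
    return host[4:] if host.startswith("www.") else host

def referrer_allowed(referrer, site_host):
    """Accept blank referrers and any referrer on the site, www or not."""
    if not referrer:
        return True  # assumption: requests with no referrer are let through
    return bare_host(urlsplit(referrer).hostname) == bare_host(site_host)

if __name__ == "__main__":
    site = "www.example.com"  # hypothetical site host
    for ref in ("http://example.com/page.html",
                "http://www.example.com/page.html",
                "",
                "http://other.example.net/"):
        print(repr(ref), referrer_allowed(ref, site))
```

A real filter would live in the server configuration rather than in a script; the point here is only the comparison logic.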