Forum Moderators: DixonJones
I hope not.
Remember the AVG LinkScanner debacle? I would guess that there is a similar problem here if you have a lot of followers on Twitter and they all use this prefetch system to expand short URLs.
This isn't restricted to users on Twitter, and it applies to any site that has a redirect pointing at it via a URL shortening service, with the short URLs being posted to multiple other sites, or being posted to some sort of feed/stream that is read by a lot of people.
Your site stats could show a lot more visitors than you genuinely received.
A service that just shows you what the expanded URL is, is likely to be safe.
However, recently I am seeing services that also tell you what the title of the page is. In that case I would guess that it has to prefetch it, to find out.
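To make the distinction concrete, here is a rough sketch in Python of the two behaviours. Everything in it is an assumption for illustration: both "sites" are throwaway local servers, and the names `Target`, `make_shortener`, `expand`, `fetch_title` and `demo` are made up, not taken from any real expander. Asking only the shortener for its redirect never touches the destination, while looking up the page title has to request the page itself.

```python
import http.client
import re
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

target_hits = []  # every path requested from the stand-in destination site


class Target(BaseHTTPRequestHandler):
    """Hypothetical destination site; records each request it receives."""

    def do_GET(self):
        target_hits.append(self.path)
        body = b"<html><head><title>Example page</title></head></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


def make_shortener(target_port):
    """Build a hypothetical shortener that redirects everything to the target."""

    class Shortener(BaseHTTPRequestHandler):
        def do_HEAD(self):
            self.send_response(301)
            self.send_header("Location", f"http://127.0.0.1:{target_port}/article")
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, *args):
            pass

    return Shortener


def start(handler):
    """Serve the handler on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def expand(host, port, path):
    """Ask the shortener where the link points; do not follow the redirect."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("HEAD", path)
    location = conn.getresponse().getheader("Location")
    conn.close()
    return location


def fetch_title(url):
    """Fetch the destination page itself to read its title (this is a real hit)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
    conn.request("GET", parts.path or "/")
    html = conn.getresponse().read().decode("utf-8", "replace")
    conn.close()
    match = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    return match.group(1).strip() if match else None


def demo():
    """Return (title, hits after expanding only, hits after fetching the title)."""
    target = start(Target)
    shortener = start(make_shortener(target.server_port))
    target_hits.clear()
    long_url = expand("127.0.0.1", shortener.server_port, "/abc")
    hits_after_expand = len(target_hits)
    title = fetch_title(long_url)
    hits_after_title = len(target_hits)
    for server in (target, shortener):
        server.shutdown()
        server.server_close()
    return title, hits_after_expand, hits_after_title
```

Calling `demo()` should show the destination untouched after `expand` and one logged request after `fetch_title`. Real expanders may well work differently (some follow the redirect with a GET even when only the URL is wanted), so this only illustrates why title display implies a fetch.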
I'd like to think I was wrong.
Anyone done any testing with different services?
There's a bucket load of those out there: many as browser extensions, GreaseMonkey scripts, and JavaScript bookmarklets, with others as stand-alone apps for AIR, Java, or other platforms. There's no separate UA to detect in the request, so this isn't as easy to detect as the AVG LinkScanner mess was.
It has the power to be just as devastating as the AVG problem though.
For instance, Power Twitter follows all the tiny links on your page and displays the actual page title, so if you have 100 Power Twitter users following you, you'll always get 100 hits from the get-go. You actually need to work out an average baseline for the automated tools and deduct it from your stats for that link.
Total mess, as the stats are totally useless for the most part.
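The baseline idea can be put as simple arithmetic. This is only a sketch: the function names and the numbers are made up, and it assumes that the hits arriving immediately after a link is posted are all from automated expanders, which won't be exactly true.

```python
def average_baseline(immediate_hits):
    """Average the hits seen right after posting, across several past links."""
    return round(sum(immediate_hits) / len(immediate_hits))


def estimate_human_clicks(total_hits, automated_baseline):
    """Deduct the automated baseline; never report a negative count."""
    return max(total_hits - automated_baseline, 0)


# Hypothetical figures: immediate hits on three earlier links, then a new
# link that logged 340 hits in total.
baseline = average_baseline([98, 103, 99])
print(baseline)                               # 100
print(estimate_human_clicks(340, baseline))   # 240
```

The baseline will drift as followers come and go, so it would need re-measuring every so often.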
So, how is this different to the mess caused by AVG LinkScanner? And why no widespread outrage about this? Surprised this thread has had so few replies.
This differs because:
a) AVG was scanning everything in search results, driving traffic to tons of sites in anticipation of a visit.
b) Most often the tinyurls are used at the request of the site owner, or by someone trying to drive traffic to the page; it's not random whatsoever.
Sure, your stats are still off, but your stats are bogus anyway, even if you use Google Analytics, which these tools don't impact but screenshot tools do fool.
True clean analytics is a pipe dream on the current internet; you can get close, but no cigar.
Yes, LinkScanner scanned all the results so you got a 'visit' even when the person didn't visit.
In the same way, if someone puts a short URL in a message sent on Twitter, all of the Twitter clients receiving that message would expand the short URL, thereby registering a 'visit' even though the person didn't click the link and actually visit.
Obviously there aren't many people using URL-expanders at the moment, but the minute these are added as core functionality this problem will become extreme.
Imagine someone like StephenFry sends such a link - within minutes you might register more than a quarter of a million visits. Could most servers cope with that load, and just how many of those were 'real' visits anyway?
Another problem is Google's prefetch of the top search result. Anyone analyzing Apache logs will be affected; it doesn't look like IIS logs or page tags are. In order to see which hits in your logs are prefetched, you have to modify Apache's logging; otherwise they are indistinguishable from human hits.
We analyze stats for some sites that consistently appear in the top slot AND have PPC ads running next to them. Anybody clicking on one of the PPC ads will show in the logs as having clicked on both listings. Anybody clicking on the prefetched natural listing will show as two hits to the same destination URL. Anybody not clicking at all will show as a hit and a visit.
Early indications are that about 3.5% of all our hits come from this Google prefetch. Not sure yet what the effects are on visits.
Anyway, as I said, it looks like Apache logs are affected and nothing else. I wonder if Google knew this when they started the prefetch feature.
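For anyone wanting to try the Apache logging change mentioned above, here is one possible sketch in Python. It rests on assumptions that are not from this thread: that the prefetching browser is Firefox, which marks prefetch requests with an `X-moz: prefetch` request header, and that the log format has been extended to record that header as a final quoted field. The `combined_xmoz` name, the `prefetch_share` function and the sample log lines are all made up for illustration.

```python
import re

# Assumed Apache directive (not from the thread): the standard "combined"
# format with the X-moz request header appended as a last quoted field.
#   LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{X-moz}i\"" combined_xmoz
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" '
    r'"(?P<agent>[^"]*)" "(?P<xmoz>[^"]*)"$'
)


def prefetch_share(lines):
    """Return (total hits, prefetched hits, prefetched fraction) for log lines."""
    total = prefetched = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that don't fit the assumed format
        total += 1
        if match.group("xmoz").lower() == "prefetch":
            prefetched += 1
    share = prefetched / total if total else 0.0
    return total, prefetched, share


# Made-up sample lines; Apache writes "-" when the header is absent.
sample = [
    '192.0.2.1 - - [10/Oct/2009:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=widgets" "Mozilla/5.0 Firefox/3.5" "prefetch"',
    '192.0.2.2 - - [10/Oct/2009:13:56:01 +0000] "GET / HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=widgets" "Mozilla/5.0 Firefox/3.5" "-"',
    '192.0.2.3 - - [10/Oct/2009:13:57:12 +0000] "GET /about HTTP/1.1" 200 512 '
    '"-" "Mozilla/4.0 (compatible; MSIE 7.0)" "-"',
    '192.0.2.4 - - [10/Oct/2009:13:58:44 +0000] "GET / HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 Firefox/3.0" "-"',
]

print(prefetch_share(sample))  # (4, 1, 0.25)
```

Pointing this at a real log (for example `prefetch_share(open("access.log"))`, with the file name being whatever your server writes) would give a rough prefetch percentage to compare against the 3.5% figure, though it only catches clients that actually send the header.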