

Hitslink going wrong?

That's The WPG one...

         

Receptional

5:23 pm on Jan 20, 2003 (gmt 0)



Those of you that use the WPG tracking system are using Hitslink. We are doing some worrying here...

In late November, Hitslink updated its software, but it now seems to be poorly identifying a daily unique - something it did quite admirably before.

This was really expensive for us for a while - we were getting visitors via PPC at one rate and LOSING MONEY because not all visitors were getting charged on to the customer. We are still trying to figure it out, but here's what we THINK the new system may be getting wrong:

If two people arrive on the same IP within 24 hours, they are identified as the same user. This means a University or company may well only be identified as one user. Previously it seemed dependent on the cookie.

Previously this did not seem to be the case - we are comparing with other log systems and have not yet decided whether it is Hitslink or the new IE6 default cookie settings, but either way, it's a problem.

bingymon

7:44 pm on Jan 20, 2003 (gmt 0)

10+ Year Member



What do Hitslink say about P3P compliance for their cookies with regard to IE6 cookie preferences?

fathom

7:27 am on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If two people arrive on the same IP within 24 hours, they are identified as the same user. This means a University or company may well only be identified as one user.

hmmm... networks, and proxies.

100 visitors on 100 workstations through the same IP would be 100 daily uniques.

Receptional

9:21 am on Jan 22, 2003 (gmt 0)



<100 visitors on 100 workstations through the same IP would be 100 daily uniques.>

Exactly. Only Hitslink seems to think it is one daily unique. Since the client in question pays on the number of daily uniques, based on hitslink, that is an expensive error.

<P3P> Can't see anything about this on the hitslink FAQs and have never had a technically proficient reply from support yet. Time to take my white label elsewhere methinks...

fathom

10:46 am on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Receptional I take it you are using the enterprise version since the standard package is based on pageviews.

Receptional

11:10 am on Jan 22, 2003 (gmt 0)



Yes - paying the cash. But the white label still seems to have the old interface (by old I mean two months), and still it seems the data is starting to cry foul.

Paying the cash on a LOT of sites as well. The page views data seems accurate, but returning visitors is becoming an unrealistic percentage on most sites.

So are you saying the freebie one doesn't record daily uniques? I am sure that we do have two sites that aren't in our white label which have the hitslink logo (so presumably free) that have daily uniques recorded. I don't want to pay for bad data.

Dixon.

fathom

11:21 am on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I will admit to having had problems numerous times as well, but found their support staff highly receptive, working through all problems to a very high standard of professionalism.

Quite rare these days, as many companies, once they have your money, couldn't care less - "customer service" usually being just a buzz word.

Receptional

11:41 am on Jan 22, 2003 (gmt 0)



Fathom, thanks - I will try hard again - we rarely get more than one line of reply to a support email, so thanks for that.

The product is (has been) excellent in many respects - I dumped Webtrends AND LiveStats to run with it. But if IE6 is changing their boundaries, then they and we need to react quickly.

Of course, the problem here is pinpointing exactly what the problem is - whether it is p3p or proxy servers or a bug in the new database.

Dixon.

Receptional

12:31 pm on Jan 22, 2003 (gmt 0)



WOW - I am gobsmacked at the reply.

To be very fair, support replied instantly and effectively. Guess what the issue was? hyphens in the account names!

So - just goes to show that trying to second guess on the web can often lead to bad assumptions.

Dixon.

Receptional

3:05 pm on Jan 23, 2003 (gmt 0)



Nope - that didn't fix it.

Here's my latest email to their support... For anyone who cares to listen.

Hitslink support:
Over the last 24 hours we have been removing hyphens from our account ids. However, this has not resolved the issue.

The issue must have arrived during the 27th of November. (Was this the day you updated your programme?)

Before that date, "New Visitors" on the repeat visitor chart was running at about 88%. By the 28th this was down to about 66%, which it continues to run at. If the hyphen had resolved the issue, then the new visitor % would have returned.

This is across several account IDs, the most expensive for us being XXXXX, and is clearly not correct when we start seeing that our PPC traffic that we are being charged for is in excess of ALL traffic on the hitslink logs.

What next to fix this very expensive problem?

fathom

5:58 pm on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have no idea here at all Receptional, but keep working with them.

As before... Hitslink is very receptive and, from my own experience, knows what customer loyalty means to them.

I have on various issues received full refunds on monthly fees (3 times) and a 10% cost reduction for life on my account(s) due to errors caused by their systems.

If you work with them - and the root cause is a problem at their end - I'm sure they will want to retain your business.

makemetop

7:45 pm on Jan 23, 2003 (gmt 0)



I am sure that Hitslink will work on this - but I'm not a customer! A client has determined that Hitslink should measure unique daily visitors and it appears to be singularly failing at the moment. As this client is Hitslink's customer (not me) I'm not sure that anyone is in any great rush to assist as under reporting is saving my client thousands a month! And I can't see Hitslink compensating me for my loss - or the client!

Anyway, I hope it is resolved soon. I am installing my own ssi tracking script - so at least if it goes wrong I've no-one to blame but myself!

Eric_Winter

6:52 am on Jan 24, 2003 (gmt 0)

10+ Year Member



Based on your description:
"The issue must have arrived during the 27th of November. (Was this the day you updated your programme?) Before that date, "New Visitors" on the repeat visitor chart was running at about 88%. By the 28th this was down to about 66%, which it continues to run at."

I agree with your assessment, Receptional. The fact that the change in your data was sudden and that it has been consistent since the time of the Hitslink update points to a change in the definition of a daily unique - if it were a change in cookie days or something, the data change would have been more gradual. Can you provide any more data that would help us deduce what might have changed in the definition? Are you seeing any other dramatic data changes between the 27th and 28th, e.g. in total visits?

I would generally say that the data pattern does not match an IE6 setting, since that would manifest itself as a more gradual shift in the data. One possibility is P3P: if their code update called a new file and that file didn't have a P3P policy (whereas the old one did), it could suddenly have been rejected by a segment of the population. Check the HTTP headers of the file called in their webbug to see if a compact policy is in place.
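
A minimal sketch of that header check, assuming the response headers for the tracking file have already been captured as a dict (for example from a header-dump tool). The function name and the sample headers are made up for illustration. The only fact relied on is that a P3P compact policy is sent in a `P3P` response header as `CP="..."`.

```python
import re

def compact_policy(headers):
    """Return the compact-policy tokens from a dict of response headers, or None."""
    for name, value in headers.items():
        if name.lower() == "p3p":  # header names are case-insensitive
            match = re.search(r'CP\s*=\s*"([^"]*)"', value, re.IGNORECASE)
            if match:
                return match.group(1).split()
    return None

# Hypothetical headers for a tracking image; not taken from any real service.
with_policy = {
    "Content-Type": "image/gif",
    "P3P": 'policyref="/w3c/p3p.xml", CP="NOI DSP COR NID"',
}
without_policy = {"Content-Type": "image/gif"}

print(compact_policy(with_policy))     # list of policy tokens
print(compact_policy(without_policy))  # None: no compact policy was sent
```

If the old tracking file returns a policy and the new one returns `None`, that would be consistent with the sudden-rejection theory, though it would not prove it.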

fathom

7:39 am on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



hmmm... talk about support!

Eric_Winter welcome to WebmasterWorld ;)

Receptional

11:02 am on Jan 24, 2003 (gmt 0)



Hi Eric. Glad this is being taken seriously. I think it makes sense to share the problem by sharing a set of stats. So I'll base observations around the stats at [hit-counter.net...]

I suggest selecting "expand menus" on the left before trying to follow our Observations:

Observation 1 (As described in previous posts): Select "repeat visitors" in the visitors column and see that (for example) on 23rd January 2003 the "New visitors" were 61% of all traffic. Now select 24th November 2002 in the date range. Back then new visitors were 92% of traffic! The change seemed to happen to many or all sites on or around 27th November.

Observation 2:
Select "Latest Visitors" from the visitors menu and maybe increase the list to the last 100. You will see that most visitors have one or two visits. A few have 5. But certain ones have 19 / 20 or even more. Certain colleges, and the NHS if you see it in the list, are up in the hundreds. pol.co.uk is an AOL proxy by the way, which is one of our culprits. Hence my assumption that AOL users and others are no longer being identified as uniques - if an IP number is coming to the site many times per day it seems to be identified wrongly as a repeat visitor. We dumped webtrends for the same issue years back.

The issue only becomes more apparent on busier sites, but this one has enough traffic to make the point.

Over to the Statisticians and hitslink developers...

Dixon.

makemetop

11:26 am on Jan 24, 2003 (gmt 0)



>pol.co.uk is an AOL proxy

Er, it's Freeserve actually :)

However, your points are correct.

Apparently I'm getting someone from NTL who has now found our site from a search engine referral for the 137th time!

In fact, checking into this a little further, it appears that a staggering 20% of all my visitors are now visiting for the 26th or more time and this percentage is growing daily. Whereas it was previously always around the 1 or 2 per day figure. So, this appears to be getting worse - day by day!

Eric_Winter

1:06 pm on Jan 24, 2003 (gmt 0)

10+ Year Member



If that's the case - that the problem is getting worse and worse each day - then I think that points more to a change in the number of cookie days (or more precisely cookie expiration, since it sounds like it might have been set for less than a day before) than to a change in the definition of a unique daily, i.e. before it was set for an hour and now is set for a month... or was not used at all in the extrapolation of this datum.

Receptional

2:55 pm on Jan 24, 2003 (gmt 0)



20% is about right...

Which is probably about the percentage of users in the UK using proxy-based connections through major internet providers, would you guess?

POL.co.uk ... I'll check when we've figured out how to fix the main issue. Should be talking to Jon at Hitslink by phone when he wakes up.

Dixon.

Receptional

5:58 pm on Jan 24, 2003 (gmt 0)



Still waiting to speak to him... 6:00 PM Friday night here now.

Hopefully he is scratching his head to work it out, but who knows.

Dixon.

melissa99

7:24 pm on Jan 24, 2003 (gmt 0)

10+ Year Member



Hello, I work for HitsLink. I do not normally monitor this forum, but I had this forum link forwarded to me from a member of our company that read this thread.

I appreciate reading an open discussion on HitsLink - It gives me more insight to what our users are doing and what issues they face.

I would like to clear up one important thing - the price is based on page views only, not visitors. If there is an issue with higher visitor counts, it will not affect your price.

Also, you are correct about the algorithm change in November. We have a multi-step process to determine uniqueness, which I will explain here:

- Every page view drops a cookie with a unique visitor id
- On subsequent page views, we use the cookie value, if it exists
- If the cookie does not exist (due to a variety of reasons), we use the IP address for 24 hours
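
The three steps above can be sketched roughly as follows. This is an illustration only, not HitsLink's code: the function name, the tuple layout, and the choice to start the 24-hour window at the first cookie-less hit from an IP are all assumptions.

```python
from datetime import datetime, timedelta

IP_WINDOW = timedelta(hours=24)  # the "24 hours" from the description above

def count_uniques(page_views):
    """Count unique visitors from (when, ip, cookie_id_or_None) tuples."""
    seen_cookies = set()
    ip_window_start = {}  # ip -> time the current 24-hour window began
    uniques = 0
    for when, ip, cookie in sorted(page_views, key=lambda view: view[0]):
        if cookie is not None:
            # Cookie present: the cookie value identifies the visitor.
            if cookie not in seen_cookies:
                seen_cookies.add(cookie)
                uniques += 1
        else:
            # No cookie: fall back to the IP address for 24 hours.
            started = ip_window_start.get(ip)
            if started is None or when - started >= IP_WINDOW:
                ip_window_start[ip] = when
                uniques += 1
    return uniques

start = datetime(2003, 1, 24, 9, 0)
proxy_ip = "192.0.2.1"  # documentation-range address, purely illustrative

# 100 different people behind one proxy, one page view each, a minute apart.
with_cookies = [(start + timedelta(minutes=i), proxy_ip, f"visitor-{i}") for i in range(100)]
cookies_blocked = [(start + timedelta(minutes=i), proxy_ip, None) for i in range(100)]

print(count_uniques(with_cookies))     # 100
print(count_uniques(cookies_blocked))  # 1
```

Under these assumptions, the same 100 people count as 100 uniques when their cookies get through and as a single unique when they do not, which is the proxy effect discussed earlier in the thread.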

While not perfect (it is impossible to be perfect in this regard), it is a method we adopted due to:

- The proliferation of cookie-blocking firewalls
- The increase in cookie-blocking browsers

The toughest situation to detect unique visitors, which is unfortunately becoming more prevalent, is when an organization has both a cookie-blocking firewall AND uses a proxy.

The last time I checked, cookies were being blocked on around 6-7% of visitors, which is a significant increase.

I would like to take a look at some of your accounts to view the behavior - could you reply with your account ids?

Receptional

10:47 am on Jan 27, 2003 (gmt 0)



Melissa, thanks for dropping in.

I have emailed the main account IDs that are hurting through the internal email system.

You said: Yes, the algorithm changed and you measure daily uniques by:
- Every page view drops a cookie with a unique visitor id
- On subsequent page views, we use the cookie value, if it exists
- If the cookie does not exist (due to a variety of reasons), we use the IP address for 24 hours.

If that is how you measure it now, how did you measure it before the algorithm change? For quieter sites this method will work, but when the traffic hits 4,000 page views per day, all targeted at UK users, then treating an IP number as a unique user starts to become absurd, as many MAJOR ISPs re-use the IP number time after time as a dynamic IP. Moreover (and I have to guess here), when someone clicks on a PPC result from a popular search, some ISPs seem to think hmm - cached on this IP number - I'll deliver the cached version. The result - we record an extra page view but not an extra visitor. Whether my interpretation is right or wrong, it is clear that on busy sites the stats are significantly out, whilst on quieter sites the error is not noticeable.

You also pointed out that this discrepancy was not costing us money. I can assure you that I know of five companies including ourselves where it is costing us thousands of dollars since we made deals with clients based on the old algorithm. I accept that you charge on page views, not uniques. However, WE are obliged to charge on daily uniques. This means we are now paying for Overture and similar traffic that we cannot charge on. Since Overture traffic almost always comes from the very ISPs that are using dynamic IPs, You can see the issue.

I would have thought that treating an IP number as unique for a shorter period would at least be more accurate. However, on really heavy sites I bet this would fall apart totally. I can't see the stats for MSN, but my guess is that some IP numbers are continually hitting the site even though every other hit is from a different user logging on with the same default home page. Makemetop's 137 visits from an NTL user VIA the same search engine referral seems incredibly unlike a cookie tracking issue.

Perhaps another question in your logic is this - if about 6-7% of users are not accepting cookies, why has the daily "new visitors" as a percentage of "returning visitors" dropped from the high eighties to the mid fifties percent?

Anyway - can you confirm how the new algorithm has changed from the previous algorithm please? At least then we can have meaningful discussions with our clients about how we re-set the value of a daily unique.
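
A rough sketch of that trade-off, with made-up traffic. It assumes every arrival is a different person with no cookie, all behind one shared IP, and that the window restarts at the first arrival after it expires. None of this is known to match the real system.

```python
def counted_visitors(arrival_minutes, window_minutes):
    """Count how many cookie-less arrivals from ONE shared IP are treated as new."""
    counted = 0
    window_start = None
    for minute in sorted(arrival_minutes):
        if window_start is None or minute - window_start >= window_minutes:
            window_start = minute  # window expired: this arrival counts as new
            counted += 1
    return counted

DAY = 24 * 60
quiet = list(range(0, DAY, 30))  # 48 different people, one every 30 minutes
busy = list(range(0, DAY))       # 1440 different people, one per minute

for window in (DAY, 60, 30):
    print(window, counted_visitors(quiet, window), counted_visitors(busy, window))
```

In this toy model a 30-minute window recovers all 48 visitors on the quiet pattern, but on the busy pattern it still counts only 48 of 1440, so shortening the window helps most where the problem was smallest.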

Dixon.

makemetop

11:18 am on Jan 27, 2003 (gmt 0)



>Makemetop's 137 visits from an NTL user VIA the same search engine referral...

My point was that it was 136 visits from an NTL user using different search engine referrals - hardly likely, I think! More likely to be 136 different NTL users coming in using the same IP over a 24 hour period.

Receptional Andy

11:32 am on Jan 27, 2003 (gmt 0)



>While not perfect (it is impossible to be perfect in this regard), it is a method we adopted due to:
>
>- The proliferation of cookie-blocking firewalls
>- The increase in cookie-blocking browsers

A couple of points on this. We were able to identify the time of the problem as at the end of November, as this is when the stats started to become inaccurate. We were right in this supposition.
So, what changed in November? And whatever it was that changed, were the improvements enough to compensate for the fact that a number of people are now seeing inaccurate numbers?

When I first tested hitslink, its ability to track proxies and unique visits accurately (as opposed to server log files) was what impressed me the most. This is now the very thing that's going wrong. My conclusion would be - use the old algorithm again, because it seemed to be better at achieving the objectives mentioned above than the new one...

NFFC

11:37 am on Jan 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>My point was that it was 136 visits from an NTL user using different search engine referrals - hardly likely

The NTL users all come through an Inktomi proxy server, which only shows the server's IP address, not the user's. Could this explain it?

Receptional Andy

11:44 am on Jan 27, 2003 (gmt 0)



The Inktomi proxy was what I thought of. It must depend on connection type though, because there are also users from NTL.com who don't show this repeat visitor problem.

NFFC

11:47 am on Jan 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>connection type

At least all the cable modem users use the proxy for sure.

Receptional Andy

11:53 am on Jan 27, 2003 (gmt 0)



Weird thing is, we are currently using an NTL cable modem connection - visits from PCs on our network show up as repeats, but we're definitely not added to the total for other NTL cable users.

Receptional

11:53 am on Jan 27, 2003 (gmt 0)



Makemetop - I have similar anomalies. On my ferry site I just selected "repeat visitors" then selected "Month" (January). The table is:


Number of Visits--------Daily Unique Visitors
New Visitors------------11167
From 2-5 Visits---------2606
From 6-10 Visits--------569
From 11-25 Visits-------633
From 26-100 Visits------2931
Over 100 Visits---------0

So how can it be that I have 633 daily uniques this month from people coming up to 25 times, but 2931 from people coming 26 times plus? It can only be repeating IP numbers from different users. Statistically implausible, and I am betting that 100 days after 17th November 2002 that final "over 100 visits" will start jumping up day by day unless hitslink go back to the old calculation, whatever it was.

Of course, that 2931 is almost certainly many more than 2931 daily uniques who have only visited once or twice, but as the "repeat count" is set to 24 hours, we have no way to extrapolate back.

The comparable table for November was MUCH more believable:

Number of Visits------Daily Unique Visitors
New Visitors----------18417
From 2-5 Visits-------2435
From 6-10 Visits------138
From 11-25 Visits-----196
From 26-100 Visits----83
Over 100 Visits-------0

Statistically speaking a lovely looking normal distribution curve I'm willing to bet.

Ergo - null hypothesis: The algorithm was correct before the update and is incorrect now.
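
As a quick check on the two tables above, the share of each bucket can be worked out directly. The figures are copied from the tables; the dictionary and function names are just for illustration.

```python
january = {
    "New Visitors": 11167,
    "From 2-5 Visits": 2606,
    "From 6-10 Visits": 569,
    "From 11-25 Visits": 633,
    "From 26-100 Visits": 2931,
    "Over 100 Visits": 0,
}
november = {
    "New Visitors": 18417,
    "From 2-5 Visits": 2435,
    "From 6-10 Visits": 138,
    "From 11-25 Visits": 196,
    "From 26-100 Visits": 83,
    "Over 100 Visits": 0,
}

def share(table, bucket):
    """Percentage of all daily uniques in the table that fall in one bucket."""
    return 100 * table[bucket] / sum(table.values())

for name, table in (("November", november), ("January", january)):
    print(
        name,
        round(share(table, "New Visitors"), 1),        # 86.6 for November, 62.4 for January
        round(share(table, "From 26-100 Visits"), 1),  # 0.4 for November, 16.4 for January
    )
```

The new-visitor share falls from about 87% to about 62%, in line with the 88% and 66% figures quoted earlier, while the 26-100 bucket grows from under half a percent to roughly a sixth of all daily uniques.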

Dixon.

makemetop

12:55 pm on Jan 27, 2003 (gmt 0)



Doing the same my figures are:

January:

New Visitors 8726
From 2-5 Visits 1795
From 6-10 Visits 416
From 11-25 Visits 851
From 26-100 Visits 2094
Over 100 Visits 37

November:

New Visitors 17903
From 2-5 Visits 2112
From 6-10 Visits 116
From 11-25 Visits 148
From 26-100 Visits 397
Over 100 Visits 0

Something is certainly wrong here!

melissa99

4:05 pm on Jan 27, 2003 (gmt 0)

10+ Year Member



Yes, I can see that the numbers appear off. I will dig into this and see what I can find out.