Forum Moderators: DixonJones
We used ClickTrack before and it was a similar situation: it tracked 50-80%. Is that normal? How much can your web analytics tool track?
For example, differences that large would be consistent with tracking via log files and the referrer field and using the visits statistic, while the venue is tracking clicks.
Thanks for your quick response. We use tracking codes in all of our campaign URLs, so whenever a visitor clicks one and comes to our site, the visit should be tracked. We look at visits, click-throughs, conversions (orders) and revenue.
Take Google AdWords for example: WT tracks 109% of the clicks (I guess this is because we are not billed for multiple clicks by the same visitor within a short period of time). Yet WT only tracked 73% of the conversions that AdWords showed us for the month of May. It is similar for our Overture campaign. What do you think could cause this?
You mean you place your URL like http://example.com/product.php?gtse=goog&gtkw=key+word
where gtse indicates the Search Engine where you have advertised your website. This is different for each search engine. For example for Google USA it is goog and for Overture/Yahoo USA it is ovus.
gtkw indicates the related keyphrase for the specific ad campaign. This needs to be attached to the rest of the string.
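As an aside, a tagged URL like that can be pulled apart trivially on the server side. A minimal Python sketch, assuming the gtse/gtkw scheme described above (`parse_campaign` is a hypothetical helper, not part of any tracking product):

```python
from urllib.parse import urlparse, parse_qs

def parse_campaign(url):
    """Extract the search-engine code (gtse) and keyphrase (gtkw)
    from a tagged landing-page URL; (None, None) when untagged."""
    params = parse_qs(urlparse(url).query)
    return (params.get("gtse", [None])[0],
            params.get("gtkw", [None])[0])

# A Google USA ad for the phrase "key word":
engine, phrase = parse_campaign(
    "http://example.com/product.php?gtse=goog&gtkw=key+word")
# engine is "goog", phrase is "key word" (parse_qs decodes the "+")
```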
And then what are you doing?
If the fraud clicker has disabled JS and the referrer, then you need a better tracking system and a better engineer :). I know how difficult it was when we worked on it.
AjiNIMC
[edited by: engine at 2:03 pm (utc) on July 4, 2006]
[edit reason] examplified [/edit]
I use Monster Commerce for my ecommerce platform and was wondering if anyone uses a plug-in program to better analyze on-site traffic.
The numbers I get from the Urchin that Monster Commerce provides and from Google Analytics are very different.
Just trying to find out if there is something out there people have used and really like. Thanks.
Thank you for your help.
We put the tracking link like this:
http://example.com/product.php?WT.mc_id=campaignID
For example, the link for our iPod campaign in Google AdWords would look like this: http://example.com/product.php?WT.mc_id=googleadwordsIPOD
And we put a JavaScript tag on our web pages for tracking. WebTrends gives us several session-tracking options to choose from, such as "Track Sessions for Logs", "Track Sessions Using First Party Cookies", and "Track User Sessions using IP/User Agent".
We choose to track by using first party cookies.
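For context, "first party cookies" tracking boils down to the site issuing its own visitor-id cookie and reusing it on later requests. A rough sketch with an invented cookie name (this is not WebTrends' actual cookie or logic):

```python
import secrets

COOKIE_NAME = "WTVID"  # illustrative name only, not WebTrends' real cookie

def identify_visitor(request_cookies):
    """Return (visitor_id, set_cookie_needed): reuse the id from our
    own first-party cookie if present, otherwise mint a fresh one and
    signal that the response should set the cookie."""
    vid = request_cookies.get(COOKIE_NAME)
    if vid:
        return vid, False
    return secrets.token_hex(8), True
```

Because the cookie is set on the site's own domain, it survives most third-party cookie blocking, which is presumably why that option undercounts less than IP/user-agent matching.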
About the fraud clickers: do you mean that if they disable JS, we can't see those visits/clicks in the WebTrends report, but AdWords will still count those clicks, so there will be a gap? If so, I don't think there is anything we can do about it for now.
[edited by: engine at 2:04 pm (utc) on July 4, 2006]
[edit reason] examplified & de-linked [/edit]
If somebody clicks on an ad to get to your site then goes back to the search page after their visit, and clicks again, Google counts it as two while WT counts it as one, i.e. the one that happened at the very beginning of the visit. I've seen the difference be as low as a few percent and as high as 20 percent. It's worth a look.
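That click-versus-visit difference is easy to see in code. A toy sketch of how a page-tag tool collapses repeat clicks into visits (the 30-minute session window is an assumption; WebTrends' actual rules may differ):

```python
SESSION_TIMEOUT = 30 * 60  # seconds; assumed visit window

def visits_from_clicks(click_times, timeout=SESSION_TIMEOUT):
    """Collapse a visitor's click timestamps into visits: a click
    within `timeout` seconds of the previous one joins the same
    visit, while the ad network bills every click separately."""
    visits, last = 0, None
    for t in sorted(click_times):
        if last is None or t - last > timeout:
            visits += 1
        last = t
    return visits

# Two ad clicks 5 minutes apart: the network bills 2 clicks,
# but this counts 1 visit.
```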
http://example.com/product.php?WT.mc_id=campaignID
About the fraud clickers: do you mean that if they disable JS, we can't see those visits/clicks in the WebTrends report, but AdWords will still count those clicks, so there will be a gap?
If so, I don't think there is anything we can do about it for now.
[edited by: engine at 2:06 pm (utc) on July 4, 2006]
[edit reason] examplified & de-linked [/edit]
You'd have to set up WebTrends to capture the same kind of latent effects, and they'd both have to have the same latency period i.e. number of days before the ad no longer will get credit for a purchase. Also I think WebTrends will only give credit to a visitor's most recent campaign. Google on the other hand doesn't care or know about other campaigns that the visitor may have responded to since the Google Adwords visit.
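The last-campaign-wins behaviour described above can be sketched as a tiny last-touch attribution function (the campaign IDs and dates are made up for illustration):

```python
def last_touch(campaign_history):
    """Credit a conversion to the most recent campaign touch, the way
    the analytics tool is described as behaving; each ad network, by
    contrast, claims any touch inside its own cookie window."""
    if not campaign_history:
        return None
    # history entries are (day, campaign_id) pairs
    return max(campaign_history, key=lambda touch: touch[0])[1]

# Clicked AdWords on day 1 and an Overture ad on day 5, converted day 6:
history = [(1, "googleadwordsIPOD"), (5, "overtureIPOD")]
# last_touch(history) credits only "overtureIPOD", while both networks
# would report the conversion inside their 30-day windows.
```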
Yes, we are aware that our WebTrends setting tracks the last source (the last tracking code the customer clicks before entering our site), while Google AdWords and the other campaigns track whoever clicks the campaign link (and usually the cookie duration is 30 days). But do you think this could account for a 20-50% difference in conversions?
Ah. Can you explain that more? I need education.
They visit the page using
Also, these fraud clickers are sometimes hired by your competitor, who is aware of most of your campaigns. Your loss is his/her gain. Say you run a campaign with Overture and another with Google: this clicker army will click only a few times every day, with various combinations.
To explain everything I would have to write a few more paragraphs, but there is a continuous fight between your tracking company and the fraud clickers. We studied clock skew, various component identification and many other things to track them properly.
If you have a traffic tool that is only trying to analyze actual visitors, not robots, then Google Analytics and other traffic tools that track from the web page are probably closer to the truth.
There are a lot of stealth spiders and web scrapers out there that use real browser user agents. On a report from a natively hosted log analyzer they would show up as visitors, with LOTS of page views, when in fact they are spiders and scrapers of ZERO VALUE.
Those stealth spiders and scrapers also don't run javascript, which may make you think the quantity of visitors with disabled javascript is higher than normal.
That is where we need AI. When we designed our model we separated out one module as the "brain". The brain consumed all the data and gained intelligence every single day; only an intelligent system can distinguish such visits. In fact we defined various layers of intelligence for the brain.
AjiNIMC
AI systems can't distinguish between a person and a program that has been designed to create clickstreams that people would create. This is (among other reasons) why the click fraud problem is so difficult.
Hate to burst your bubble, but it's as simple as interjecting a CAPTCHA of sorts after X number of page views. The bots can't respond to the CAPTCHA, they are busted, problem solved.
I've been doing this for 6 months now and it works like a charm.
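A minimal sketch of that idea, assuming an in-memory counter keyed by client IP (the threshold, the storage and the CAPTCHA itself are all stand-ins; a real deployment would expire counters and serve an actual challenge page):

```python
from collections import defaultdict

class ChallengeGate:
    """Count page views per client and flag the client for a CAPTCHA
    once it crosses the limit; verified clients are never re-challenged."""
    def __init__(self, limit=50):  # illustrative threshold, tune per site
        self.limit = limit
        self.views = defaultdict(int)
        self.verified = set()

    def should_challenge(self, client_ip):
        if client_ip in self.verified:
            return False
        self.views[client_ip] += 1
        return self.views[client_ip] > self.limit

    def mark_verified(self, client_ip):
        """Call when the client solves the CAPTCHA."""
        self.verified.add(client_ip)
```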
For PPC campaigns there is a setting that does something similar, based on a different algorithm.
Not at my customers' risk, and when it comes to a general tracking tool I have not seen anyone doing this. I would rather let 100 bots go unnoticed than trouble one customer/visitor.
I would suggest redefining the concept of "customer risk" as allowing a 3rd party to use your own information against you in the search engines allows customers to find THEM before they find YOU. This happens every day, it's what AdSense has caused by incentivizing scraper sites to spam the search engines.
Besides, out of 15K visitors a day, less than 100 typically get challenged for behaving like a robot and most of them turn out to be robots.
Trust me, I don't want to annoy my customers either, and try to minimize it, but when you've had your big cash keywords hijacked a couple of times by AdSense scrapers or have been redirect hijacked by a proxy server, you'll be singing a different tune.
That's just one example; there are hundreds of flaws with web analytics, even the ones using AI.
Hate to burst your bubble, but it's as simple as interjecting a CAPTCHA of sorts after X number of page views. The bots can't respond to the CAPTCHA, they are busted, problem solved.
I am referring specifically here to systems that analyze clickstreams, not systems that have different user experiences.
When I say AI, I mean a system designed to track trends and derive the needed conclusions. It can never be 100% accurate, but it can certainly filter out a good percentage.
Yes, that's what I thought you meant. There are certainly patterns that an AI system can be taught to recognize, such as repeated IPs (or accesses concentrated within IP blocks), dictionary attacks, and the like. The problems begin when the usage pattern is not so predictable, such as that which some casual user does. The fraudsters are aware of this, and design their bots to operate as if a human being was generating the traffic.
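The easier of those patterns, repeated IPs and clicks concentrated within an IP block, amount to a plain frequency check. A sketch with invented thresholds (real systems weigh many more signals than this):

```python
from collections import Counter

def suspicious_sources(click_ips, ip_limit=10, block_limit=50):
    """Return (hot_ips, hot_blocks): single IPs with more clicks than
    ip_limit, and /24 blocks whose combined clicks exceed block_limit."""
    per_ip = Counter(click_ips)
    per_block = Counter(ip.rsplit(".", 1)[0] for ip in click_ips)
    hot_ips = {ip for ip, n in per_ip.items() if n > ip_limit}
    hot_blocks = {b for b, n in per_block.items() if n > block_limit}
    return hot_ips, hot_blocks
```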
Yup, and I've been stopping a lot of the fraudsters, they aren't as clever as they think.
Allowing open PROXY servers and CGI PROXY sites access to your server is a major point of vulnerability as those playing games don't want to be caught so you need to block 'em all.
And proxy servers explain:
# Different IPs
# Different browsers
# JS disabled, Cookies disabled
A single person or automated process using proxies can pull this off, which is why they should be blocked.
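One cheap, partial signal for the proxy case is the extra headers many open proxies add to requests. A sketch (the header list is illustrative and incomplete, and anonymous proxies strip these headers, so their absence proves nothing):

```python
# Headers commonly added or forwarded by HTTP proxies.
PROXY_HEADERS = {"via", "x-forwarded-for", "forwarded", "proxy-connection"}

def looks_proxied(request_headers):
    """True if any header name suggests the request passed through a
    proxy; blocking on this alone would also hit legitimate corporate
    proxies, so treat it as one signal among several."""
    return any(name.lower() in PROXY_HEADERS for name in request_headers)
```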
Yup, and I've been stopping a lot of the fraudsters, they aren't as clever as they think.
I'm sure you have caught some fraudsters. What I'm trying to say is that there is fraud that no one can catch merely by looking at traffic logs, click logs, etc., because it looks like ordinary traffic. Perhaps you can catch them if they're engaged in some other type of fraudulent activity and the authorities are tipped off about it.
Allowing open PROXY servers and CGI PROXY sites access to your server is a major point of vulnerability as those playing games don't want to be caught so you need to block 'em all.
There are fraudsters who are aware of this, and set up their own proxy servers.
And proxy servers explain:
# Different IPs
# Different browsers
# JS disabled, Cookies disabled
Are you going to block a user agent you never heard of?
I already do that, I whitelist all access to my server based on user agents and MSIE, FireFox and Opera get thru as well as 5 major search engines and EVERYTHING else is bounced.
No, I don't sit around blocking things; they simply never get to my content in the first place unless I let them. I get a report of anything new trying to access my site and review who/what they are to see if I should let them in next time.
Only several major search engines and well behaved "browsers" are allowed.
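That whitelist policy might look roughly like this; the token list is illustrative, not the poster's actual configuration, and (as pointed out further down the thread) user agents can be faked, so this is only a first filter:

```python
# Substrings of user agents allowed through; everything else is bounced.
ALLOWED_UA_TOKENS = ("MSIE", "Firefox", "Opera",
                     "Googlebot", "Slurp", "msnbot")

def ua_allowed(user_agent):
    """True if the user agent matches a whitelisted browser or crawler."""
    return any(token in user_agent for token in ALLOWED_UA_TOKENS)
```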
There are fraudsters who are aware of this, and set up their own proxy servers.
Which is also why my web site checks all inbound IPs for open proxies, as well as blocking the known lists, and I block accesses from many server hosting farms too, for the very reason that people use cheap hosting to put automated tasks and proxies online.
Basically, if it's coming from an IP address that humans don't use, like hosting farms, or where it's vulnerable like proxies, I block it.
I whitelist all access to my server based on user agents and MSIE, FireFox and Opera get thru as well as 5 major search engines and EVERYTHING else is bounced.
I imagine this policy would reduce some click fraud but it doesn't work in the general case, and you could very well be turning away legit, even potentially converting traffic.
No, I don't site around blocking things, they simply never get to my content in the first place unless I let them. I get a report of anything new trying to access my site and review who/what they are to see if I should let them in next time.
This doesn't scale for a heavily trafficked site.
Which is also why my web site checks all inbound IP's for an open proxy as well as blocking the known list, and I block accesses that come from many server hosting farms as well for the very fact people use cheap hosting to put various automated tasks and proxies online.
IP addresses change hands. Companies are bought and sold; infrastructures are merged. What might be registered to a server farm one day might be registered to a block of broadband users the next day.
Basically, if it's coming from an IP address that humans don't use, like hosting farms, or where it's vulnerable like proxies, I block it.
How does one determine what is an IP address that humans don't use? (I'm restricting this to the set of addresses that are on publicly routable networks, not the private or link-local address spaces.)
User agents can be easily faked.
That's why the search engines are whitelisted by IP address.
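IP whitelisting of a crawler is usually done with the double DNS check the engines themselves recommend: reverse-resolve the IP, check the hostname's domain, then forward-resolve the hostname and confirm it maps back to the same IP. A sketch with the resolvers passed in so the logic runs without network access (the hostnames below are made up):

```python
def verify_crawler_ip(ip, expected_suffix, rdns, fdns):
    """Double reverse-DNS verification: the IP's PTR hostname must end
    with the engine's domain (e.g. ".googlebot.com"), and that hostname
    must forward-resolve back to the same IP. In production `rdns` and
    `fdns` would wrap socket.gethostbyaddr / socket.gethostbyname_ex."""
    host = rdns(ip)
    if not host or not host.endswith(expected_suffix):
        return False
    return ip in fdns(host)
```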
This doesn't scale for a heavily trafficked site.
I get 15K visitors a day, and increasing. Would it work for 1M visitors? No clue.
How does one determine what is an IP address that humans don't use?
It's not too hard to determine these things, even with IPs changing hands, but it's too complex to go into and will be off topic for this thread.
Trust me, been doing this almost a year now, and the false positives are VERY low, definitely less than 0.1%