First of all, some background: we manage the PPC campaigns for a number of other companies, and we obviously need to track them in order to get ROI information, monitor click fraud, etc. At present we use a third-party company for tracking, but we are not entirely satisfied with it, which is why we are considering developing our own solution. We have experience analysing tracking data and some understanding of how the third-party solution we currently use works. But it seems that building a tracking system, even though it looks simple on the surface, has a lot of subtleties that you need to get right.
For instance: since we need a single system for tracking all our clients' websites, it can't be hosted on the servers that host the sites themselves (client A might be running Apache/mod_perl, while client B might have IIS 6.0 and ASP.NET). So we would need to host our tracking system on a separate, central server and ask our clients to modify their pages to make requests to it (via img or script tags), much as most third-party tracking solutions work. This then introduces the issue of cookies. I think we have two options:
a. Use third-party cookies. The problem is that a lot of browsers block these; IE6 blocks them by default unless you send a valid P3P compact privacy policy. Does anyone know how easy it is to set this up so that our cookies don't get blocked? (I've put a rough example of what I think is involved after option b below.)
b. I've come up with this idea, but I'm not sure whether it would work. We could ask our clients to point a subdomain at our tracking server. So www.clientA.com points to the client's website, and tracking.clientA.com points to our tracking server. Our tracking server would then be able to set cookies for .clientA.com, which would not be blocked (sketch below). Would something like this work? One potential issue is how to add the tracking tags to secure pages of the client's website (e.g. conversion pages). Would we need to get an SSL certificate for each client?
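For option a, as far as I understand it, IE6 wants a P3P header with a compact policy on the responses that set the cookie. Something like this, though the exact tokens would have to reflect our real privacy policy, so treat it purely as a placeholder:

P3P: CP="CAO PSA OUR"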
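And for option b, if tracking.clientA.com really does resolve to our tracking box, the response to the tag request could simply set a first-party cookie scoped to the client's domain, roughly like this (cookie name and value are made up):

Set-Cookie: trkid=12345.67890; domain=.clientA.com; path=/; expires=Wed, 28-Jun-2006 12:00:00 GMT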
Another question:
As our clients are currently using the third-party solution I mentioned above, their URLs in the PPC engines already contain the keywords they are bidding on. So if a client is bidding on the keyword/listing "blue widgets" and a user searches on those terms and clicks on our client's ad, they will end up at, say, http://www.example.com/?kw=blue+widgets
Now let's say that the client has added the following code to their homepage:
<script language="JavaScript" type="text/javascript" src="http://tracking.example.com/track?client=exampleA"></script>
This makes a request to our tracking server, so we know that Example's homepage has been visited. But how do we find out the keyword? Is our only option to use the referrer URL? I'm asking because it appears that a number of firewalls etc. strip the referrer. If our tracking application were running on the same server as the client's website, we would simply look at the web server log and find the URL there.
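One idea we've had (I don't know whether this is how the 3rd party tags do it) is to have the tag itself pass the landing-page URL, which already carries the kw= parameter, back to the tracking server in the query string, so we wouldn't depend on the referrer at all. A rough sketch, with made-up parameter names (url, ref):

<script type="text/javascript">
  // Pass the page's own URL (and the referrer, when available) explicitly,
  // instead of relying on the Referer header surviving firewalls/proxies.
  var px = new Image(1, 1);
  px.src = "http://tracking.example.com/track?client=exampleA"
         + "&url=" + encodeURIComponent(window.location.href)
         + "&ref=" + encodeURIComponent(document.referrer);
</script>

The tracking server would then pull kw=blue+widgets out of the url parameter rather than out of the Referer header.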
Does anyone have answers to the questions above?
Can anyone think of any other technical challenges that we may encounter while building a tracking system?
Thanks in advance, and I might post more questions here as I come up with them. Hopefully we will ask (and get answers to!) all the important questions before we actually build the system and discover it doesn't work.
We sometimes have trouble getting our clients to insert 3 lines of HTML correctly in their pages so that we can track them. Imagine what it would be like to get them to install and configure a piece of software.
If there are any products like Urchin that don't need access to the web server logs and can sit on their own server, please let me know.
I am guessing a javascript include, with a failover to a 1x1 transparent image if javascript is disabled.
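The usual pattern is roughly this (URLs are just placeholders):

<script type="text/javascript" src="http://tracking.example.com/track.js?client=exampleA"></script>
<noscript>
<img src="http://tracking.example.com/track?client=exampleA&amp;js=0" height="1" width="1" alt="" />
</noscript>

The noscript image still records the visit when JavaScript is off; you just lose whatever extra data the script would have collected.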
Depending on the tracking service, they might allow the client to copy the tracking script to their own server (which gets around the cookie issue). That script can then download a supplemental script from the tracking server, passing parameters in the URL; the returned script carries an ID number which is saved in a cookie created by the JavaScript, and that ID can be referenced on future visits. On further requests it can then download 1x1 images, using the tracking ID already supplied plus any additional information added as parameters (rough sketch below).
This is just off the top of my head, and there are probably some logic errors, but I think I've described the process I have seen done in the past. Hope this helps some.
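A very rough sketch of that flow (all the names are made up, and it's simplified so the ID is generated in the browser rather than handed back by a supplemental script). Because the tag runs under the client's own domain, the cookie it sets is a first-party cookie:

<script type="text/javascript">
  // Read an existing visitor id, or create one and store it in a
  // first-party cookie set by JavaScript on the client's own domain.
  function readCookie(name) {
    var m = document.cookie.match(new RegExp("(?:^|; )" + name + "=([^;]*)"));
    return m ? m[1] : null;
  }
  var visitorId = readCookie("trkid");
  if (!visitorId) {
    visitorId = new Date().getTime() + "." + Math.floor(Math.random() * 100000);
    var exp = new Date();
    exp.setFullYear(exp.getFullYear() + 1);
    document.cookie = "trkid=" + visitorId + "; expires=" + exp.toGMTString() + "; path=/";
  }
  // Report the visit to the central tracking server as a 1x1 image request,
  // carrying the id and any extra data as URL parameters.
  var px = new Image(1, 1);
  px.src = "http://tracking.example.com/px?client=exampleA&id=" + visitorId
         + "&url=" + encodeURIComponent(window.location.href);
</script>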
As far as I am aware Urchin needs to be installed on the server hosting the web site so that it can access the logs. This is really not an option for us, as we need to track hundreds of sites, each hosted in a completely different environment.
Urchin can access remote logs and import them as well... you just have to specify the location, then it builds its own logs locally.
Thanks for the reply. We really need a solution that does not require web server logs. We are talking about hundreds of clients, and each one runs their website in a completely different way. It would cost us a lot more in development effort to build something that imports each client's logs automatically and reliably than to build a tracking solution from scratch.
Personally, I prefer using web server logs, but that is because I have a lot of experience with them.
Are you sure that importing the web server logs from all of your clients isn't the best approach? If you stop and think about it, no matter what method you use, there are going to be a huge number of visit records from each of your hundreds of clients. At least with web server logs you only have to worry about getting the data to your company and crunching it, whereas with the other approaches you have to do that and make sure that every one of your clients' pages is tagged appropriately. You're also dependent on the visitors' browsers running JavaScript (and not having it disabled).
The only reason I can see for not using web server logs is if most of them do not use a typical log format (or one that can be easily converted into a common format).
We faced a similar problem around a year ago. Our programmers were able to get around some of the problems.
But to begin with, there are lots of issues you need to consider. I can understand what you are thinking (we also wanted to track ROI etc. for our clients' SEO, PPC, and SEO vs PPC).
1) Make a note of all requirements. Get a good database programmer to design the database architecture from scratch. (In our case the final database design was totally different from what we began with, because with each module we realized that we needed more "features".)
2) There are two approaches: hosted vs log analysis. You are looking at the hosted option (to consolidate all client results in one single interface). The problem is that your server has to be up all the time; if it's down for a minute, your data becomes inaccurate.
3) Bandwidth. Be prepared for high bandwidth and database size. We were amazed at how big the database got. Throw in a few websites that have more than 3,000 visitors a day and you will understand what I mean.
4) You cannot track robots with the hosted option (at least we were unable to track robots, unless you install scripts on the server itself, which is next to impossible on client websites).
5) We had a good traffic analysis solution, but after that we did not get time to do the ROI analysis modules. You are right about the cookies problem.
If you are committed to your own analysis software, then yes, you can spend the time and effort on this project. Otherwise, pay $10 or so per account to a specialist "traffic analysis company" and offer a private-label service to your clients.
The pros of this option: fewer headaches, and you can still promote your own brand. No server, database or bandwidth problems; all of that is hosted by the third party.
The con of this option is that you cannot decide what future "features" you want.
For us, the log files are not a problem, since most of our clients have access to their log files. The problem is adding script code to the site to write out a transaction log file for ecommerce tracking.
The nice thing about Urchin is that you can create custom reports.
Instead of tagging the pages with JavaScript, tag them with PHP, ASP, etc. Then you can set local cookies while still logging remotely.
Another approach would be to do everything locally, then use a crontab to push the data to the logging server.
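For example, a nightly crontab entry on each client's server, something like this (the script name is just hypothetical; it would ship the previous day's data to the central server):

# push yesterday's tracking data to the central logging server at 02:15 every night
15 2 * * * /usr/local/bin/push_tracking_data.sh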
Cookies are unreliable enough without having to worry about the vagaries of third party issues.