Here's an example: MSN has an "origq" term in its query string. It contains what the visitor searched for before the search that led to your site. In other words, a referer from MSN shows not only what your visitor searched for, but what he searched for before that. I get all kinds of useful info out of this, like the ways people commonly misspell words. I might see that a few visitors have changed a search word from picture to photo. Then I know that I should have the word picture in that page. Is there a log analysis product that can do that?
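A rough sketch of how the "origq" comparison could be scripted, in Python. The "origq" name comes from the post above; the "q" parameter for the current search and the sample referer URL are assumptions for illustration, so check them against your own logs.

```python
from urllib.parse import urlsplit, parse_qs

def msn_search_terms(referer):
    """Return (previous_query, current_query) from an MSN referer URL.

    Assumes the current search is in "q" and the previous one in "origq".
    Either value is None when the parameter is missing.
    """
    params = parse_qs(urlsplit(referer).query)
    current = params.get("q", [None])[0]
    previous = params.get("origq", [None])[0]
    return previous, current

def changed_words(previous, current):
    """Return (words dropped, words added) between the two searches."""
    before = set(previous.lower().split())
    after = set(current.lower().split())
    return sorted(before - after), sorted(after - before)

# Hypothetical referer, made up for this example.
referer = "http://search.msn.com/results.aspx?q=sunset+photo&origq=sunset+picture"
previous, current = msn_search_terms(referer)
print(previous, "->", current)
print(changed_words(previous, current))
```

Run over a whole log, the "dropped" words are the ones worth considering for the page text.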
Here's another example: Of course, Google is not about to log in to my site like my visitors have to. But I still want Google to index my pages. So I have special code to let Google through. After picking through my logs, I figured out that people were reading my stories for free by using Google's cache. I can see this because sometimes the referer to my images comes from Google instead of one of my own pages. If I follow the referer, I see the cached version.
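Something like the following Python sketch could pull those cache hits out of a log automatically. It assumes the combined log format and assumes that a cache referer is a Google URL whose "q" value starts with "cache:"; the sample lines are invented, so adjust both assumptions to what your own logs show.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Combined log format: ip ident user [time] "request" status bytes "referer" "agent"
LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+ "([^"]*)" "([^"]*)"'
)
IMAGE = re.compile(r"\.(?:jpe?g|gif|png)$", re.IGNORECASE)

def is_google_cache(referer):
    """True if the referer looks like a Google cache page (assumed URL shape)."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    if not host.startswith(("www.google.", "google.")):
        return False
    query = parse_qs(parts.query).get("q", [""])[0]
    return query.startswith("cache:")

def cache_image_hits(lines):
    """Yield (ip, path, referer) for image requests referred by the cache."""
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        ip, when, method, path, status, referer, agent = match.groups()
        if IMAGE.search(urlsplit(path).path) and is_google_cache(referer):
            yield ip, path, referer

# Invented sample lines: one cache referer, one normal on-site referer.
sample = [
    '1.2.3.4 - - [22/Sep/2002:10:00:00 +0000] "GET /img/story1.jpg HTTP/1.0" 200 5120 '
    '"http://www.google.com/search?q=cache:abc123:www.example.com/story1.html" "Mozilla/4.0"',
    '5.6.7.8 - - [22/Sep/2002:10:01:00 +0000] "GET /img/story1.jpg HTTP/1.0" 200 5120 '
    '"http://www.example.com/story1.html" "Mozilla/4.0"',
]
for hit in cache_image_hits(sample):
    print(hit)
```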
As for your problem with the Google cache, try using this tag:
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
After all, log analysers are just glorified spreadsheets.
(Disclosure: I made a product out of mine but I know I'm not supposed to sell here. So I won't say what it is. I promise to be good. I think I'm going to like it here.)
Just run a filter for lines including the string ".jpg" or ".gif" or ".ico" or ".js" or whatever other extensions are superfluous to your analysis, and then delete or exclude them. That will reduce your file size by about 70% to 95%, depending on how many images/scripts you have per page.
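A minimal Python sketch of that kind of filter, assuming common or combined log format where the request sits between the first pair of double quotes. The function names and the extension list are only illustrative. It checks the requested path rather than the whole line, so a page request whose referer happens to contain ".gif" is not thrown away.

```python
# Extensions treated as noise; extend to suit your own analysis.
SKIP_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".ico", ".js")

def request_path(line):
    """Return the requested path (without query string), or None if unparseable."""
    try:
        request = line.split('"')[1]           # e.g. 'GET /logo.gif HTTP/1.0'
        return request.split()[1].split("?")[0]
    except IndexError:
        return None

def keep(line):
    """Keep unparseable lines and anything that is not an image/script request."""
    path = request_path(line)
    return path is None or not path.lower().endswith(SKIP_EXTENSIONS)

def filter_log(src_path, dst_path):
    """Copy the kept lines from src_path to dst_path; return (kept, total)."""
    kept = total = 0
    with open(src_path, encoding="latin-1") as src, \
         open(dst_path, "w", encoding="latin-1") as dst:
        for line in src:
            total += 1
            if keep(line):
                dst.write(line)
                kept += 1
    return kept, total
```

Calling `filter_log("access.log", "pages.log")` (hypothetical file names) writes the reduced file and returns the two counts, so you can see what reduction you actually get.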
I'm really surprised there aren't more apps out there. I've been looking for programs for a while and have not found anything that great. The homemade scripts work well for me with Access or SQL, as they tend to just log the visits/uniques, which is all I really want, and I can run SQL queries to find out more info. The problem is the size of the data, and the referrals take some reading, as the phrases are just stuck in with all the other stuff.
Personally I don't find WebTrends very useful at all. It counts every single server request and can therefore be quite misleading about how busy the site really is.
To handle bigger files, I have started to write a program in C++ to load referer text files into. One of the hardest parts is working out the rules for parsing each of the main search engines' query strings. Do you have some info on this?
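One way to keep those rules manageable is a small table of host fragment and query parameter per engine. The sketch below is in Python rather than C++ just to keep it short. The parameter names listed are assumptions from memory, not an authoritative list, so verify each against real referers in your own logs before relying on it.

```python
from urllib.parse import urlsplit, parse_qs

# (host fragment, query parameter) pairs. Assumed values; verify against your logs.
ENGINE_RULES = [
    ("google.", "q"),
    ("yahoo.", "p"),
    ("msn.", "q"),
    ("altavista.", "q"),
]

def search_phrase(referer):
    """Return (engine, phrase) for a search engine referer, else None."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    for marker, param in ENGINE_RULES:
        if marker in host:
            values = parse_qs(parts.query).get(param)
            if not values:
                return None
            # Lowercase and collapse whitespace so equal phrases compare equal.
            return marker.rstrip("."), " ".join(values[0].lower().split())
    return None

print(search_phrase("http://search.yahoo.com/search?p=Log+Analysis"))
print(search_phrase("http://www.example.com/page.html"))
```

Adding an engine is then one more line in the table instead of another branch in the parser.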
The 'origq' is something to add to the list now :) thanks
I have a ton of data going back 3 years, and hopefully when it's done it will give me a concordance of referrals that I can process and that will show me really neat stuff like:
- Frequency of individual words, phrases
- Difference between search engines
- maybe even help me to guess which words to target :)
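The counting side of such a concordance can be sketched in a few lines of Python. This assumes the search phrases have already been extracted into (engine, phrase) pairs; the function name and the sample data are made up for illustration.

```python
from collections import Counter, defaultdict

def concordance(searches):
    """Count words, whole phrases, and phrases per engine.

    searches: iterable of (engine, phrase) pairs.
    """
    words = Counter()
    phrases = Counter()
    by_engine = defaultdict(Counter)
    for engine, phrase in searches:
        phrase = " ".join(phrase.lower().split())   # normalise case and spacing
        phrases[phrase] += 1
        by_engine[engine][phrase] += 1
        words.update(phrase.split())
    return words, phrases, by_engine

# Invented sample data.
searches = [
    ("google", "sunset photo"),
    ("msn", "sunset picture"),
    ("google", "Sunset  Photo"),
]
words, phrases, by_engine = concordance(searches)
print(words.most_common(2))
print(phrases.most_common(1))
print(dict(by_engine["msn"]))
```

Comparing `by_engine["google"]` with `by_engine["msn"]` gives the difference between engines; words that are frequent in searches but missing from a page are candidates to target.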
You can get an idea of which keywords, intended or not, are effective for a specific page.
So I get something like:
visitor session ip request agent referer time
After that, queries, views and stored procedures give me more info than all the analysis packages in the world.
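As a rough illustration of that approach, here is a Python sketch using the built-in sqlite3 module. The column names follow the list above; the table name, column types, view and sample rows are assumptions for the example. SQLite has no stored procedures, so only queries and a view are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE log (
        visitor TEXT, session TEXT, ip TEXT, request TEXT,
        agent TEXT, referer TEXT, time TEXT
    )
""")

# Invented sample rows.
rows = [
    ("v1", "s1", "1.2.3.4", "/story1.html", "Mozilla/4.0",
     "http://www.google.com/search?q=sunset+photo", "2002-09-22 10:00:00"),
    ("v2", "s2", "5.6.7.8", "/story1.html", "Mozilla/4.0",
     "http://www.google.com/search?q=sunset+photo", "2002-09-22 10:05:00"),
    ("v2", "s2", "5.6.7.8", "/story2.html", "Mozilla/4.0",
     "http://www.example.com/story1.html", "2002-09-22 10:06:00"),
]
conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# A view that counts how often each referer led to each page.
conn.execute("""
    CREATE VIEW referers_per_page AS
    SELECT request, referer, COUNT(*) AS n
    FROM log
    WHERE referer <> ''
    GROUP BY request, referer
""")

top = conn.execute(
    "SELECT request, referer, n FROM referers_per_page "
    "ORDER BY n DESC, request, referer"
).fetchall()
uniques = conn.execute("SELECT COUNT(DISTINCT visitor) FROM log").fetchone()[0]

for row in top:
    print(row)
print("unique visitors:", uniques)
```

The same table works in Access or any SQL server; only the connection line and the view syntax would need adjusting.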