- When a spider visits my websites, which pages are visited per 'session'
- per page referral overview
- per page overview of keywords used in the SE
- per page overview of all known referrals
- sql query interface with some general pre-made scripts & custom scripts
What features would you, hard-working SEOs, want to make your work easier? ;-)
- sql query interface
Have you tried using the Apache mod_perl module DBI::Logger? It writes your logs to an SQL database (I use MySQL) so that you can pull arbitrary stuff from it using SQL.
I have been using it (or a similar perl module) for about a year with great results. I can pull all kinds of information out of it that you can't usually get without some really creative uses of grep, awk, sed, cut and all those other text processing commands.
I know that it's not a direct answer to your question, but if that's one thing you're looking for...
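To give an idea of what "pulling arbitrary stuff with SQL" can look like, here is a minimal sketch in Python. It uses an in-memory SQLite database instead of MySQL so it runs anywhere. The `access_log` table, its columns, and the `per_page_referrals` helper are made up for illustration and are not the layout that any particular logging module writes.

```python
import sqlite3


def per_page_referrals(conn):
    """Return (uri, referer, hits) rows, busiest combination first."""
    return conn.execute(
        """
        SELECT uri, referer, COUNT(*) AS hits
        FROM access_log
        WHERE referer <> ''
        GROUP BY uri, referer
        ORDER BY hits DESC, uri, referer
        """
    ).fetchall()


# Hypothetical schema: one row per request.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_log "
    "(ts TEXT, remote_host TEXT, uri TEXT, referer TEXT, user_agent TEXT)"
)

# Sample rows invented for the example.
rows = [
    ("2003-01-01 10:00:00", "10.0.0.1", "/widgets.html",
     "http://www.google.com/search?q=blue+widgets", "Mozilla/4.0"),
    ("2003-01-01 10:05:00", "10.0.0.2", "/widgets.html",
     "http://www.google.com/search?q=blue+widgets", "Mozilla/4.0"),
    ("2003-01-01 10:06:00", "10.0.0.3", "/index.html",
     "http://example.com/links.html", "Mozilla/4.0"),
    ("2003-01-01 10:07:00", "10.0.0.4", "/index.html", "", "Googlebot/2.1"),
]
conn.executemany("INSERT INTO access_log VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

for uri, referer, hits in per_page_referrals(conn):
    print(hits, uri, referer)
```

The same `GROUP BY` idea covers the "per page referral overview" from the wish list; swapping the grouped column gives the other per-page overviews.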
I've been on the lookout for a high-level tracking tool that would let us, as SEOs, pull raw log files (by download, I guess) from client sites hosted on various servers. They are not our own servers: we are not a development firm, and clients usually hire us after the site is built and hosted. I have no idea what resources it would take to do that with as many as 25-50 clients at any given time.
I want something that will track search engine to search phrase (to a specific page) to sale, show visitor paths through the website and include bail-out points, accurately separate 'visitor sessions' from spiders, allow me to choose various time/date ranges, and generate easily readable reports to show clients via an extranet or something.
I'm sure I had more requirements, but it's late in the day, and that'll give you something to chew on for now.
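A rough sketch of three of those pieces: pulling the search phrase out of a referrer, telling spiders from visitors, and cutting hits into sessions. Everything here is an assumption made for illustration. The marker list and query parameter names are incomplete guesses, the 30-minute timeout is just a common convention, and the function names are invented.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical and far from complete: substrings that suggest a spider.
SPIDER_MARKERS = ("googlebot", "slurp", "crawler", "spider")
# Hypothetical: query parameters that search engines use for the phrase.
QUERY_PARAMS = ("q", "p", "query")


def is_spider(user_agent):
    """Guess from the User-Agent string whether the hit came from a spider."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SPIDER_MARKERS)


def search_phrase(referer):
    """Return the search phrase found in a referrer URL, or None."""
    query = parse_qs(urlparse(referer).query)
    for name in QUERY_PARAMS:
        if name in query:
            return query[name][0]
    return None


def split_sessions(hits, timeout=1800):
    """Group (visitor_id, unix_time, uri) hits into sessions.

    A new session starts when a visitor has been idle for more than
    `timeout` seconds. Returns a list of (visitor_id, [uri, ...]).
    """
    sessions = []
    open_sessions = {}  # visitor_id -> (time of last hit, page list)
    for visitor, ts, uri in sorted(hits, key=lambda hit: hit[1]):
        last = open_sessions.get(visitor)
        if last is None or ts - last[0] > timeout:
            pages = []
            sessions.append((visitor, pages))
        else:
            pages = last[1]
        pages.append(uri)
        open_sessions[visitor] = (ts, pages)
    return sessions
```

The last page in each session's list is the bail-out point, and the list itself is the visitor path. Tying a session to a sale would need an order record to join against, which this sketch does not attempt.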
Unfortunately, if there were an individual with such programming skills as to create such an intelligent system, many of us would be out of a job.
The system would then automagically adjust all our services to the individual's best and most desired needs, making whatever we sell irresistible. People would pay $20 for a single paper clip...
I CAN build it (I'm a senior web application programmer). But when brainstorming I've learned not to think about the actual technical implementation; it slows down the creativity ;-) I'm more interested in great ideas at this stage.
ggrot:
Great idea! I don't think you would be out of a job. It's a great tool for a great human mind. The initial keywords should be chosen by the SEO. Keywords that visitors used at Google but that are not in the initial lists are saved by the system and tried out. That can produce 'dirty' keywords (completely off-theme), so the SEO has to monitor new keywords. If your website doesn't show up for certain keywords that aren't in your list, the system can never discover them (do you have an idea how to accomplish this?). But it would still be a very handy tool.
Great ideas, people! Could still use a lot more input. Dream on! ;-)
please include:
automatically next to every keyphrase logged:
1. Overture and Wordtracker # search stats per month of that keyphrase
2. SERP position within the search engine that gave you that referral for that specific search phrase.
and I agree with Roland - make it live!
The problem with live statistics is load scaling. Imagine you are running amazon.com and every request is sent to a secondary database server to log the traffic. This is fine and dandy if your server(s) can handle the requests, which is reasonable to assume.
But then assume that you want to run a report that requires a summary of all the file requests in the database. You want a snapshot taken at an instant close to the time you submitted the request. Thus, once you start processing the report, you have to lock the database for the duration of that request. Since your report may easily have to read the full range of values a few times (which would be billions of requests), it is logical to assume this could take a few seconds, during which no other process would be able to access the database. If you interleave this process with updates, you will have a corrupt report.
This is not necessarily a problem by itself. If you have a good queueing system, as most modern databases do, the logging requests can be handled after report generation has completed. The problem lies in what else the database is used for. In Amazon's case, it displays products you have browsed before along with related products. To do this, the scripts request data from the database and, because of the backlog, might wait 30-40 seconds before they can generate the page. You just lost 100 customers in that minute.
There are ways around all of this, but they are generally costly in hardware terms.
For each new referrer add a new entry in a table in memory.
Each time you get a new visitor from this referrer, increment the counter.
At the end of some timescale (minute, hour, day, week, month, year)
save the result so you don't have to compute it a second time and scroll the graphics...
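The steps above could be sketched roughly like this in Python. The `ReferrerTally` class and its method names are made up for the example, the "save" step only keeps the result in a dict rather than writing it to disk or a database, and it assumes hits arrive in time order.

```python
from collections import Counter


class ReferrerTally:
    """Count visitors per referrer in memory, one time bucket at a time."""

    def __init__(self, period_seconds=3600):
        self.period = period_seconds
        self.current_bucket = None  # start time of the bucket being counted
        self.counts = Counter()     # referrer -> visitors in the current bucket
        self.saved = {}             # bucket start -> finished counts

    def record(self, referrer, now):
        """Register one visitor from `referrer` at unix time `now`."""
        bucket = int(now // self.period) * self.period
        if self.current_bucket is not None and bucket != self.current_bucket:
            self.flush()  # the previous period is over, so store its totals
        self.current_bucket = bucket
        self.counts[referrer] += 1

    def flush(self):
        """Save the current bucket so it never has to be recomputed."""
        if self.counts:
            self.saved[self.current_bucket] = dict(self.counts)
        self.counts = Counter()


tally = ReferrerTally(period_seconds=60)
tally.record("google", 0)
tally.record("google", 30)
tally.record("yahoo", 59)
tally.record("google", 61)  # new minute: the first bucket gets saved
print(tally.saved)
```

Reports for finished periods then read only from the saved totals, so they never touch the live counters.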
Real-time warnings about errors, processor overload, spiders that are indexing, etc. are a different matter. I'd rather have an intelligent 'PUSH' system ;-) That minimizes the time I spend
The live database uses a synch routine to update a secondary database. When a report is requested, the 2nd d/base locks out, and runs the report. When the report is done, the synch routine will be able to tell where the last update was, and synch the d/bases up again (maybe use a timestamp to control the synching?)
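A minimal sketch of that synch routine, using two in-memory SQLite databases to stand in for the live and reporting copies. One deliberate swap: it uses an auto-incrementing row id as the "where was the last update" marker instead of a timestamp, because two hits can share the same timestamp and one of them could be skipped. The `hits` table and the `sync` function are invented for the example.

```python
import sqlite3

# Hypothetical schema shared by both copies.
SCHEMA = "CREATE TABLE hits (id INTEGER PRIMARY KEY, ts TEXT, uri TEXT)"


def sync(live, report):
    """Copy rows the reporting copy has not seen yet; return how many."""
    last_id = report.execute(
        "SELECT COALESCE(MAX(id), 0) FROM hits"
    ).fetchone()[0]
    new_rows = live.execute(
        "SELECT id, ts, uri FROM hits WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    report.executemany(
        "INSERT INTO hits (id, ts, uri) VALUES (?, ?, ?)", new_rows
    )
    report.commit()
    return len(new_rows)


live = sqlite3.connect(":memory:")
report = sqlite3.connect(":memory:")
live.execute(SCHEMA)
report.execute(SCHEMA)

live.executemany(
    "INSERT INTO hits (ts, uri) VALUES (?, ?)",
    [("2003-01-01 10:00:00", "/"), ("2003-01-01 10:00:05", "/widgets.html")],
)
live.commit()
first = sync(live, report)

# While a report runs against `report`, the live copy keeps logging.
live.execute(
    "INSERT INTO hits (ts, uri) VALUES (?, ?)",
    ("2003-01-01 10:00:09", "/about.html"),
)
live.commit()
report_total = report.execute("SELECT COUNT(*) FROM hits").fetchone()[0]

# Report finished: catch up from the last copied id.
second = sync(live, report)
print(first, report_total, second)
```

The report only ever sees the rows copied before it started, and the live copy is never locked by it.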