So far, this has worked well. To find the number of page impressions, I have the program count the number of IP addresses in the counter file. To find the number of visitors, I go by unique IP addresses.
I've decided that I would like more functionality, like being able to pull up certain dates, display graphics showing traffic trends, etc., so I've converted all of my flat files to MySQL. The new database currently has over 100,000 records and will grow by one record for each page view of the site. With this method, I can now create just about any script that gives me the information I want.
Unfortunately, a simple lookup seems to take longer than I'd like. For instance, if I want to know the number of impressions for today, I would use this query:
SELECT * FROM Table WHERE Date = '$Date';
This takes about one second, which I can live with (I can't use COUNT(), you'll see why in a minute, but performance-wise it doesn't seem to matter anyway). But if I want to know the number of impressions from $StartDate through $EndDate broken out by day, I have to make multiple queries, one for each day. So if I wanted to find the amount of traffic from day 1 through day 30 of a given month, it would take about 30 seconds to retrieve the data. It would take about the same amount of time to retrieve the data for a graph showing traffic trends throughout the month, and considerably longer to generate a graph covering a year or more. Then it gets more interesting: I have to compare each record's user agent to make sure it's a real visitor (I don't count bots in my traffic).
Is this normal, or is there some magic query that I'm not thinking of? It doesn't seem that the number of records returned has anything to do with it, because I used Date and Time to narrow my search down to a single record, and the return time was still 1.10 seconds.
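One candidate for that "magic query" is GROUP BY, which collapses the per-day loop into a single round trip. A minimal sketch, using Python's sqlite3 as a stand-in for MySQL (the GROUP BY syntax is the same in both); the table name `Hits`, the column names, and the `'%bot%'` filter are assumptions, not the poster's actual schema:

```python
import sqlite3

# In-memory sqlite3 database standing in for the MySQL traffic table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hits (Date TEXT, UserAgent TEXT)")
conn.executemany(
    "INSERT INTO Hits VALUES (?, ?)",
    [("2004-06-01", "Mozilla/4.0"),
     ("2004-06-01", "Googlebot/2.1"),
     ("2004-06-02", "Mozilla/4.0"),
     ("2004-06-02", "Mozilla/5.0")],
)

# One query for the whole date range instead of one query per day.
# If the bot test can be expressed as a SQL predicate, it runs
# server-side too, so COUNT() becomes usable after all.
rows = conn.execute(
    """SELECT Date, COUNT(*)
       FROM Hits
       WHERE Date BETWEEN ? AND ?
         AND UserAgent NOT LIKE '%bot%'
       GROUP BY Date
       ORDER BY Date""",
    ("2004-06-01", "2004-06-30"),
).fetchall()
print(rows)  # [('2004-06-01', 1), ('2004-06-02', 2)]
```

If the bot check is too complex for a LIKE pattern, the same shape still helps: fetch the range in one query and do the per-row filtering in the application, rather than issuing thirty separate queries.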
This being the first index that I've created (other than a primary key), what disadvantages does having an index cause? If I wanted to search by other fields as well (i.e. user agent), would it be a good idea to create indexes using all of those fields? Even though this would increase my search performance, will it cause other performance hits elsewhere, or take considerably more space?
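For what it's worth, a composite index can cover several of those query shapes at once, because the leftmost column of a multi-column index is usable on its own. A sketch, again using sqlite3 as a stand-in for MySQL (where the equivalent would be CREATE INDEX or ALTER TABLE ... ADD INDEX); all names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hits (Date TEXT, UserAgent TEXT)")

# A composite index on (Date, UserAgent) serves queries that filter
# on both columns AND queries that filter on Date alone, since the
# leftmost column can be used by itself.
conn.execute("CREATE INDEX idx_date_agent ON Hits (Date, UserAgent)")

# EXPLAIN QUERY PLAN (sqlite) / EXPLAIN (MySQL) shows whether the
# optimizer actually picks the index for a given query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT COUNT(*) FROM Hits WHERE Date = '2004-06-01'"
).fetchall()
print(plan[0][-1])
```

Checking the plan with EXPLAIN before and after adding an index is the quickest way to see whether it will pay for its insert-time overhead.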
While each site visitor will only have a tiny degradation of performance in return for your great increase in efficiency, IMHO it does not seem like a good tradeoff to decrease the performance of site visitors to increase the performance of admin tools.
After a quick test, MySQL timed an insert operation at 0.00 seconds, so the index doesn't seem to add much overhead. An update test took the same amount of time.
<While each site visitor will only have a tiny degradation of performance in return for your great increase in efficiency, IMHO it does not seem like a good tradeoff to decrease the performance of site visitors to increase the performance of admin tools.>
I agree, but it doesn't look like there will be any noticeable degradation for inserts or updates, so I think I'll be okay.
Yes. You can have any number of indexes and it's best to have indexes for the types of queries you commonly perform. Check out my post in this thread [webmasterworld.com] for more on that.
>> on the negative side it increases the overhead on inserts and updates
That's true. So what would be the best way to build an app like this without losing performance on either end? According to the MySQL manual, LOAD DATA INFILE is the fastest way to insert, so I would guess that logging to a flat file and periodically importing it into the database would be the best bet. Any other ideas?
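The log-then-bulk-import idea above can be sketched as two functions: one called on every page view that only appends a line to a flat file (no database round trip), and one run periodically (e.g. from cron) that loads the accumulated batch. This uses sqlite3's executemany as a stand-in for MySQL's LOAD DATA INFILE; the file name, table name, and column names are all assumptions:

```python
import csv
import sqlite3

LOG_FILE = "hits.log"

def log_hit(date, user_agent):
    # Called on every page view: one appended line, no DB round trip.
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow([date, user_agent])

def import_log(conn):
    # Called periodically: read the batch and insert it in one shot.
    # In MySQL this step would instead be:
    #   LOAD DATA INFILE 'hits.log' INTO TABLE Hits
    #   FIELDS TERMINATED BY '\t';
    with open(LOG_FILE, newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    conn.executemany(
        "INSERT INTO Hits (Date, UserAgent) VALUES (?, ?)", rows
    )
    open(LOG_FILE, "w").close()  # truncate after a successful import
    return len(rows)

# Demo: start with an empty log, record two hits, import the batch.
open(LOG_FILE, "w").close()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hits (Date TEXT, UserAgent TEXT)")
log_hit("2004-06-01", "Mozilla/4.0")
log_hit("2004-06-01", "Mozilla/5.0")
n = import_log(conn)
print(n)  # 2
```

One caveat with this pattern: reports run against the database lag behind real traffic by up to one import interval, and the truncate step should really be made safe against hits logged mid-import (e.g. by renaming the file before loading it).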