Forum Moderators: coopster
I'm doing a hits counter... basically tracks the hit, the referer and date.
The problem is that I want to show results not only for urls, but for domains only, so group for example all referers from google.com and add the hits they all produced.
I can't imagine a way other than populating a list of results and then parse them to build a list of domains and add the hits.
Is there a way to construct a mySQL query to take only the domain names from the urls in database addind the hits? may be I do ask too much for a query hehehe..
thanks.
SELECT * FROM tablename WHERE referrer RLIKE 'http:\/\/.*google\.[^\.]*\/.*'; This would match:
[google.com...]
[google.de...]
BUT NOT
[google.example.com...]
As, mykel79 stated, a regular expression is one option, another may be the MySQL SUBSTRING_INDEX [dev.mysql.com]function. Of course, don't forget to COUNT() [dev.mysql.com] your hits. I'm not sure how you are storing your information, but here is an example to get you started.
SELECT
SUBSTRING_INDEX(referer, '.', -2) AS site_referer,
COUNT(*) AS hits
FROM table
GROUP BY SUBSTRING_INDEX(referer, '.', -2)
;
I have the following table
id, page_id, unique_hits, raw_hits, referer, hits_date
The script is called by an img with the pageid and referer as params and it adds the refering page to the database.
What I want to do is to be able to show the stats in two ways, one the normal with hits stats and refering url, and the other with just hits and domain names.
So that way I do show how many hits the page has received from each domain, with a sum of the hits of all pages in a domain.
That's the thing I do not see clear if possible, because the query would need to return a list of domains and hits... thinking now, this is way too much for a query...
The sollution I do think now is to fisrt do a normal query, and then get a list of domains from the url of results and then do a new query for each domain name to get the amount of hits for that domain. But this is perhaps a lot of queries... say 20 results per page = 1 + 20 queries.
Any other ideas about doing this more efficiently?
Thanks all!
* Your original question asked for a way to split out the domain name. But another post sounds like you want the server or host name. So, do you want to differentiate between ww1.example.com and ww2.example.com, or do you want hits from both referrers lumped together?
I keep the full unchanged referer in it's own col then usually keep domain or ay other data I may want to analyze seperate as well.
I usually do it in multiple steps.
I start by storing raw referer data in a file or initial db.
I usually move it to permanent storage from there doing some frontline analysis that will speed up queries.
I do put some duplicate data in there but it is only for speed of queries. There will eventually be tons of data so I like to split it up a bit.
That leaves the front end tracking able to just dump the data and keep rolling for front end speed. The processing is usually done by cron which can take its time and do the number crunching and comparison. Then the reports can be run from the pre analyzed data.
You can break up the full referer into as many fields as you need maybe even into domain, tld, string or whatever else you need and store them for ease of reporting.
Good luck!
Sometime name1.site.com/... is a totally different sites from name2.site.com/... whenever the domain is the same... This situation occurs in some free hosts.
Perhaps I would have a query to extract just domain name and another for the host name. That way hits comming from name1.site.com/page1.htm and name1.site.com/page2.htm will get counted into name1.site.com also have the other query for just the domain name.
Here is a nice query for the domain name I found searching but not tested yet...
[experts-exchange.com...]
Also, right, I should at least remove the http:// part of the referers, as all are to be same (but bookmark and weird referer headers from software)
thanks for all suggestions!