Forum Moderators: coopster

Exclude bots/adsense/search engines from making a query

         

Sandro87

3:14 pm on Feb 12, 2010 (gmt 0)

10+ Year Member



Hello,
I want to count visits to a page. How can I exclude bots and similar traffic? For example, if there are AdSense ads on the page, a single visit from me is counted as 2 visits.
As suggested to me in an older topic here, I put in a tag like <img src="countvisits.php?id=1" />. The script runs the query and outputs a JPEG header plus the JPEG file, and the browser should cache it?
Is this supposed to be cached?
The query is run every time :(.
I was also wondering whether there is a way to exclude refreshes and repeat visits without using a DB to check the IPs.


Also, is there a condition in PHP that I could use to exclude anything that is not a person? Do bots/engines use a desktop browser?

DailyAmerican

10:57 pm on Feb 12, 2010 (gmt 0)

10+ Year Member



Why not use google analytics?

Sandro87

11:04 pm on Feb 12, 2010 (gmt 0)

10+ Year Member



I need to count hits for the content on my site and save them in the DB.

DailyAmerican

11:35 pm on Feb 12, 2010 (gmt 0)

10+ Year Member



$botlist = array("Teoma","alexa","froogle","inktomi","looksmart", "URL_Spider_SQL","Firefly","NationalDirectory","Ask Jeeves","TECNOSEEK","InfoSeek","WebFindBot", "girafabot","crawler","www.galaxy.com", "Googlebot","Scooter","Slurp","appie", "FAST","WebBug","Spade","ZyBorg","rabaz");

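// Check the user agent against each name in the list; count the hit only if none matches.
foreach ($botlist as $bot) { if (ereg($bot, $_SERVER['HTTP_USER_AGENT'])) { $is_bot = true; } }
if (empty($is_bot)) { /* add hit */ }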


For the user portion, why not set a cookie? Then you can tell from the cookie whether they're a returning visitor. You could probably also reduce the query time by passing values from the cookie into the query.
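A rough sketch of the cookie idea, in case it helps. The cookie name seen_items, the helper register_view() and the 30-minute lifetime are all made up for illustration; the cookie is assumed to hold a comma-separated list of item IDs already counted for this visitor.

```php
<?php
// Hypothetical helper: decide whether this item view is new for the visitor.
// $cookieValue is the raw value of an assumed "seen_items" cookie ('' if unset).
function register_view($cookieValue, $itemId)
{
    $seen = ($cookieValue === '') ? array() : explode(',', $cookieValue);
    if (in_array((string) $itemId, $seen, true)) {
        return array(false, $cookieValue);   // already counted, no query needed
    }
    $seen[] = (string) $itemId;
    return array(true, implode(',', $seen)); // count it and remember the ID
}

// Usage in the page (setcookie must run before any output is sent):
// $old = isset($_COOKIE['seen_items']) ? $_COOKIE['seen_items'] : '';
// list($isNew, $value) = register_view($old, 1);
// if ($isNew) { setcookie('seen_items', $value, time() + 1800); /* add hit */ }
```

Bear in mind that a visitor can delete or edit the cookie, so this only filters casual refreshes.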

[edited by: jatar_k at 2:16 pm (utc) on Feb 15, 2010]
[edit reason] fixed sidescroll [/edit]

Sandro87

12:01 am on Feb 13, 2010 (gmt 0)

10+ Year Member



is that an official list?

The problem with the cookie is that if the visitor views several items, there are several item IDs to exclude, so I would end up setting several cookies or one cookie that is too long.
Maybe the IP control is the only way to go?
Do you know anything about image caching?

DailyAmerican

2:06 pm on Feb 15, 2010 (gmt 0)

10+ Year Member



That's not a full list, but it should be enough to get you started.

IP control might be the only way, but at the same time one IP that you're counting could be used by multiple people in a company such as mine, where the outside world only sees the one IP but we have sub-IPs behind it. So, if you count an IP as a content hit, you may be miscalculating how many times your content has been viewed.

I would set up some analytics software such as Google Analytics on your site. Then compare the content views Google reports to what you are getting. If the results are not too far apart, then stick with your system. If the numbers are way off, then you may want to look into using Google Analytics. You can have it e-mail you a CSV file of the hits on your content, then use PHP to parse the CSV file and import that into your db as an alternative.

Not sure about the image caching. I know Nielsen uses images in its analytics scripting for mobile sites. That way, if they see that the image is cached, they know that it's a returning visitor.
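One thing that might be worth trying, as an untested sketch: have the counting script send caching headers, so that the browser is allowed to reuse the image for a while instead of requesting it (and running the query) again. The helper name pixel_cache_headers(), the file name pixel.jpg and the 1800-second lifetime are assumptions for the example, and browsers are not obliged to honour the headers, for instance on a forced reload.

```php
<?php
// Hypothetical helper: headers that let the browser reuse the image for $maxAge seconds.
// $now is a Unix timestamp, normally time().
function pixel_cache_headers($maxAge, $now)
{
    return array(
        'Content-Type: image/jpeg',
        'Cache-Control: private, max-age=' . (int) $maxAge,
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    );
}

// Usage inside countvisits.php, before the image bytes are sent:
// foreach (pixel_cache_headers(1800, time()) as $line) { header($line); }
// readfile('pixel.jpg'); // pixel.jpg is an assumed file name
```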

I hope that helps a little.

Sandro87

10:00 pm on Feb 15, 2010 (gmt 0)

10+ Year Member



Thanks it helped :)

rocknbil

12:42 am on Feb 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if(ereg($bot, $HTTP_USER_AGENT)){}....


Soon to be extinct, use preg_match instead. :-)
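Something along these lines, as a minimal sketch rather than a drop-in replacement. The helper name is_bot() is made up, the list is shortened for the example, and the match is case-insensitive, which may be too loose for short names such as "FAST" in the full list.

```php
<?php
// Shortened list for illustration; extend it with the names you care about.
$botlist = array('Googlebot', 'Slurp', 'Teoma', 'crawler', 'www.galaxy.com');

// Hypothetical helper: true if the user agent contains any name in $botlist.
function is_bot($userAgent, array $botlist)
{
    // preg_quote escapes regex metacharacters such as the dots in www.galaxy.com
    $parts = array();
    foreach ($botlist as $name) {
        $parts[] = preg_quote($name, '/');
    }
    $pattern = '/' . implode('|', $parts) . '/i'; // "i" = case-insensitive
    return preg_match($pattern, $userAgent) === 1;
}

// Usage: a missing User-Agent header is treated as an empty string here.
// $agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// if (!is_bot($agent, $botlist)) { /* add hit */ }
```

The User-Agent header is sent by the client and can be faked, so this only filters bots that identify themselves.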

Sandro87

11:46 pm on Feb 16, 2010 (gmt 0)

10+ Year Member



:)

Could I use some other kind of identifier along with the IP? Anyway, the hit counting is only blocked for 30 minutes; after that the same IP would count as another hit. The problem with this could be the server load, I guess... extra queries just for the IP check and the hits!

DailyAmerican

5:43 pm on Feb 17, 2010 (gmt 0)

10+ Year Member



To do that you'd have to store a timestamp in your db along with the IP, then use PHP's time() to compare against it. If the column holds a date string rather than a Unix timestamp, you will have to convert it with strtotime() (mktime() only helps if you already have the separate date parts) so that it is comparable to time(). Make sure to make your IP column either unique or the primary key in your db.
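The comparison itself could look roughly like this. It is only a sketch: the helper name should_count_hit() is made up, and it assumes the timestamp comes back from the db as a 'YYYY-MM-DD HH:MM:SS' string and that PHP and the db use the same timezone.

```php
<?php
// Hypothetical helper: should this visit be counted again?
// $lastSeen is the date string stored for the IP (null if the IP is new).
// $now is a Unix timestamp, normally time(); $window is in seconds.
function should_count_hit($lastSeen, $now, $window = 1800)
{
    if ($lastSeen === null) {
        return true;                   // first visit from this IP
    }
    $last = strtotime($lastSeen);      // date string -> Unix timestamp
    return ($now - $last) >= $window;  // true once the window has passed
}

// Usage: fetch the stored timestamp for the visitor's IP, then
// if (should_count_hit($row_timestamp, time())) { /* add hit, update timestamp */ }
```

With the IP column unique, MySQL's INSERT ... ON DUPLICATE KEY UPDATE can insert or refresh the row in a single query, which keeps the extra load down.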