Creating a Counter That Doesn't Include Search Engines

How can I tell whether a page view comes from a search engine?

         

wfernley

2:09 pm on Sep 2, 2005 (gmt 0)




Hey everyone, I want to create a counter to see how many people are viewing a certain page, but I don't want to count search engines. Is this possible?

Could I set up something like this?

if (!strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Googlebot')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'slurp@inktomi.com;')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'ZyBorg')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'FAST-WebCrawler')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Gigabot')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'Scrubby')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'msnbot')
    && !strstr($HTTP_SERVER_VARS['HTTP_USER_AGENT'], 'YahooSeeker')) {
    // count this page view
}

If anyone can help that would be great :)

Thanks in advance for your help!

Wes

dcrombie

3:22 pm on Sep 2, 2005 (gmt 0)



It's never going to be possible to filter out all search engines - there are literally hundreds of them out there.

A good start would be to filter out the major ones:

  • Googlebot
  • msnbot
  • Slurp
  • Ask Jeeves

    Then any that are specific to your region or genre (check your raw logs):

  • webwombat
  • ConveraCrawler
  • grub-client

    and so on.

    A more 'compact' version of your code would use in_array [php.net] and the $_SERVER['HTTP_USER_AGENT'] variable ($HTTP_SERVER_VARS is now deprecated).
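A minimal sketch of that more compact version, assuming PHP 7 or later. Note that in_array() compares whole array values rather than substrings, so this sketch loops over the list with stripos() instead. The function name is_search_engine() is hypothetical, and the token list just collects the crawler names mentioned in this thread.

```php
<?php
// Hypothetical helper: the name and the token list are illustrative only.
function is_search_engine(string $userAgent): bool
{
    // Substrings taken from the crawler names mentioned in this thread.
    $botTokens = [
        'Googlebot', 'msnbot', 'Slurp', 'Ask Jeeves', 'ZyBorg',
        'FAST-WebCrawler', 'Gigabot', 'Scrubby', 'YahooSeeker',
    ];

    foreach ($botTokens as $token) {
        // stripos() is a case-insensitive substring search; it returns false when absent.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// The header may be missing entirely, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (!is_search_engine($userAgent)) {
    // count this page view
}
```

Adding a crawler you spot in your raw logs then only means adding one more string to the array.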

    wfernley

    3:35 pm on Sep 2, 2005 (gmt 0)




    Yeah, I understand I wouldn't be able to filter all SEs out, but it would at least give me a closer number of page views than if SEs were included.

    Thanks for the array function; that would make the code a lot simpler.

    Stooshie

    3:57 pm on Sep 2, 2005 (gmt 0)




    An easier way may be to assess the browser from the user agent, and if it is not in:

    opera
    staroffice
    webtv
    beonex
    chimera
    netpositive
    phoenix
    firefox
    safari
    skipstone
    msie
    netscape
    mozilla/5.0

    then assume it is a search engine.
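The whitelist idea above could be sketched roughly like this, assuming PHP 7 or later. The function name looks_like_browser() is hypothetical; the tokens are the ones listed in the post.

```php
<?php
// Hypothetical helper: returns true when the user agent contains a known browser token.
function looks_like_browser(string $userAgent): bool
{
    $browserTokens = [
        'opera', 'staroffice', 'webtv', 'beonex', 'chimera', 'netpositive',
        'phoenix', 'firefox', 'safari', 'skipstone', 'msie', 'netscape',
        'mozilla/5.0',
    ];

    foreach ($browserTokens as $token) {
        // Case-insensitive substring match against each whitelisted token.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    // Nothing matched, so treat the visitor as a search engine.
    return false;
}

$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (looks_like_browser($userAgent)) {
    // count this page view
}
```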

    dcrombie

    4:35 pm on Sep 2, 2005 (gmt 0)



    That method doesn't really work any more, because many crawlers now send user agent strings that look like a browser's:

    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) 
    Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
    YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com)
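A quick check, assuming PHP 7.1 or later, of why those strings slip through a browser whitelist. It uses the two complete user agents quoted above (the truncated Slurp string is left out) and the two whitelist tokens that occur in them. The function name first_matching_token() is hypothetical.

```php
<?php
// Two of the crawler user agents quoted above.
$crawlerAgents = [
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com)',
];

// Browser tokens from the whitelist that are relevant to these strings.
$browserTokens = ['mozilla/5.0', 'msie'];

// Hypothetical helper: returns the first token found in the user agent, or null.
function first_matching_token(string $userAgent, array $tokens): ?string
{
    foreach ($tokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return $token;
        }
    }
    return null;
}

foreach ($crawlerAgents as $agent) {
    $token = first_matching_token($agent, $browserTokens);
    echo ($token ?? 'no browser token') . ' matched in: ' . $agent . PHP_EOL;
}
```

Both crawlers contain a whitelisted token, so a whitelist-only check would count them as ordinary browsers.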

    nickdgd

    6:09 am on Sep 5, 2005 (gmt 0)




    You can use JavaScript to trigger a counter page, since search engines don't execute JavaScript while browsers do.

    The script tag should look like this:

    <script type="text/javascript" src="counter.php"></script>

    Here counter.php is the counter page. It doesn't have to output anything (no JavaScript at all), but you can also have it print the current count with a line like this:

    echo "document.write('" . $count . "')";

    If you're familiar with the DOM, you can also rewrite the line above in a DOM-compliant form.
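For reference, a minimal sketch of what counter.php might contain, assuming PHP 7 or later and a flat file for storage. The file name count.txt and both function names are hypothetical; the thread does not say how the count is stored.

```php
<?php
// Hypothetical helper: reads the stored count, adds one, writes it back, and returns it.
function increment_counter(string $file): int
{
    // 'c+' opens for reading and writing, creates the file if missing, and does not truncate it.
    $handle = fopen($file, 'c+');
    if ($handle === false) {
        throw new RuntimeException("Cannot open counter file: $file");
    }

    flock($handle, LOCK_EX);                          // serialize concurrent requests
    $count = (int) stream_get_contents($handle) + 1;  // an empty file counts as 0
    ftruncate($handle, 0);
    rewind($handle);
    fwrite($handle, (string) $count);
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);

    return $count;
}

// Hypothetical helper: builds the JavaScript line that the script tag will execute.
function render_counter_script(int $count): string
{
    return "document.write('" . $count . "');";
}

// Only emit output when served over HTTP (an assumption, so the file can also be loaded from the CLI).
if (PHP_SAPI !== 'cli') {
    header('Content-Type: application/javascript');
    echo render_counter_script(increment_counter(__DIR__ . '/count.txt'));
}
```

Because the count is only incremented when a client actually requests counter.php, visitors that never run the script tag are not counted.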