Forum Moderators: DixonJones


banning all non-human users from accessing a script

         

stef25

3:20 pm on May 8, 2008 (gmt 0)

10+ Year Member



I have a script that runs on every page, the purpose of which is to track the pages that the users of my social network are viewing. It uses a unique hash and cookie mechanism to track users. Very quickly my DB tables are getting filled up with different hashes from the same IP address, which according to my script means that those IPs aren't accepting cookies. When I examine the IPs, I find out they are search engines, bots, etc.

I already have a small list of user-agent strings that get banned from accessing this script, but now I'm wondering if there is a better way to block these kinds of accesses, other than filtering on the user-agent string.

Basically, how do I stop everything BUT normal users from accessing this script? I've found the link below, but I'm not sure if it will achieve what I want.

Or should I just get a list of user-agent strings for all common browsers and only let those access the script? Or do rogue bots also pretend to be browsing with a normal Firefox user-agent string?

[modem-help.freeserve.co.uk...]

jdMorgan

3:42 pm on May 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> which according to my script means that those IPs aren't accepting cookies.

You posted the answer --or part of it-- right there. Set a cookie (a different one, if necessary) on the click-path to the script-calling page itself, then have the script check that cookie. If it's not set, don't log the access in the script. The only complication is that this cookie must be set before the page with the script call is requested -- you may have to use an interstitial page, or --if no other method is workable-- a meta refresh.
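A minimal sketch of this check in Python (the thread shows no actual code, so the cookie name and function names here are illustrative assumptions, not anything from the original script):

```python
# Sketch of Jim's suggestion: an earlier page on the click-path sets a
# cookie, and the tracking script only logs the view if the browser
# sent that cookie back. Bots that don't accept cookies never pass.
COOKIE_NAME = "human_check"  # hypothetical name, set earlier on the click-path

def parse_cookies(header):
    """Parse a raw Cookie request header into a dict of name -> value."""
    cookies = {}
    for part in header.split(";"):
        if "=" in part:
            name, _, value = part.strip().partition("=")
            cookies[name] = value
    return cookies

def should_log(cookie_header):
    """Log this page view only if the click-path cookie came back."""
    return COOKIE_NAME in parse_cookies(cookie_header)
```

A client that accepts cookies sends something like `human_check=1; session=abc` on its second request and gets logged; a cookie-less bot sends nothing and is skipped.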

You can also use a combination of IP-address, request-header, and user-agent whitelisting and blacklisting, but it's an ongoing chore. If the cookie method above reduces your junk-access logging sufficiently, you might be happier just to leave it at that.
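The combined whitelist/blacklist idea could look roughly like this in Python; the substring lists are illustrative examples only, not a maintained set, which is exactly why Jim calls this an ongoing chore:

```python
# Rough sketch of user-agent filtering: a blacklist of known-bot
# substrings checked first, then a whitelist of common browsers.
# Both lists are illustrative and would need constant upkeep.
BOT_SUBSTRINGS = ("googlebot", "slurp", "msnbot", "spider", "crawler")
BROWSER_SUBSTRINGS = ("firefox", "msie", "opera", "safari")

def looks_human(user_agent):
    """Best-effort guess from the User-Agent string alone."""
    ua = user_agent.lower()
    if any(bot in ua for bot in BOT_SUBSTRINGS):
        return False  # known bot substring: blacklist wins
    # Otherwise require a known browser substring (whitelist).
    return any(browser in ua for browser in BROWSER_SUBSTRINGS)
```

Note this answers stef25's last question implicitly: a rogue bot can send any User-Agent it likes, including a perfect Firefox string, so this filter only catches honest bots.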

The script should be "included" as a local file, and not directly accessible via HTTP.

Jim

stef25

3:54 pm on May 8, 2008 (gmt 0)

10+ Year Member



Second cookie - good idea.

I don't quite get the click-path thing, but I'm pretty sure I can handle it this way.

Thanks very much.

jdMorgan

4:00 pm on May 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I wasn't very clear: the cookie tested by the script must be set before the HTTP request for the page that includes the script. Since cookies are strings sent in an HTTP request header by the client browser to the host, your script won't "see" the cookie unless it was set before the browser requested the page with the script include on it.
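A toy simulation of that timing, assuming a simple browser-like cookie jar (nothing here is from the thread; it just demonstrates why the first request carries no cookie):

```python
# A Set-Cookie issued in one response is only echoed back by the
# browser on the *next* request, so the very first hit on the
# click-path arrives cookie-less at the server.
def simulate_visit(pages):
    """Walk a click-path; return, per page, whether the tracking
    cookie arrived with that request."""
    jar = {}   # the browser's cookie store
    seen = []
    for page in pages:
        seen.append("tracker" in jar)  # what the server sees on this request
        jar["tracker"] = "1"           # Set-Cookie in this response
    return seen
```

For a three-page click-path the server sees the cookie only from the second request onward, which is why the tracking page itself can't be the page that first sets the cookie.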

Jim

stef25

7:57 am on May 9, 2008 (gmt 0)

10+ Year Member



The cookie tested by the script must be set before the HTTP request for the page that includes the script

Could you explain that a little more? I thought of just setting a cookie called "botTest" at the top of the script. If users have it present, I run the rest of the script; if they don't, that means they have cookies off or they are a bot, and the hashes don't get assigned.

Good enough, right?