How to hide session IDs from Googlebot using DB

How to prevent Googlebot from indexing pages with session IDs

morseu_s

12:16 pm on Jul 29, 2005 (gmt 0)

10+ Year Member



I found only one similar thread on WebmasterWorld, but it doesn't cover my method :D

I primarily keep the session ID in a cookie. Googlebot does not accept cookies at all, so for it the session ID would end up in the URL and it would index my pages with session IDs.

Therefore I use this solution :D

Sessions are disabled by default: I do not call session_start() unless it is needed.

If a visitor logs in (fills in the login form and POSTs it), I save their IP address in a database :D

At the beginning of every page there is code that checks the visitor's IP. If it is in the DB, I call session_start().

I assume Googlebot will never POST a form and never log in :D

And of course I delete IP addresses older than 15 minutes from the DB.
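
The steps above could be sketched roughly like this. The site in question runs PHP, so this is only a language-neutral illustration in Python, not the poster's actual code. The table name `logged_in_ips`, the function names, and the use of SQLite are all assumptions made for the example; in PHP the result of `should_start_session` would decide whether to call session_start().

```python
import sqlite3
import time

TTL_SECONDS = 15 * 60  # entries older than 15 minutes are dropped


def open_db(path=":memory:"):
    """Create the (hypothetical) table that remembers IPs of logged-in visitors."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logged_in_ips ("
        "ip TEXT PRIMARY KEY, last_seen REAL NOT NULL)"
    )
    return conn


def record_login(conn, ip, now=None):
    """Call after a successful login POST: remember the visitor's IP."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO logged_in_ips (ip, last_seen) VALUES (?, ?)",
        (ip, now),
    )
    conn.commit()


def purge_expired(conn, now=None):
    """Delete IPs whose login timestamp is older than the 15-minute window."""
    now = time.time() if now is None else now
    conn.execute(
        "DELETE FROM logged_in_ips WHERE last_seen < ?",
        (now - TTL_SECONDS,),
    )
    conn.commit()


def should_start_session(conn, ip, now=None):
    """Run at the top of every page: True only for a recently logged-in IP."""
    now = time.time() if now is None else now
    purge_expired(conn, now)
    row = conn.execute(
        "SELECT 1 FROM logged_in_ips WHERE ip = ?", (ip,)
    ).fetchone()
    return row is not None
```

In this sketch the timestamp is written only at login, so a visitor drops out of the table 15 minutes after logging in. Whether the original refreshes the timestamp on each page view is not stated; updating `last_seen` inside `should_start_session` would be one way to do that.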

Tell me your opinion :-]
Currently I have 12k visitors/day (gaming site) and the CPU load is at most 20% (peak 5-minute average), so I don't see any problem there. In addition, this method can't be recognized as cloaking.

Server specs: Athlon 1900 MHz I guess, 768 MB RAM, FreeBSD, php-a, Apache @ 600 processes/threads max.

ThomasB

4:06 pm on Jul 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



morseu_s, that's a pretty widespread method across e-commerce sites (shops), but if you developed it yourself, quite clever. :)

phpmaven

2:25 am on Aug 1, 2005 (gmt 0)

10+ Year Member



Well... That method will work for those with a fixed IP, but many people have setups that result in a constantly changing IP, AOL users for example. You would be much better off trying to detect robots via their UA and/or IP and then not starting a session for them.
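
A minimal sketch of the UA-based check suggested here, again in Python purely for illustration (the site itself is PHP). The pattern list is a small, incomplete assumption, and the names `is_probable_bot` and `wants_session` are made up for the example. A User-Agent header can be spoofed, so this is a heuristic, not a guarantee.

```python
import re

# Illustrative, incomplete list of crawler User-Agent substrings (an assumption).
BOT_UA_PATTERN = re.compile(r"googlebot|msnbot|slurp|crawler|spider", re.IGNORECASE)


def is_probable_bot(user_agent):
    """Guess whether a request comes from a crawler, based only on its UA string."""
    if not user_agent:
        # Assumption: a missing or empty UA is treated as a bot.
        return True
    return BOT_UA_PATTERN.search(user_agent) is not None


def wants_session(user_agent):
    """True when a session should be started, i.e. the visitor does not look like a bot."""
    return not is_probable_bot(user_agent)
```

The page code would call `wants_session` with the request's User-Agent header before starting a session, so crawlers never receive a session ID in the first place.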