PHP and Robots.

PHPSESSID woes...

         

Gibble

6:40 pm on Oct 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, my site tracks visitors using PHP sessions. The web server is configured so that cookies aren't needed, and it automatically appends ?PHPSESSID=somegibberishhere to URLs. This is great for the most part when real people look at the site, but when a spider crawls the site I get duplicate pages. (I have one running for an internal search engine, which is why I noticed.)

I end up with a list of pages like:
[somedomain.com...]
[somedomain.com...]
[somedomain.com...]
[somedomain.com...]
etc etc
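
For the internal search engine, one possible workaround is to normalize URLs before indexing them, since the duplicates differ only in the session ID. The following is a minimal Python sketch, assuming the spider is something you can modify; the function names `strip_session_id` and `dedupe` and the example.com URLs are hypothetical, not part of any particular crawler.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_id(url, param="PHPSESSID"):
    """Return url with the session parameter removed from its query string."""
    parts = urlsplit(url)
    # Keep every query parameter except the session ID.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def dedupe(urls):
    """Collapse URLs that differ only by session ID, preserving first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        canonical = strip_session_id(url)
        if canonical not in seen:
            seen.add(canonical)
            unique.append(canonical)
    return unique


if __name__ == "__main__":
    # Hypothetical crawl results: the same page fetched under two sessions.
    crawled = [
        "http://example.com/page.php?PHPSESSID=aaa111",
        "http://example.com/page.php?PHPSESSID=bbb222",
        "http://example.com/page.php?id=5&PHPSESSID=ccc333",
    ]
    for url in dedupe(crawled):
        print(url)
```

This only helps a spider you control; external bots still see the session ID in every link.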

Is there a way to set a rule in robots.txt to stop it from spidering pages that have PHPSESSID on the query string, while allowing all the other pages to still be spidered?
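
For context on what such a rule would look like: the original robots.txt convention only matches path prefixes, so it cannot express "any URL containing PHPSESSID". Some crawlers, Googlebot among them, honor a `*` wildcard extension, and for those a line like `Disallow: /*PHPSESSID=` is the usual shape. The Python sketch below only illustrates how a wildcard-aware crawler might evaluate that pattern; `rule_to_regex` and `is_blocked` are hypothetical helpers, not any bot's actual implementation, and a crawler that ignores wildcards would not apply the rule this way.

```python
import re

# Assumed robots.txt content (wildcard extension, not honored by every bot):
#   User-agent: *
#   Disallow: /*PHPSESSID=
RULE = "/*PHPSESSID="


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and a trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path_and_query, disallow_pattern=RULE):
    """True if the path (including query string) matches the Disallow pattern."""
    return rule_to_regex(disallow_pattern).match(path_and_query) is not None


if __name__ == "__main__":
    for candidate in ("/page.php?PHPSESSID=abc123", "/page.php?id=5", "/page.php"):
        print(candidate, is_blocked(candidate))
```

Under that assumption, URLs carrying the session ID are skipped while the plain versions of the same pages stay crawlable.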

I'm concerned this could be keeping my site from being spidered, as most bots seem to grab a few pages, see this, and then not return for a month, only to have the same thing happen again.

Thanks for the help.

P.S. I didn't know which forum to put this in, so please move it if need be.

Gibble

8:19 pm on Oct 4, 2002 (gmt 0)




Oh brother.

I finally fixed the problem: I removed the PHPSESSID that was being added automatically, since Googlebot wouldn't crawl pages with it. I now handle sessions a different way, ran some tests, and it appears good, but I fear I missed Googlebot's crawl for this month.