When real users visit our site, server-side session variables in ASP are set, and a cookie is written containing user options (language, etc.).
But what happens when the bot arrives?
Can the bot's visit be considered a single "session", or is each page it indexes treated as a new user visiting the site? Does it use multiple threads (i.e. open different pages in parallel)? Does it store cookies?
Google has a whole bunch of IPs, and the bot - at least in the case of my site - will hit with a few dozen (or more) different IPs over the course of an hour, at rates exceeding 150 pages an hour (combined, not each).
The thing to make sure of is that your pages don't require any session information to be displayable. There should be a valid default for every session variable in the code, so that if a page doesn't get the value it's looking for, it can fill in the blank itself.
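A minimal sketch of that fallback idea, in Python rather than ASP for illustration - the `session` dict, key names, and defaults here are all hypothetical, not from any real framework:

```python
# Reading a session value with a safe default, so a cookieless visitor
# (like a crawler) still gets a renderable page.

DEFAULTS = {"language": "en", "currency": "USD"}

def get_option(session, key):
    """Return the user's stored option, or a sensible default for bots."""
    value = session.get(key)
    return value if value else DEFAULTS[key]

# A crawler arrives with no session at all:
empty_session = {}
print(get_option(empty_session, "language"))  # falls back to "en"
```

The same pattern works in classic ASP: test each `Session("...")` value for empty before using it, and substitute a hard-coded default.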
I've been thinking about writing my own little sub-script that gives the username "Googler" to the bot when she comes so that when people view my cache on Google it says, "Welcome Googler!" I just haven't quite gotten around to it, yet.
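That "Welcome Googler!" trick boils down to a User-Agent check. A rough sketch, assuming a hypothetical `display_name` helper - note the plain substring match is a simplification, since anyone can fake a User-Agent string:

```python
# If the User-Agent claims to be Googlebot, substitute a placeholder
# username instead of reading one from the (nonexistent) session.

def display_name(user_agent, session_username=None):
    """Name to greet the visitor with; bots get a canned one."""
    if "Googlebot" in (user_agent or ""):
        return "Googler"
    return session_username or "Guest"

print(display_name("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # "Googler"
```

Be careful not to serve the bot *different content* than users see, though - a cosmetic greeting is one thing, cloaking is another.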
This is an important issue you have raised. We have an ecommerce system that relies on session IDs in the URL when cookies cannot be set - hence Google sees duplicate content from these pages.
We have hopefully found a way around it:
In [webmasterworld.com...] GoogleGuy pointed out in a recent thread that they truncate URLs at the "?". If this is correct, then make sure the PHPSESSID=xxxx follows a question mark - this is what we have done.
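The logic behind that move can be sketched as follows - the function names and the example URL are illustrative only, and whether Google actually truncates at "?" is exactly what's in question here:

```python
# Put the session id in the query string, never in the path, so that IF
# the indexer truncates URLs at "?", every session-tagged link collapses
# to one canonical URL instead of producing duplicates.
from urllib.parse import urlencode

def tag_url(base_url, session_id):
    """Append the session id after a '?' (or '&' if a query exists)."""
    sep = "&" if "?" in base_url else "?"
    return f"{base_url}{sep}{urlencode({'PHPSESSID': session_id})}"

def truncate_at_query(url):
    """What the indexer would keep if it truncated at '?'."""
    return url.split("?", 1)[0]

url = tag_url("http://example.com/product.php", "abc123")
print(url)                     # http://example.com/product.php?PHPSESSID=abc123
print(truncate_at_query(url))  # http://example.com/product.php
```

If the truncation claim holds, two crawls with different session IDs both reduce to the same bare URL; if it doesn't, you still get one duplicate per session ID.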
Problem is that I see no evidence that Google actually truncates URLs after the "?" :(
Anyone had any experience over a period of more than 2-3 months?
One of our sites saw the number of pages indexed cut considerably during the last update, and this is possibly the reason why.