How does googlebot visit?

Multiple threads? Cookies? Effect on session variables?

     
7:43 am on Aug 11, 2002 (gmt 0)

Junior Member

10+ Year Member

joined:May 18, 2002
posts:101
votes: 0


When real users visit our site, server session variables in ASP are set, and a cookie is written containing user options (language etc).

But what happens when the bot arrives?

Can the bot's visit be considered a single "session", or is each page it indexes treated as if a new user were visiting the site?
Does it use multiple threads (i.e. open different pages in parallel)?
Does it store cookies?

Appreciate any clarification ...

7:55 am on Aug 11, 2002 (gmt 0)

Senior Member

nick_w: WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member

joined:Feb 4, 2002
posts:5044
votes: 0


Doesn't store cookies, so if your session IDs are appended to the URL, be very careful about potential duplicate content problems.

Nick

11:22 am on Aug 11, 2002 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 13, 2002
posts:676
votes: 0


Each page hit by Googlebot is a new session.

Google has a whole bunch of IPs, and the bot (at least in the case of my site) will hit from a few dozen or more different IPs over the course of an hour, at a combined rate exceeding 150 pages an hour.

The thing to make sure of is that your pages don't require any session information for them to be displayable. There should be valid default values for any variable in the code so that if it doesn't get the value it's looking for, it's capable of providing its own info to fill in the blank.
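A minimal sketch of that fallback idea, in Python rather than ASP. The option names and default values here are made up for illustration; the point is only that a missing or empty session still produces a displayable page.

```python
# Hypothetical defaults; the option names are illustrative, not from any real site.
DEFAULTS = {"language": "en", "currency": "USD", "username": "Guest"}


def get_option(session, name):
    """Return a session value, or the default when it is missing or empty."""
    # A crawler that keeps no cookies arrives with no session data at all,
    # so `session` may be None or an empty dict.
    value = session.get(name) if session else None
    return value if value else DEFAULTS[name]


# A visitor with a stored preference gets it back; a bot gets the default.
print(get_option({"language": "de"}, "language"))
print(get_option({}, "language"))
```

Every lookup goes through one function, so no page can fail just because the visitor never received a session.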

I've been thinking about writing my own little sub-script that gives the username "Googler" to the bot when she comes so that when people view my cache on Google it says, "Welcome Googler!" I just haven't quite gotten around to it, yet.
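Something like the following would do it, sketched in Python. It assumes the server exposes the request's User-Agent header and that the bot identifies itself with "Googlebot" somewhere in that string; the function names are hypothetical, and since the header can be spoofed this is only good for cosmetic things like a greeting.

```python
def display_name(user_agent, session):
    """Pick the name to show in the greeting (hypothetical helper)."""
    # Assumption: the crawler announces itself with "Googlebot" in its User-Agent.
    if "googlebot" in (user_agent or "").lower():
        return "Googler"
    # Ordinary visitors: use the session username, or a default if none is set.
    return session.get("username", "Guest")


def greeting(user_agent, session):
    return "Welcome " + display_name(user_agent, session) + "!"


print(greeting("Googlebot/2.1", {}))
print(greeting("Mozilla/4.0", {"username": "Anna"}))
```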

G.

11:18 pm on Aug 11, 2002 (gmt 0)

New User

10+ Year Member

joined:Aug 10, 2002
posts:32
votes: 0


This is an important issue you have raised. We have an ecommerce system that relies on session IDs in the URL when cookies cannot be set, so Google sees duplicate content from these pages.

We have hopefully found a way around it:

GoogleGuy pointed out in a recent thread [webmasterworld.com...] that they truncate URLs at the "?". If this is correct, make sure the PHPSESSID=xxxx follows a question mark; this is what we have done.
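To make the duplicate problem concrete, here is a rough Python sketch that drops the session parameter from a URL, so two addresses that differ only in PHPSESSID collapse to one. This is not a claim about what Google does internally; the function name and the example.com URLs are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters treated as session IDs (assumption: only PHPSESSID here).
SESSION_PARAMS = {"PHPSESSID"}


def strip_session_id(url):
    """Return the URL with any session-ID query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


a = strip_session_id("http://example.com/item.php?id=7&PHPSESSID=abc123")
b = strip_session_id("http://example.com/item.php?id=7&PHPSESSID=zzz999")
print(a == b)
```

The same function is handy for checking your own logs: if many requested URLs reduce to the same stripped URL, a crawler that keeps the full URL will see each one as a separate page.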

Problem is that I see no evidence that Google actually truncates URLs after the "?" :(

Anyone had any experience over a period of more than 2-3 months?

One of our sites had its number of indexed pages cut considerably during the last update, and this is possibly the reason.