
Google News Archive Forum

How does Googlebot visit?
Multiple threads? Cookies? Effect on session variables?

10+ Year Member

Msg#: 4726 posted 7:43 am on Aug 11, 2002 (gmt 0)

When real users visit our site, server session variables in ASP are set, and a cookie is written containing user options (language, etc.).

But what happens when the bot arrives?

Can the bot's visit be considered a single "session", or is each page it indexes treated as if a new user were visiting the site?
Does it use multiple threads (i.e. open different pages in parallel)?
Does it store cookies?

Appreciate any clarification ...



WebmasterWorld Senior Member nick_w is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Msg#: 4726 posted 7:55 am on Aug 11, 2002 (gmt 0)

Googlebot doesn't store cookies, so if your session IDs are appended to the URL, be very careful about potential duplicate content problems.
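
To make the duplicate-content risk concrete, here is a minimal Python sketch (not from the thread) that strips session parameters from a URL, so that two visits with different session IDs map back to one address. The parameter names in SESSION_PARAMS and the example.com URLs are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; adjust to whatever your site appends.
SESSION_PARAMS = {"phpsessid", "sessionid", "sid"}

def strip_session_id(url):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

# Two crawler hits on the same page, each handed a different session ID.
a = strip_session_id("http://example.com/page.asp?id=7&sid=abc123")
b = strip_session_id("http://example.com/page.asp?id=7&sid=zzz999")
print(a == b)  # both collapse to the same address
```

Without this kind of normalization, each of those two URLs looks like a separate page with identical content.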



WebmasterWorld Senior Member 10+ Year Member

Msg#: 4726 posted 11:22 am on Aug 11, 2002 (gmt 0)

Each page hit by Googlebot is a new session.

Google has a whole bunch of IPs, and the bot, at least in the case of my site, will hit from a few dozen (or more) different IPs over the course of an hour, at a combined rate exceeding 150 pages an hour (that is the total, not per IP).

The thing to make sure of is that your pages don't require any session information for them to be displayable. There should be valid default values for any variable in the code so that if it doesn't get the value it's looking for, it's capable of providing its own info to fill in the blank.
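
A rough Python sketch of the "valid default values" idea, assuming the session is available as a plain dictionary (or is missing entirely for a cookieless crawler). The keys and values in DEFAULTS are made up for illustration.

```python
# Hypothetical defaults; the keys are illustrative, not from any real site.
DEFAULTS = {"language": "en", "currency": "USD", "username": "Guest"}

def page_settings(session):
    """Merge session values over defaults; a missing or empty session still works."""
    settings = dict(DEFAULTS)
    if session:
        # Only accept known keys that actually carry a value.
        settings.update({k: v for k, v in session.items()
                         if k in DEFAULTS and v})
    return settings

print(page_settings(None))                # a cookieless crawler gets the defaults
print(page_settings({"language": "de"}))  # a returning user keeps their choice
```

The page can then render from the returned settings alone, whether or not any session information arrived with the request.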

I've been thinking about writing my own little sub-script that gives the username "Googler" to the bot when she comes so that when people view my cache on Google it says, "Welcome Googler!" I just haven't quite gotten around to it, yet.
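
Something along these lines might do it; a small Python sketch, assuming the crawler identifies itself with "Googlebot" in its User-Agent header. The function name and the sample header string are only for illustration, and a User-Agent can be spoofed, so this is suitable for a greeting and nothing more.

```python
def display_name(user_agent, session_username=None):
    """Pick the name shown in the welcome line."""
    if session_username:
        return session_username
    # Assumption: the crawler announces itself as "Googlebot" in the User-Agent.
    if user_agent and "googlebot" in user_agent.lower():
        return "Googler"
    return "Guest"

# Sample User-Agent value, used here purely as test input.
ua = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
print("Welcome " + display_name(ua) + "!")
```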



10+ Year Member

Msg#: 4726 posted 11:18 pm on Aug 11, 2002 (gmt 0)

This is an important issue you have raised. We have an ecommerce system that relies on session IDs when cookies cannot be set, so Google sees duplicate content from these pages.

We have hopefully found a way around it:

GoogleGuy pointed out in a recent thread [webmasterworld.com...] that they truncate URLs at the "?". If this is correct, then make sure the PHPSESSID=xxxx follows a question mark; this is what we have done.
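
For illustration, this is what truncating at the "?" would mean for the two ways of carrying a session ID. It is a Python sketch of the idea only, not a description of what Google's crawler actually does, and the example.com URLs are invented.

```python
def truncate_at_query(url):
    """Drop everything from the first "?" onward."""
    return url.split("?", 1)[0]

# Session ID in the query string: it disappears after truncation.
in_query = truncate_at_query("http://example.com/shop.php?PHPSESSID=abc123")
# Session ID embedded in the path: truncation cannot remove it.
in_path = truncate_at_query("http://example.com/PHPSESSID=abc123/shop.php")

print(in_query)
print(in_path)
```

Note that truncation of this kind would also drop every other query parameter, so pages that are told apart only by something like ?id=7 would collapse into one address as well.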

The problem is that I see no evidence that Google actually truncates URLs at the "?" :(

Has anyone had any experience with this over a period of more than 2-3 months?

One of our sites had its number of indexed pages cut considerably during the last update, and this is possibly the reason.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved