Appended session IDs

Which spiders do they kill?

         

KMxRetro

11:47 pm on Aug 6, 2002 (gmt 0)

10+ Year Member



Hi folks,
I was reading on another site that Googlebot doesn't like session IDs that are appended to URLs.

Googlebot was only getting 2 or 3 pages at most from my site (which has LOTS more pages). I changed my script so that if the user agent is Googlebot, the session ID isn't appended, and the next day the bot picked up 26 pages.
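
For anyone wanting to try the same thing, here is a minimal sketch of that kind of check in Python. It is not the poster's actual script: the function names, the `SID` parameter name, and the crawler token list are assumptions for illustration, and the list would need extending for other spiders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of lowercase User-Agent substrings treated as crawlers.
CRAWLER_TOKENS = ("googlebot",)


def is_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def make_link(url, session_id, user_agent, param="SID"):
    """Append the session ID to url unless the visitor looks like a crawler."""
    if is_crawler(user_agent) or not session_id:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Called with a browser's user agent, `make_link("/page.php?id=5", "abc123", ...)` returns the URL with `SID=abc123` added to the query string; called with a user agent containing "Googlebot", it returns the URL unchanged.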

My question is this: which search engines do session IDs affect? It would be very handy to compile a list so that I can remove the "SID" for them as well.

Thanks,

agerhart

12:38 pm on Aug 7, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Long dynamic URLs with query strings, like the ones on your site, are not good for any search engine. While the search engines can spider these pages, I have always doubted the competitive strength of a page that has an address like this.

I would do all that you can to simplify your URLs.

KMxRetro

12:53 pm on Aug 7, 2002 (gmt 0)

10+ Year Member



I agree and always thought the same thing, but the majority of my pages (the ones that need to be indexed, that is) only have one variable attached to the end of the URL.

I thought that the added session ID might be tripping poor little Googlebot up, so I coded in a piece that says "If the agent is Googlebot, don't add session IDs", and guess what?

Here's the last week of Googlebot activity on my site...

July 31st - Nothing from Googlebot
Aug 1st - Nothing from Googlebot
Aug 2nd - 2 pages spidered
Aug 3rd - 3 pages spidered
Aug 4th - 1 page spidered
-------I removed session IDs on Aug 5th (late evening)--------
Aug 5th - 26 pages spidered
Aug 6th - 851 pages spidered
Aug 7th - Still going according to IP logs...
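
A per-day tally like the one above can be pulled from an access log with a few lines of Python. This is only a sketch: it assumes the log uses Common/Combined Log Format timestamps such as `[06/Aug/2002:11:47:00 +0000]` and that the crawler is identified by the substring "googlebot" somewhere on the line; the function name is made up for illustration.

```python
import re
from collections import Counter

# Captures the date part of a Common Log Format timestamp, e.g. [06/Aug/2002:...
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def crawler_hits_per_day(lines, token="googlebot"):
    """Count log lines per day that mention the given crawler token."""
    counts = Counter()
    for line in lines:
        if token not in line.lower():
            continue  # not a request from the crawler we care about
        match = DATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open log file can be passed in directly.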

Now surely that isn't a coincidence! It even got my pages that have multiple variables after the URL.

Could it be that Googlebot doesn't actually mind query strings, just session IDs?

Forgot to add that I had never had more than 6 pages spidered at once by Google (in 6 months) until now.

KMxRetro

9:53 am on Aug 8, 2002 (gmt 0)

10+ Year Member



Aug 7th shows 636 pages caught by Googlebot.

And now it's done. I'm sorry, but it MUST be those session IDs. :)

starec

10:18 am on Aug 8, 2002 (gmt 0)

10+ Year Member



KMxRetro

I highly recommend doing the same for the FAST/AllTheWeb spider.

I had to remove the SID because it was fetching the same page again and again with different SIDs...
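
To illustrate why that happens: every new SID makes the same page look like a brand-new URL to a spider. A small Python sketch, assuming the session parameter is called `SID` (the function name is hypothetical), shows the URLs collapsing to one page once the parameter is stripped:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url, param="SID"):
    """Return url with the session parameter removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Two fetches of the same page that differ only in the session ID.
seen = {
    strip_session_id("/item.php?id=7&SID=aaa111"),
    strip_session_id("/item.php?id=7&SID=bbb222"),
}
```

After stripping, `seen` holds a single entry, `/item.php?id=7`, where the raw URLs would have counted as two distinct pages.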

KMxRetro

10:35 am on Aug 8, 2002 (gmt 0)

10+ Year Member



It's funny that you say that, starec. Whilst looking through my log files this morning, I noticed that yesterday FAST/All The Web picked up 100+ pages, even though it had only ever picked up 2 or 3 before.

I'll take the SIDs off and see if it improves even more.

Do you know what the agent name is for FAST?