
Getting rid of Session IDs

(without getting busted)


menyak

8:40 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



I am using phpBB, and Google refuses to spider it, most likely because phpBB adds session IDs to the URL if a user doesn't have cookies enabled (Googlebot doesn't ;) ).

Now probably the easiest way to solve this problem is to serve pages without session IDs to Googlebot. The only problem: there's no point in denying that this is de facto cloaking.

Can this get me into (severe) trouble? As far as I see, the page content would remain 100% unaltered, but the URL would change. And (more worrying), if Google sent a control bot disguised as, let's say, Mozilla, it wouldn't be able to find the links that Googlebot sees. Google might conclude that someone's messing with their Googlebot, and get VERY, VERY angry...

Nick_W

8:50 pm on Nov 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've used PHPBB before. Google just loved it. You may have an entirely different problem...

Nick

menyak

9:09 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



Nick, you probably used phpBB 1.4 or earlier. You're right, phpBB and Google had a great romance, but that ended when Google got jealous of the session IDs introduced in phpBB 2.

Asandir

9:09 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



www.phpbb.com/phpBB/viewtopic.php?t=43552&highlight=spider+google+session

www.phpbb.com/phpBB/viewtopic.php?t=32328&highlight=spider+google+session

Try those. I've had to "adjust" all my sites so that a session is not started until necessary, so Google can spider without a problem (if it even is a problem, which I'm not sure about).
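Asandir's "don't start a session until necessary" approach can be sketched like this. This is a Python illustration of the logic only (phpBB itself is PHP); the action names and the dict-based session store are hypothetical stand-ins:

```python
# Sketch: only create a session when the request actually needs one
# (e.g. a login or a reply), so plain page views keep clean URLs
# that a spider can follow.
import uuid

SESSIONS = {}  # session_id -> state; stands in for phpBB's session table

NEEDS_SESSION = {"login", "post_reply"}  # hypothetical action names

def handle_request(action, session_id=None):
    """Return (session_id, url_suffix) for a request."""
    if session_id is None and action in NEEDS_SESSION:
        session_id = uuid.uuid4().hex
        SESSIONS[session_id] = {}
    # Only requests with a live session get ?sid= appended to links.
    suffix = f"?sid={session_id}" if session_id else ""
    return session_id, suffix
```

A plain topic view then produces session-free links, while a login starts a session and carries the sid forward.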

In the very recent Google update, the deep crawler hit my (new) main site, took the robots.txt and the index, and left. I'm a little worried at the moment.

However, two and a half months ago, the fresh bot spidered the whole site.

*sigh*

[edited by: engine at 1:19 am (utc) on Nov. 4, 2002]
[edit reason] delinked [/edit]

Nick_W

9:16 pm on Nov 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



menyak, yes it was 1.4 though I'm still unsure as to whether this is really a problem...

But, for what it's worth: most of my sites only start a session/set a cookie if the UA is not a bot.

I've not had any trouble in the year and a half or so I've been doing this...
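A minimal sketch of that UA check, in Python for illustration (the bot-name pattern is a hypothetical example; real crawler UA strings vary, and any such list needs maintaining):

```python
import re

# Hypothetical pattern of common crawler names; not exhaustive.
BOT_PATTERN = re.compile(r"googlebot|slurp|crawler|spider", re.I)

def should_start_session(user_agent: str) -> bool:
    """Start a session (and emit sid URLs) only for non-bot visitors."""
    return not BOT_PATTERN.search(user_agent or "")
```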

Nick

menyak

9:18 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



Thanks for the links - that's in fact where I came from. ;)

The same thread also recommends setting up a hallway page that lists the topics in a forum. There's an example page posted - it is now a PR0 graveyard...

So I'm a little bit worried about the second approach as well. To me, serving a different page to the Googlebot is a bit like dancing on a volcano. The mere fact that it was fun yesterday doesn't mean that it can't kill you tomorrow. Or am I exaggerating?

WebGuerrilla

9:48 pm on Nov 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Serving pages with customized URLs that enable Google to index the content does not fall under the topic of "cloaking." You are not serving Google different content than what a human sees. You are just delivering that content from a different location.

A better solution would be to omit session IDs based on IP rather than UA. All search engines run undeclared bots from time to time. If you base your system on IP rather than UA, it will work better.
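A sketch of the IP-based variant, again in Python for illustration. The network range shown is one Googlebot has used, but crawler ranges change over time and should be verified (e.g. via reverse DNS) before being relied on:

```python
from ipaddress import ip_address, ip_network

# Example range only -- real crawler IP blocks must be kept up to date.
CRAWLER_NETS = [ip_network("66.249.64.0/19")]  # a range Googlebot has used

def is_crawler_ip(addr: str) -> bool:
    """True if the visitor IP falls in a known crawler range."""
    ip = ip_address(addr)
    return any(ip in net for net in CRAWLER_NETS)

def session_suffix(addr: str, sid: str) -> str:
    """Omit the sid from URLs when the visitor IP is a known crawler."""
    return "" if is_crawler_ip(addr) else f"?sid={sid}"
```

Unlike a UA check, this also catches undeclared bots crawling from the same address blocks.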

Nick_W

9:48 pm on Nov 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Damned if you do, damned if you don't?

I'd just go ahead and do it. The best of a bad situation...

Nick

starec

9:54 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



I had a session ID related problem (though completely different from yours) some time ago. To my "cloaking" worries, WebGuerrilla wisely replied:

Don't think of it as cloaking. I can't imagine any search engine (even the anti-cloaking zealot, Google) having any problem with that approach. You are delivering the same page to both spider and human. You would be doing them a big favor.

There are legit uses of cloaking, and making URLs understandable to bots surely is one of them.

menyak

10:08 pm on Nov 3, 2002 (gmt 0)

10+ Year Member



Thanks guys! (No sleepless night today... ;))