I've never optimised a dynamic site before, but now I've been asked to analyse one that is dynamically generated. The pages are all jhtml and every link carries a session id. The number of scripts, flash and other yukky stuff is amazing. Now, because I am not a programmer, I'd like to know what session ids are used for. I know I have to recommend dropping them because Google can't index the site, but I'd like to know why anyone would use them in the first place, what they are used for, and whether they are essential or not. If they are essential, can a site be optimised and still keep session ids?
Thanks
Some people would mention that you could use a cookie to do the same thing and keep the urls much cleaner. That's true, but not every user has cookies enabled. Using session-ids is one way to try to guarantee that you know the state of a user even if they don't allow cookies.
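To make the trade-off concrete, here is a rough sketch of how a site might fall back to a session-id in the url only when cookies aren't available. The function name `link_for` and the parameter name `sessionid` are made up for illustration; a real framework will have its own way of doing this.

```python
def link_for(path: str, session_id: str, cookies_enabled: bool) -> str:
    """Build a link, adding the session id to the url only as a fallback.

    Hypothetical helper: `sessionid` is an assumed parameter name.
    """
    if cookies_enabled:
        # The cookie carries the state, so the url stays clean.
        return path
    # No cookie to rely on, so the state has to travel in the url itself.
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}sessionid={session_id}"


print(link_for("/catalog.jhtml?item=42", "abc123", cookies_enabled=True))
print(link_for("/catalog.jhtml?item=42", "abc123", cookies_enabled=False))
```

A visitor with cookies sees the same clean url as everyone else; only the fallback case produces a url that is unique to one visitor.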
So what's the problem with a session id, and why doesn't Googlebot crawl them? Well, we don't just have one machine for crawling. Instead, there are lots of bot machines fetching pages in parallel. For a really large site, it's easily possible to have many different machines at Google fetch a page from that site. The problem is that the web server would serve up a different session-id to each machine! That means that you'd get the exact same page multiple times--only the url would be different. It's things like that which keep some search engines from crawling dynamic pages, and especially pages with session-ids.
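The duplication is easy to see in a small sketch. The urls, the parameter names in `SESSION_PARAMS`, and the helper `strip_session_id` below are all assumptions for illustration, not a description of how any crawler actually handles this.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; a real site uses whatever its framework picks.
SESSION_PARAMS = {"sessionid", "sid", "jsessionid"}


def strip_session_id(url: str) -> str:
    """Return the url with any session-id query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Three crawler machines fetch the same page; each is handed its own session id.
fetched = [
    "http://www.example.com/catalog.jhtml?item=42&sessionid=AAA111",
    "http://www.example.com/catalog.jhtml?item=42&sessionid=BBB222",
    "http://www.example.com/catalog.jhtml?item=42&sessionid=CCC333",
]

distinct_raw = set(fetched)
distinct_clean = {strip_session_id(url) for url in fetched}

# Compare how many distinct urls exist before and after stripping.
print(len(distinct_raw), len(distinct_clean))
```

Judged by url alone, the three fetches look like three separate pages. Once the session parameter is dropped they collapse to one, but that only works if you already know which parameter is the session id, and a crawler usually doesn't.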
Google can do some smart stuff looking for duplicates, and can sometimes make inferences about the url parameters, but in general it's best to play it safe and avoid session-ids whenever you can. Hope that makes sense!
best wishes,
GoogleGuy