|jsessionid = Bad SEO?|
Should they be removed?
I have a client that has jsessionids in their URL strings and figured that can't be good! Visions of duplicate content and penalties came to mind. Searching WebmasterWorld seemed to confirm this:
However looking on the web there are over 27 million pages that also have these session ids:
about 27,600,000 for inurl:jsessionid
Seems to me that Google would take this into consideration and just filter out the results.
The questions are: Do you think it is worth the time and effort to push the client to somehow change this? What kind of return on their investment would they see? Any unknown issues that can cause problems with the site and/or indexed pages?
Yep, session ids are always bad. They can't filter it. The problem is the bots just keep indexing those pages with new sessionids. Nothing they can do to "strip" or take into account those urls.
>to push the client to somehow change this?
yes - without question. Get them to change it, or let the client go.
> What kind of return on their investment would they see?
Too many variables to know for certain, but it could mean major differences in rankings. They can do what they are doing in cookies instead.
|Seems to me that Google would take this into consideration and just filter out the results. |
|looking on the web there are over 27 million pages that also have these session ids |
Didn't you answer your own question? :)
Seriously though, session ids in the URL are one of the worst things a site can have. Even if Google is ever smart enough to ignore them, stupider spiders will hammer away at the site, costing the site owner unnecessary bandwidth.
Even a mediocre programmer should be able to get session ids out of the URL very easily -- make sure they do so.
I have this same problem. I don't consider myself a mediocre programmer, but in the environments I work in, it is really hard to get rid of the session IDs.
It would be even worse if my app needed the session cookie/id (which it does not). Java struts does not provide an easy way to nuke this 100%.
You can partially solve this problem by adding <%@ page session="false" %> to the top of each Struts tile.
Get session ids out!
If necessary, use your httpd.conf file and RewriteRule the session ids out if the visitor is a bot, like googlebot.
Easy to do, you can have the fix in place within 1 day, and you will save yourself lots of grief. If it is already too late, then by fixing this you will remove duplicate content, and you will get better rankings back eventually.
Fix it NOW! Stop thinking about it and just do it!
As I said, it is not so simple. Apache is just used as a front end to Tomcat over AJP. You can find a solution though, by Googling for "Java’s SEO blunder: jsessionid". That combined with a mod_rewrite can purge most legacy requests of the stupid, stupid, JSESSIONID.
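For anyone who wants to see what "purging" the id amounts to, here is a rough sketch in plain Java of stripping the ";jsessionid=..." path parameter from a URL string. The class name JsessionidStripper, the method strip, and the example.com URL are made up for illustration, and this is not the code from the article mentioned above. In a real app the same idea would probably live in a servlet filter or a rewrite rule that redirects to the cleaned URL.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class JsessionidStripper {

    // Matches ";jsessionid=<value>", where the value runs until the next
    // '?', '#', ';' or '/' (or the end of the string).
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;/]*", Pattern.CASE_INSENSITIVE);

    // Returns the URL with any jsessionid path parameter removed.
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up URL; the query string after the session id is preserved.
        System.out.println(
                strip("http://example.com/catalog/item.do;jsessionid=1A2B3C?id=42"));
        // prints http://example.com/catalog/item.do?id=42
    }
}
```

Note this only cleans up URLs that are already out there. The app will keep emitting new ones until URL rewriting of the session id is turned off on the Java side.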
[edited by: tedster at 7:25 am (utc) on Feb. 21, 2007]