
problem with JSESSIONIDs and robot crawlers

JSESSIONIDs combined with the use of Struts with Hibernate

         

KBee

10:34 pm on Aug 17, 2006 (gmt 0)

10+ Year Member



Hi,

I'm having a problem with this. Any ideas?

Our application uses Tomcat and Struts with Hibernate to deliver dynamic web pages. The JSP pages use the c:url tag to automatically append the JSESSIONID to links when client cookies are disabled. For SEO purposes, we've tried several approaches to handling JSESSIONIDs so that search engines do not incorrectly weight pages based on the transient JSESSIONID values. Ultimately, we've had to remove all the c:url tags so that JSESSIONIDs are not generated, and add cloaking to our Apache webserver to strip the JSESSIONID from incoming links.

However, this has turned out to be a nightmare. The problem is that a bot like Google can visit our website over a thousand times in an hour. Add the other bots, and search bots account for a significant fraction of the total traffic to our website. That's normally fine; however, once we remove the JSESSIONIDs from the links, a robot crawler jumping from page to page on our website is interpreted as a new unique user on every request. Every new unique user is automatically assigned a session object by the servlet container. In addition, the use of Struts with Hibernate makes it a requirement to have a session object available (since Hibernate uses a unique session id to track persistent connections). Our setup, combined with removing the JSESSIONIDs from all URLs on the website, overwhelms our servlet container because a new session object is spawned every single time a bot hits the site.

The only way we've found to handle this is to set the session-timeout in web.xml to a very small value (e.g. 1 minute), but that creates the additional problem of our users inadvertently being logged out after a minute of inactivity. There seems to be no solution to this problem. If I cater to the search bots, I screw the users. And vice versa. Any suggestions?
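[Editor's note: not part of the original post, but one way to sidestep the web.xml tradeoff described above is to detect crawlers by User-Agent and shorten only *their* session lifetime via the standard HttpSession.setMaxInactiveInterval call, leaving human sessions at the default. A minimal sketch, assuming User-Agent sniffing is acceptable; the crawler token list is illustrative, not exhaustive:]

```java
import java.util.regex.Pattern;

/** Hypothetical helper: classifies a request as a crawler by its User-Agent. */
public class BotDetector {
    // Tokens commonly found in crawler User-Agent strings; extend as needed.
    private static final Pattern BOT_PATTERN = Pattern.compile(
            "googlebot|bingbot|msnbot|slurp|baiduspider|crawler|spider",
            Pattern.CASE_INSENSITIVE);

    public static boolean isBot(String userAgent) {
        return userAgent != null && BOT_PATTERN.matcher(userAgent).find();
    }

    // In a javax.servlet.Filter, a bot request could then be handled with e.g.:
    //   if (BotDetector.isBot(request.getHeader("User-Agent"))) {
    //       request.getSession().setMaxInactiveInterval(60); // expire bot sessions fast
    //   }
    // so human sessions keep the normal web.xml timeout.

    public static void main(String[] args) {
        System.out.println(isBot("Mozilla/5.0 (compatible; Googlebot/2.1)")); // true
        System.out.println(isBot("Mozilla/5.0 (Windows NT 10.0) Firefox/45.0")); // false
    }
}
```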

KBee

2:59 am on Aug 23, 2006 (gmt 0)

10+ Year Member



Hey does anyone have any ideas about this? I'm getting nowhere with it.

GaryK

5:17 pm on Aug 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I hate to let a message go unanswered. However, in this case I'm not sure you're asking this question in the correct forum.

We're primarily focused on identifying search engine spiders here, not providing technical support.

You might have better luck with your question in one of the more technology-related forums here on WebmasterWorld.

Best of luck to you. :)

volatilegx

2:49 am on Aug 24, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know anything about Struts and Hibernate, but unless you can custom program them to recognize spiders and not assign them session IDs, I think you are going to have to rebuild from the ground up with spiders in mind. It really sounds like a serious problem to me. Have you brought it to the attention of the developers of the packages you are using?

incrediBILL

6:31 pm on Aug 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



KBee,

The only valid reason I can even think of for adding JSESSIONIDs to the URL would be to allow someone with cookies disabled to use a cart in ecommerce.

If this is in fact the case, which I've implemented in the past, the solution is simple.

Do the following:

1) You only enable JSESSIONIDs when someone who has cookies disabled adds a product to the cart.

2) You disallow all bots from crawling the cart page and checkout process, so no JSESSIONIDs will ever be shown to the bots.
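[Editor's note: step 2 above could be expressed as a robots.txt rule along these lines; the /cart and /checkout paths are placeholders for the site's actual URLs:]

```
User-agent: *
Disallow: /cart
Disallow: /checkout
```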

If this isn't the case, and your site is being monetized via something like AdSense, the JSESSIONIDs will completely mess up your ad targeting.