What I can't seem to find an answer to is what to do about the URLs in the resulting HTML page. Our web servers (Tomcat 5) add a jsessionid to every URL within a page whenever cookies aren't supported. So even if we rewrite incoming URLs to strip out the jsessionid, won't the URLs in the requested page still include a jsessionid that changes every time a bot comes crawling?
Google seems to be clever enough to deal with jsessionid, as I can't find more than a handful of URLs containing it in Google's index. Yahoo, however, has picked up thousands of copies of the same page, each with a different jsessionid in the URL.
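For reference, the incoming-URL stripping I mentioned could look roughly like the sketch below. It assumes the session ID shows up in Tomcat's usual `;jsessionid=<id>` path-parameter form. The class and method names (`JsessionIdStripper`, `strip`) are made up for illustration and are not part of any Tomcat API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: removes the ";jsessionid=<id>"
// path parameter from a URL string.
final class JsessionIdStripper {
    // Assumption: the ID runs until the next '?', '#', ';' or '/'.
    // Matching is case-insensitive in case a proxy changes the case.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;/]*", Pattern.CASE_INSENSITIVE);

    private JsessionIdStripper() {}

    // Returns the URL with every jsessionid path parameter removed;
    // the query string and fragment are left untouched.
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }
}
```

With this, `JsessionIdStripper.strip("/catalog/item.jsp;jsessionid=0A1B2C3D?id=42")` gives `/catalog/item.jsp?id=42`. It only handles the incoming side, though, not the links Tomcat writes into the outgoing page, which is what I'm really asking about.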
Thanks!