Google, Alexa and Exalead are each crawling every page of my site dozens of times a day. Good, you say? Bad, because the pages they fetch differ ONLY in the URL:
93.47.80.51 - - [28/Nov/2006:16:05:36 -0800] "GET /awards.do;jsessionid=68B86DFF8E4A8597B210531C3431965D HTTP/1.1" 200 17195 "-" "Exabot/3.0"
193.47.80.51 - - [28/Nov/2006:16:17:30 -0800] "GET /awards.do;jsessionid=0621414681C92E1A00A9428A7800AC30 HTTP/1.1" 200 17195 "-" "Exabot/3.0"
193.47.80.51 - - [28/Nov/2006:17:00:36 -0800] "GET /awards.do;jsessionid=0079FCD91ED8E5B86902228D285CCEEF HTTP/1.1" 200 17195 "-" "Exabot/3.0"
193.47.80.51 - - [28/Nov/2006:20:41:50 -0800] "GET /awards.do;jsessionid=DE9B61384D3D75DE9EB38A21F066E433 HTTP/1.1" 200 17195 "-" "Exabot/3.0"
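So folks, watch out for this. As far as I can tell, Tomcat appends a jsessionid to the URL whenever the client hasn't sent back a session cookie, and crawlers generally don't, so every visit gets the same page (same 17195 bytes) under a brand-new URL. You may have to move away from Tomcat if you want to solve this issue.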
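If you want to check whether your own logs show the same pattern, here is a rough Python sketch. It assumes a combined-format access log like the lines above; the function name session_ids_by_path and the regex are just my own illustration, not anything that ships with Tomcat. It counts how many distinct jsessionid values each path was fetched with:

```python
import re
from collections import defaultdict

# Pulls the request path and an optional ";jsessionid=..." suffix out of a
# combined-format access log line. Only GET requests are considered here.
REQUEST_RE = re.compile(
    r'"GET\s+(?P<path>[^;?\s"]+)(?:;jsessionid=(?P<sid>[0-9A-Fa-f]+))?'
)


def session_ids_by_path(log_lines):
    """Map each path to the set of distinct session IDs it was requested with."""
    seen = defaultdict(set)
    for line in log_lines:
        match = REQUEST_RE.search(line)
        # Skip lines that are not GETs or that carry no jsessionid.
        if match and match.group("sid"):
            seen[match.group("path")].add(match.group("sid"))
    return dict(seen)


# Two of the log lines from the post, each on a single line.
sample = [
    '193.47.80.51 - - [28/Nov/2006:16:17:30 -0800] "GET '
    '/awards.do;jsessionid=0621414681C92E1A00A9428A7800AC30 HTTP/1.1" 200 '
    '17195 "-" "Exabot/3.0"',
    '193.47.80.51 - - [28/Nov/2006:17:00:36 -0800] "GET '
    '/awards.do;jsessionid=0079FCD91ED8E5B86902228D285CCEEF HTTP/1.1" 200 '
    '17195 "-" "Exabot/3.0"',
]

for path, ids in session_ids_by_path(sample).items():
    print(path, len(ids))
```

A path that shows up with many different session IDs from the same bot is a good sign the crawler is being handed a fresh URL on every visit.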
[edited by: txbakers at 11:39 am (utc) on Dec. 4, 2006]
[edit reason] examplified URL [/edit]