I implemented a mod on my forum to remove the session ID when Googlebot visits - I'm going to add Ink to that mod!
Thanks for bringing it up ;)
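For anyone wondering what a mod like that boils down to, here is a minimal sketch in Python (not the actual forum code). The marker list and the `is_crawler` and `build_link` helpers are made up for illustration, and the real User-Agent strings may differ, so treat the substrings as assumptions.

```python
# Hypothetical substrings to look for in the User-Agent header.
# Real crawler User-Agent strings may differ; adjust as needed.
BOT_MARKERS = ("googlebot", "slurp")


def is_crawler(user_agent):
    """Return True if the User-Agent looks like one of the listed bots."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def build_link(path, session_id, user_agent):
    """Append the SID only for visitors that are not known crawlers."""
    if is_crawler(user_agent):
        return path
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}SID={session_id}"
```

With this, a request whose User-Agent contains "Googlebot" gets plain links, while a regular browser still gets the SID appended.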
I changed my robots.txt to keep them away from a section with session IDs, and they seem to be behaving better now.
Looks like they're also getting better at dynamic URLs from what I can tell.
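A robots.txt change of that kind can be checked before going live. The sketch below uses Python's standard `urllib.robotparser`; the `/forum/` path and the example.com URLs are placeholders I picked for illustration, not the real section.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all bots out of the section that hands out SIDs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A URL inside the blocked section versus one outside it.
blocked = parser.can_fetch("Slurp", "http://www.example.com/forum/message12345")
allowed = parser.can_fetch("Slurp", "http://www.example.com/index.html")
print(blocked, allowed)
```

This only tells you how a well-behaved bot should read the rules; it does not guarantee that a given crawler obeys them.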
my two bits,
kpaul
Can someone explain session IDs to me? What are they, and how can they help me fix this problem?
Also, below I have cut and pasted some of my logs to show you the Inktomi bot and other bots and how much bandwidth they have used in the last 21 days.
Inktomi Slurp
Hits - 37101
Bandwidth - 1016.33 MB
Date - 21 Aug 2003 - 14:14
Unknown robot (identified by 'crawl')
Hits - 2130
Bandwidth - 23.12 MB
Date - 21 Aug 2003 - 11:15
Googlebot (Google)
Hits - 1497
Bandwidth - 84.86 MB
Date - 21 Aug 2003 - 03:17
Scooter (AltaVista)
Hits - 875
Bandwidth - 39.79 MB
Date - 21 Aug 2003 - 12:50
WISENutbot (Looksmart)
Hits - 469
Bandwidth - 11.90 MB
Date - 21 Aug 2003 - 13:35
Jeeves
Hits - 396
Bandwidth - 21.19 MB
Date - 21 Aug 2003 - 12:36
Alexa (IA Archiver)
Hits - 372
Bandwidth - 15.96 MB
Date - 20 Aug 2003 - 08:21
Fast-Webcrawler (AllTheWeb)
Hits - 120
Bandwidth - 7.12 MB
Date - 16 Aug 2003 - 21:27
Total bandwidth for all of the top crawling bots on my site = 1220.27 MB in the last 21 days.
That's over 1 GB.
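To double-check that total, here is a quick Python tally of the bandwidth figures from the log excerpt. The numbers are copied from the listing; the variable names are just for illustration.

```python
# Bandwidth in MB per bot, copied from the log excerpt.
bandwidth_mb = {
    "Inktomi Slurp": 1016.33,
    "Unknown robot": 23.12,
    "Googlebot": 84.86,
    "Scooter": 39.79,
    "WISENutbot": 11.90,
    "Jeeves": 21.19,
    "Alexa": 15.96,
    "Fast-Webcrawler": 7.12,
}

total_mb = sum(bandwidth_mb.values())
slurp_share = bandwidth_mb["Inktomi Slurp"] / total_mb

print(f"Total: {total_mb:.2f} MB")
print(f"Slurp share of bot bandwidth: {slurp_share:.1%}")
```

The sum matches the 1220.27 MB figure, and Slurp alone accounts for roughly 83% of it.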
Session IDs are used like cookies to store your users' details, such as whether they're logged in to the forum. They produce URLs like:
www.forum.com/message12345&SID=1234354376767656
Each time a bot visits (same as a human) it gets a new "session" started and a new session ID (SID).
So the bot thinks it's a new link because the URLs are unique (thanks to the SID) and ends up taking 1,000s of hits on the same page.
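To make that concrete, here is a small Python sketch showing how two URLs that differ only in their SID collapse to one page once the SID is stripped. The `strip_sid` helper and the second SID value are made up for illustration.

```python
import re

# Matches "SID=<value>" together with the "?" or "&" in front of it,
# plus a trailing "&" if more parameters follow.
_SID_RE = re.compile(r"([?&])SID=[^&#]*(&?)", re.IGNORECASE)


def strip_sid(url):
    """Remove the SID parameter so identical pages share one URL."""
    def repl(match):
        lead, trail = match.group(1), match.group(2)
        # If other parameters follow, keep a single separator in place.
        return lead if trail else ""
    return _SID_RE.sub(repl, url)


# Two visits to the same message, each with a fresh session ID.
visits = [
    "www.forum.com/message12345&SID=1234354376767656",
    "www.forum.com/message12345&SID=9999999999999999",
]

unique_raw = set(visits)
unique_clean = {strip_sid(u) for u in visits}
print(len(unique_raw), len(unique_clean))
```

To the bot these look like two distinct URLs, but after stripping the SID there is only one page behind them.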
Does that make sense or should I explain further?
TJ
It makes sense, but how can I stop it while using vBulletin?
Search for "vbulletin" and "SID" and "search engine".
There's a bit of info around for it, but to be honest I don't think there's an easy fix other than not using SIDs. That means users must have cookies enabled for the site to work properly.
I assume WebmasterWorld uses cookies only?
TJ