Forum Library, Charter, Moderators: Brett Tabke

Paid Inclusion Engines and Topics Forum

inktomisearch.com creating a lot of traffic
inktomi spider
montenegro




msg:17057
 9:30 am on Nov 2, 2003 (gmt 0)

Inktomisearch.com is hammering my site. It generated 98MB of traffic in the last 24 hours and 1GB during the last ten days of October. I will be paying for extra bandwidth soon. What is happening?

 

BlueSky




msg:17058
 10:21 am on Nov 2, 2003 (gmt 0)

Are you using session ids in your urls? If so, Slurp often gets lost and accesses the same pages repeatedly as the id changes. If that's what's happening to you, just ban him temporarily to make him stop.
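To illustrate why changing session ids send a crawler in circles, here is a minimal Python sketch. It assumes the session id travels in a query parameter named `osCsid`; that name, the sample URLs, and the helper names are assumptions for illustration, not details taken from the site in question. To a spider, each URL below looks like a new page, although only two distinct pages exist once the id is stripped.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed name of the session id query parameter (illustrative).
SESSION_PARAM = "osCsid"


def strip_session_id(url, param=SESSION_PARAM):
    """Return the URL with the session id parameter removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def count_distinct_pages(urls):
    """Count how many real pages are behind a list of crawled URLs."""
    return len({strip_session_id(url) for url in urls})


# Hypothetical URLs as a spider might record them on three visits.
crawled = [
    "http://www.mydomain.com/catalog/index.php?cPath=1&osCsid=aaa111",
    "http://www.mydomain.com/catalog/index.php?cPath=1&osCsid=bbb222",
    "http://www.mydomain.com/catalog/index.php?cPath=2&osCsid=ccc333",
]

print(len(set(crawled)))              # URLs the spider believes are different
print(count_distinct_pages(crawled))  # pages that actually differ
```

Running the same check over the request URLs in a server log would give a rough idea of how much of the bandwidth goes to repeat fetches of the same pages.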

If no one has said it already, welcome to WebmasterWorld.

montenegro




msg:17059
 10:52 am on Nov 2, 2003 (gmt 0)

Thank you, BlueSky. My site consists of about 50 static pages, and they all link to an osCommerce shopping catalog (www.mydomain.com/catalog/index.php). This catalog contains approx. 100 pages, and yes, I believe session ids are used in the catalog urls. What would be the best solution for me:
1. To ban Slurp temporarily from visiting www.mydomain.com
2. To ban Slurp permanently from visiting www.mydomain.com/catalog/index.php
3. Something else
Another question: if I ban Slurp, will it come back?

BlueSky




msg:17060
 1:30 pm on Nov 2, 2003 (gmt 0)

You just need to get him out of your catalog temporarily so he stops eating up bandwidth going around in circles. Once you get him to stop, you can let him back in after you get rid of those SIDs. If you cannot turn them off for bots in the control panel and no one here knows how to do it, then do a search or ask for instructions over at osCommerce's site. It should be a pretty simple change to make the script check for known bots and serve them pages without any SIDs.
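The idea of checking for known bots and serving them links without SIDs can be sketched roughly as follows. This is Python for illustration only, not osCommerce's actual code (osCommerce itself is written in PHP). The token list, the `osCsid` parameter name, the function names, and the sample user-agent strings are all assumptions.

```python
# Substrings assumed to identify crawler user agents (illustrative list).
KNOWN_BOT_TOKENS = ("slurp", "googlebot", "msnbot")


def is_known_bot(user_agent):
    """Return True if the user-agent string contains a known bot token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)


def catalog_link(path, session_id, user_agent):
    """Build a catalog link, leaving the session id off for known bots."""
    if is_known_bot(user_agent) or not session_id:
        return path
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}osCsid={session_id}"


# A crawler gets a stable URL; a regular browser keeps its session id.
print(catalog_link("/catalog/index.php?cPath=1", "abc123",
                   "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com)"))
print(catalog_link("/catalog/index.php?cPath=1", "abc123",
                   "Mozilla/4.0 (compatible; MSIE 6.0)"))
```

Matching on user-agent substrings is simple but depends on keeping the token list current, so treat it as a starting point rather than a complete bot detector.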

Not sure how often Slurp re-fetches robots.txt, but you can try putting this in that file:

User-agent: *
Disallow: /catalog/

If you have other disallowed directories/files, just add them to this. Might as well keep all bots out of there until you turn off the SIDs. Let me know if he stops with that. If not, what kind of server is your site on -- Apache or something else?
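Before waiting for the spider to pick up the new file, the two robots.txt lines above can be checked locally with Python's standard `urllib.robotparser`. This is only a sketch: the test URLs are hypothetical, and it shows how a standards-following parser reads the rules, not how Slurp itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one inside the catalog, one static page outside it.
print(is_allowed("Slurp", "http://www.mydomain.com/catalog/index.php"))
print(is_allowed("Slurp", "http://www.mydomain.com/index.html"))
```

The catalog URL should come back as disallowed and the static page as allowed, which confirms the rule keeps bots out of /catalog/ without blocking the rest of the site.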


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved