
Sitemaps, Meta Data, and robots.txt Forum

    
Sub-domain & crawl-delay
foxfox




msg:3760567
 3:14 pm on Oct 7, 2008 (gmt 0)

Yahoo! Slurp respects the crawl-delay directive well, so it does not overload my server.

However, I found that crawl-delay seems to be applied per domain. If I have a site with 50K subdomains hosted on a single server, crawl-delay = 1 is useless: it seems they could fetch up to 50K requests per second.
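To put a number on that concern, here is a minimal sketch of the worst case, assuming the crawler applies the delay to each subdomain independently and crawls all of them at once. The function name is made up for illustration, and real crawlers may well throttle further.

```python
def worst_case_requests_per_second(num_hosts: int, crawl_delay_s: float) -> float:
    """Upper bound on the aggregate request rate hitting one server.

    Assumes (hypothetically) that every host is crawled at the same time,
    each at exactly one request per crawl_delay_s seconds.
    """
    if crawl_delay_s <= 0:
        raise ValueError("crawl_delay_s must be positive")
    return num_hosts / crawl_delay_s


# 50K subdomains, each with crawl-delay = 1 second
print(worst_case_requests_per_second(50_000, 1))
```

This is only an upper bound. It says nothing about how Slurp actually schedules its fetches.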

I believe they know that the 50K subdomains are all on a single IP.

Is it possible to limit the crawl rate by server IP rather than by domain or subdomain?

How do you solve it if you have many subdomains?

 

jdMorgan




msg:3760614
 4:39 pm on Oct 7, 2008 (gmt 0)

Since robots.txt is a 'per-(sub)domain' file, each is treated separately at one level -- The per-site URL-allow/disallow processing. But you're right, they should have a back-end 'association' process that limits the rate per IP address/hardware server.
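As a rough illustration of the 'per-(sub)domain' point, the sketch below reads a crawl-delay out of one host's robots.txt using Python's standard `urllib.robotparser`. The robots.txt content and the helper name `crawl_delay_for` are assumptions for the example. Nothing in the file can refer to an IP address or to other hosts, so the value only ever describes the one host that served it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt as it might be served by ONE subdomain.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 1
Disallow:
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


print(crawl_delay_for(ROBOTS_TXT, "Slurp"))     # delay declared for Slurp
print(crawl_delay_for(ROBOTS_TXT, "OtherBot"))  # no matching record
```

A crawler would have to repeat this lookup for every subdomain and then combine the results itself, which is the back-end 'association' step described above.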

It may be that it takes some time to associate all the domains and subdomains. Has any change taken place recently, such as a new IP address, or more subdomains added to your 'collection' on the single IP address?

Ultimately, the decision of how many (sub)domains to host on a server should take crawling into account. Fifty thousand is a very high number -- about 125 times higher than a 'normal' shared hosting maximum for medium- to low-traffic sites. So, you might consider setting each crawl-delay to at least 120 if you really intend to host that many sites on one server.
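The arithmetic behind that suggestion can be sketched as follows. The baseline of 400 sites is inferred from the "125 times" figure (50,000 / 125), and the baseline delay of 1 second is an assumption. The function names are illustrative only.

```python
def delay_to_match_baseline(num_hosts: int, baseline_hosts: int,
                            baseline_delay_s: float = 1.0) -> float:
    """Crawl-delay that keeps the aggregate crawl rate equal to a baseline server.

    Assumes each host is crawled independently at one request per delay.
    """
    return baseline_delay_s * num_hosts / baseline_hosts


def aggregate_rate(num_hosts: int, crawl_delay_s: float) -> float:
    """Worst-case requests per second across all hosts on the server."""
    return num_hosts / crawl_delay_s


# 50,000 hosts versus an assumed 'normal' maximum of 400 hosts per server
print(delay_to_match_baseline(50_000, 400))
# Worst-case aggregate rate with the suggested crawl-delay of 120
print(round(aggregate_rate(50_000, 120), 2))
```

Under these assumptions the scaled delay comes out close to the suggested 120 seconds, and even then the worst case is still several hundred requests per second across the server.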

Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved