
Forum Moderators: goodroi


Sub-domain & crawl-delay

     
3:14 pm on Oct 7, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 2, 2006
posts: 128
votes: 0


Yahoo! Slurp respects the crawl-delay directive well, so it does not overload my server.

However, I found that crawl-delay seems to be applied per domain, so if I have a site with 50K subdomains hosted on a single server, crawl-delay: 1 is useless; between them, Slurp could still fetch 50K requests per second.
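For example, every subdomain serves an identical file along these lines (a simplified sketch; the hostnames are made up), and Slurp appears to apply the delay to each hostname separately:

User-agent: Slurp
Crawl-delay: 1

# Slurp fetches sub00001.example.com/robots.txt, sub00002.example.com/robots.txt,
# and so on, and honours the 1-second delay for each hostname on its own.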

I believe they know that all 50K subdomains are hosted on a single IP.

I want to ask: is it possible to limit the crawl rate by server IP rather than by domain/subdomain?

How do you solve this if you have many subdomains?

4:39 pm on Oct 7, 2008 (gmt 0)

Senior Member

jdmorgan: WebmasterWorld Top Contributor of All Time, 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Since robots.txt is a 'per-(sub)domain' file, each one is treated separately at one level -- the per-site URL allow/disallow processing. But you're right, they should have a back-end 'association' process that limits the crawl rate per IP address/hardware server.

It may be that it takes some time to associate all the domains and subdomains. Has any change taken place recently, such as a new IP address, or more subdomains added to your 'collection' on the single IP address?

Ultimately, the decision of how many (sub)domains to host on a server should take crawling into account. Fifty thousand is a very high number -- about 125 times higher than a 'normal' shared hosting maximum for medium- to low-traffic sites. So, you might consider setting each crawl-delay to at least 120 if you really intend to host that many sites on one server.
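If you do raise it, one way to avoid maintaining 50K separate copies is to serve a single shared robots.txt from every virtual host. A rough sketch, assuming Apache 2.2 with a wildcard ServerAlias (the paths and hostnames are only examples):

<VirtualHost *:80>
    ServerName example.com
    ServerAlias *.example.com
    DocumentRoot /var/www/sites

    # Every (sub)domain answers /robots.txt with the same shared file,
    # so the raised Crawl-delay only has to be edited in one place.
    Alias /robots.txt /var/www/shared/robots.txt
    <Directory /var/www/shared>
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>

The shared file itself would just carry "User-agent: Slurp" and "Crawl-delay: 120" (or whatever rate your server can actually sustain).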

Jim
