Forum Moderators: goodroi
That's new. Obviously they're trying to reduce server load. Now I have to either email them or download and extract the RDF ... sigh.
To be added to our list of allowed robots, please email the staff programmer
Do you think they'd let a small, new niche search engine crawl a small category as a starting point?
Absolutely, hutcheson. But I really don't think they'd bother changing their robots.txt just for me. It's not a search engine yet; I'm just starting to build it, and the domain was registered this morning ... ;)
I'd have thought that most webmasters here would welcome the fact that dmoz appears to have put in place measures to ensure that submissions and public page updates work properly again.
I'm one of those webmasters who are happy with that fact. I perfectly accept and understand what they do. I wasn't moaning. ;)
Or grab an RDF dump and use it for data. Rumor has it that's what Google did.
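For anyone going the RDF route, here is a rough sketch of pulling one category's listings out of a dump without touching the live site. The sample data, the element names (`ExternalPage`, `Title`, `topic`), the `about` attribute, and the helper `iter_listings` are all assumptions made for illustration; check them against the actual dump before relying on any of it.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a directory dump. The layout is an assumption,
# not a copy of the real file.
SAMPLE_RDF = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF>
  <ExternalPage about="http://example.org/">
    <Title>Example Site</Title>
    <topic>Top/Computers/Example</topic>
  </ExternalPage>
  <ExternalPage about="http://example.net/">
    <Title>Another Site</Title>
    <topic>Top/Arts/Example</topic>
  </ExternalPage>
</RDF>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def iter_listings(stream, category_prefix):
    """Yield (url, title) for listings whose topic starts with category_prefix.

    iterparse streams the file, so a large dump is not loaded into memory
    all at once.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "ExternalPage":
            continue
        fields = {local_name(child.tag): (child.text or "").strip() for child in elem}
        if fields.get("topic", "").startswith(category_prefix):
            yield elem.get("about"), fields.get("Title", "")
        elem.clear()  # release the children of the element we just handled


if __name__ == "__main__":
    for url, title in iter_listings(io.BytesIO(SAMPLE_RDF), "Top/Computers"):
        print(url, title)
```

With a real dump you would pass an open file instead of the `io.BytesIO` sample, and the site itself never sees a single request.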
It has nothing to do with "big boys" or "little boys". AOL was a "big boy" when they started using the ODP. Google was NOT. We were glad to see both of them.
It has to do with "bots that are known to the staff to be legitimate and well-behaved." Fast either hasn't asked, or was observed to be misbehaving.
And, by the way, the cretins who were selling scripts so all their littermates could spider the ODP EVERY DAY -- are the REAL jerks who REALLY got the legitimate "little guys" shut out. For five years, the ODP was completely open to spiders. So -- put your abuse and contempt where it so deservedly belongs.
I'm not staff, and can't read their minds, but it's obvious that their concern is server load, and their concrete questions will be along the lines of:
-- How often is the spider going to run?
-- How hard will it pound the server?
-- Is its purpose to do something antisocial and sleazy, like collecting e-mail addresses or expired domains for resale to bottom-feeding spammers?
-- Who else will ever use the spider?
-- Do you really need to hit dmoz.org, or will a mirror or RDF be good enough?
If Fast cares, they could undoubtedly get official permission to spider. They are a known SE, with a legitimate need, and they could surely make their bot behave.
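To make "well-behaved" a bit more concrete, here is a minimal sketch of a bot that checks robots.txt before fetching and spaces out its requests. The robots.txt contents, the bot names (`ApprovedBot`, `NicheBot`), the `PoliteFetcher` class, and the 5-second delay are all hypothetical; this is not the ODP's actual file or policy, just the general shape of the idea.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one approved bot is allowed, everyone else is not.
ROBOTS_TXT = """\
User-agent: ApprovedBot
Disallow:

User-agent: *
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteFetcher:
    """Checks permission and enforces a minimum gap between requests."""

    def __init__(self, parser, user_agent, min_delay_seconds=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.parser = parser
        self.user_agent = user_agent
        self.min_delay = min_delay_seconds  # arbitrary value, an assumption
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous request, if any

    def allowed(self, url):
        """True if robots.txt lets this user agent fetch the URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait_turn(self):
        """Sleep until min_delay has passed since the last request.

        Returns the number of seconds slept.
        """
        waited = 0.0
        if self._last is not None:
            remaining = self.min_delay - (self.clock() - self._last)
            if remaining > 0:
                self.sleep(remaining)
                waited = remaining
        self._last = self.clock()
        return waited


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for name in ("ApprovedBot", "NicheBot"):
        bot = PoliteFetcher(parser, name)
        print(name, bot.allowed("http://example.org/Computers/"))
```

A real crawler would call `allowed()` and then `wait_turn()` before every request, and stop outright when `allowed()` says no.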