I originally allowed scans by freedir.co.uk, although it's owned by a South African company, OrderWeb Software, despite being a UK domain. It recently got caught by one of my own blocks, and on looking for a reason I discovered it was on an IP adjacent to tags2dir, which I'd first blocked as a single IP but more recently blocked as a whole server farm: 206.196.96.0 - 206.196.127.255 (InLink Communications Company).
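For anyone who prefers to hold that range as CIDR rather than as a start/end pair, here is a minimal sketch using Python's standard ipaddress module. The helper name is_blocked is my own invention for illustration, not part of any firewall or server config.

```python
import ipaddress

# Range quoted in the post above (InLink Communications Company).
first = ipaddress.IPv4Address("206.196.96.0")
last = ipaddress.IPv4Address("206.196.127.255")

# summarize_address_range yields the smallest set of CIDR networks
# that exactly covers the start/end pair.
blocked_networks = list(ipaddress.summarize_address_range(first, last))


def is_blocked(ip: str) -> bool:
    """Hypothetical helper: True if ip falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocked_networks)


print(blocked_networks)
```

The whole range collapses to a single /19, which is usually easier to maintain in a deny list than a start/end pair.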
Freedir comes back every day or so looking for sites, working on the general pattern...
example.com
www.example.com
It tries for 16 domains (default page) on the same server, making a total of 32 hits in as many seconds.
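A rough sketch of how that pattern could be spotted automatically, assuming you can get your hits into (unix_time, client_ip, host, path) tuples. That tuple layout, the function names and the thresholds are all my own assumptions; a standard access log does not necessarily record the requested host.

```python
from collections import defaultdict


def bare_domain(host: str) -> str:
    """Strip a leading 'www.' so both variants map to the same key."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def paired_domain_count(entries):
    """Count domains for which both the bare and the www variant were hit."""
    variants = defaultdict(set)
    for _, host in entries:
        variants[bare_domain(host)].add(host)
    return sum(1 for hosts in variants.values() if len(hosts) >= 2)


def paired_probers(hits, min_domains=16, window_seconds=60):
    """Flag clients that request the default page of many domains, in both
    bare and www form, within a short window.

    hits: iterable of (unix_time, client_ip, host, path) tuples (assumed layout).
    Returns {client_ip: best paired-domain count seen in any window}.
    """
    by_client = defaultdict(list)
    for ts, ip, host, path in hits:
        if path == "/":  # only default-page requests are of interest
            by_client[ip].append((ts, host.lower()))

    flagged = {}
    for ip, entries in by_client.items():
        entries.sort()
        best = 0
        for i, (start, _) in enumerate(entries):
            # All default-page hits within window_seconds of this one.
            window = [e for e in entries[i:] if e[0] - start <= window_seconds]
            best = max(best, paired_domain_count(window))
        if best >= min_domains:
            flagged[ip] = best
    return flagged
```

The defaults mirror the numbers above (16 domains inside a minute); loosen them if you host fewer domains on the one server.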
If that's all it's going to do I'm not worried providing it is a useful service - in fact I'd originally passed the crawler as ok.
Looking at the site now my feeling is that it is no longer a desirable one (if it ever was). There is no obvious ToC - in fact nothing at all apart from lists of URLs. What its reason for living is I can't determine but I wonder if it's corporate scraping and analysis.
I can no longer recall why I blocked Tags2dir but an entry in this forum suggests it was scraping and that was probably my own experience. It belongs to the same company as freedir - OrderWeb Software of South Africa.
freedir.co.uk and tags2dir.com are within half a dozen IPs of each other at 206.196.111.nnn (InLink Communications).
Does anyone have anything to add to this that will either endorse or contradict my conclusions?
In April 2007, a colo with the IP 206.196.111.201, under the umbrella of metatagsdir.
There are hordes of these directories and I don't see any actual benefit in allowing any of them.
They simply detract from the SERPs for actual web pages.
Of course, I feel the same way about Wiki and all its sub-orgs that let users create or duplicate pages, keeping traffic inside the Wiki structure even when an actual organizational website already existed.
It simply diverts SERPs and visitors away from their intended destination.
Don
I've had that range blocked for a while now - can't recall the date but for a few months. As I said, I originally allowed freedir through but I think it might have come from another IP range then. Can't be sure without checking back several months' logs.
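For checking back through old logs, something along these lines would do, assuming Apache-style Combined Log Format and assuming the crawler identifies itself in the user-agent string. I have not confirmed what freedir actually sends, so the search string is a guess you would need to adjust.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def ips_for_agent(lines, needle):
    """Count hits per client IP whose user-agent contains needle
    (case-insensitive). Lines that do not parse are skipped."""
    counts = Counter()
    needle = needle.lower()
    for line in lines:
        m = LINE_RE.match(line)
        if m and needle in m.group("agent").lower():
            counts[m.group("ip")] += 1
    return counts
```

Feed it the lines of each archived log in turn, for example `ips_for_agent(open("access.log"), "freedir")` with a hypothetical file name, and compare the IPs that turn up against the blocked range.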
The tags2dir web site seems turned off now, by the way. Not sure if tag2dir was a part of this - I began my posting by including it but I couldn't find any real evidence about it.
I wonder if OrderWeb Software may be scraping the net, changing the directory's name/domain occasionally and then dumping it for some reason. I can't imagine why, as very few people would block bots as a matter of course - we in this forum are amongst a very few! :(
The serious online businesses may care about this stuff, but joe schmoe with his little hobby site, blog or photo gallery really doesn't care as it's just a hobby to them and not worth the effort.
Even if they knew where the log files were, most wouldn't know what they were looking at anyway.
Even if they did know what they were looking at, most wouldn't have the skills or tools to stop the problem.
Even if they knew what they were looking at and had the skill and tools to solve many of the problems, they probably wouldn't keep an eye on how the things they blocked were being used and re-purposed over time and could, in the long term, cause more harm than good.
For the most part, it's probably best the majority are blissfully ignorant of the situation and ignore it for their own safety.
[edited by: incrediBILL at 8:09 am (utc) on Feb. 15, 2009]
Each webmaster has different goals for their content and must decide on their own what bots, regions and providers are either beneficial or detrimental to their own site(s).
You may begin with these threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]