Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

What purpose does it serve?

 9:33 pm on Oct 18, 2008 (gmt 0)

DomainCrawler/1.0 (info@domaincrawler.com; [domaincrawler.com...]

The above UA, where example.com replaces my actual domain names, has been visiting my sites once a week for the last week or so. Does anyone know what DomainCrawler.com is all about? Is anyone blocking them from crawling their site(s)? Thanks.



 6:20 am on Oct 19, 2008 (gmt 0)

Are your domains expiring soon? They visited one of my domains today, but there's no content on it; it just redirects to another site of mine. This domain does expire next month, though...


 7:15 am on Oct 19, 2008 (gmt 0)

Blocked here.



 7:45 pm on Oct 19, 2008 (gmt 0)

Last night this bot crawled about 30 domains, only a few of which expire soon. Based on that I added it to my banned list. Thanks, guys.


 5:07 pm on Oct 20, 2008 (gmt 0)

It seems to be scraping/caching whois data as well.



 10:27 pm on Nov 5, 2008 (gmt 0)

Yes, there are more and more of those scraping/caching whois sites. It's getting on my nerves.


 11:18 pm on Nov 5, 2008 (gmt 0)

Let's make a list of all those #*$!rapers.

Right now I'm blocking:

order allow,deny
deny from
deny from 66.249.16.*
deny from 66.249.17.*
deny from 64.246.165.*
deny from
deny from
deny from
deny from
deny from
deny from
deny from
allow from all

But maybe we should note a name next to each one, too. Also, is there an option to block these across the whole server, instead of changing .htaccess on each domain?


 11:49 pm on Nov 5, 2008 (gmt 0)

One of the things I like about IIS is the ability to block an IP Address or range across all domains. Surely there's a way to do that with Apache.
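For Apache there is: Deny rules placed in the server-level config apply to every vhost, unlike per-domain .htaccess files. A minimal sketch (the path and IP ranges below are examples, not anyone's actual config from this thread), using the Apache 2.2 Order/Allow/Deny syntax shown elsewhere in this discussion:

```apache
# In httpd.conf, or a separate file pulled in with Include.
# A <Directory> block covering the parent of all docroots applies
# the denies server-wide. IP ranges here are placeholders.
<Directory /var/www>
    Order Allow,Deny
    Allow from all
    Deny from 66.249.16.0/24
    Deny from 64.246.165.0/24
</Directory>
```

After editing the server config you need to restart or reload Apache for the change to take effect, which is the main operational difference from .htaccess edits.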


 11:55 pm on Nov 5, 2008 (gmt 0)

Yes, I thought so. I think I did that some time ago, but I'm no server guy. I used some free software that showed my server like Windows folders, but I can't remember the software or which folder I changed.

Also, let's keep a list of bad #*$!raper IPs.

I do have Apache.


 1:29 am on Nov 6, 2008 (gmt 0)

I'd suggest something like this.

I removed IP address range duplicates and overlaps.

I also removed the <Limit> container, since using <Limit> as shown would have allowed other methods, such as PUT and DELETE -- without any restrictions.

I show a safer, friendlier Allow/Deny construct:

All IP addresses can access robots.txt (it is only fair to warn them that they are Disallowed, and many simple-minded robots will treat any error fetching robots.txt as permission to spider the entire site).

All IP addresses are also allowed to fetch the custom 403 error page, so that a 403 error when an IP address is blocked does not result in another 403 as the client attempts to fetch the custom 403 error document, and another 403 because that fails, and another, and another... (This avoids a looping self-inflicted Denial-of-Service attack.)

I also show an example of the syntax for blocking the "example.com" domain.

# Set EnVar to allow all IP addresses to access robots.txt & custom 403 error page
SetEnvIf Request_URI ^/robots\.txt$ allowit
SetEnvIf Request_URI ^/path-to-your-custom-403-error-page\.html$ allowit
# Configure so that Allow is the default state, and Allows can override Denys
Order Deny,Allow
# Not sure who this denies
Deny from
# Name-Intelligence, Whois.ess-cee, Do-main-Tools at Compass
Deny from
# More from Compass
Deny from
# Name-Intelligence at Spry Hosting
Deny from
# Domaincrawler at Chrystone AB
Deny from
# More Spry Hosting
Deny from
# Name-Intelligence at Compass
Deny from
# An example of hostname blocking
Deny from example.com
Allow from env=allowit

Note that adding this "Deny from example.com" directive will put the server into "reverse-DNS lookup mode." Unless your server logging is configured properly, you may find that your server now logs hostnames instead of IP addresses, which can be quite annoying. The only way to fix that is either to change the logging format (see mod_log_config) or to remove all Allows and Denys that use the "Deny from hostname" notation.
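If you keep hostname-based denies but want the access log to record IP addresses anyway, mod_log_config can do that: %h logs the hostname when hostname lookups are triggered, while %a always logs the client's IP address. A sketch, assuming the stock "combined" format is in use:

```apache
# Same as the standard "combined" format, but with %a (client IP)
# in place of %h (hostname when lookups are active).
LogFormat "%a %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_ip
CustomLog logs/access_log combined_ip
```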



 9:30 pm on Nov 26, 2008 (gmt 0)

Just got a reply to my email. I told them to stop putting all that whois content on their site just to make money; they replied with "OK, but we are allowed to do that."

Here's a small snippet of GoDaddy's privacy policy:
"We will share your information in order to comply with ICANN's rules, regulations and policies."

ICANN's policy says that all registrant information must be made public. If you don't want it to be public, you can get around that by purchasing "Whois Privacy Protection" from GoDaddy.

And on their own domain, they also have a privacy shield.
