
DomainCrawler

What purpose does it serve?


GaryK

9:33 pm on Oct 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



DomainCrawler/1.0 (info@domaincrawler.com; [domaincrawler.com...]

The above UA, where example.com replaces my actual domain names, has been visiting my sites once a week for the last week or so. Does anyone know what DomainCrawler.com is all about? Is anyone blocking them from crawling their site(s)? Thanks.

markbiz

6:20 am on Oct 19, 2008 (gmt 0)

10+ Year Member



Are your domains expiring soon? They visited one of my domains today, but there's no content on it; it just redirects to another site of mine. And this domain expires next month...

jmccormac

7:15 am on Oct 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Blocked here.

Regards...jmcc

GaryK

7:45 pm on Oct 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Last night this bot crawled about 30 domains, only a few of which expire soon. Based on that I added it to my banned list. Thanks, guys.
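
For Apache users wanting to do the same, here's a minimal .htaccess sketch using mod_setenvif to match the bot by User-Agent. The "DomainCrawler" substring is an assumption based on the UA quoted at the top of the thread; adjust it if they change names:

# Tag requests whose User-Agent contains "DomainCrawler" (assumed substring)
SetEnvIfNoCase User-Agent "DomainCrawler" bad_bot
# Allow everyone, then deny anything tagged as the bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot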

jmccormac

5:07 pm on Oct 20, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It seems to be scraping/caching whois data as well.

Regards...jmcc

zeus

10:27 pm on Nov 5, 2008 (gmt 0)

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yes, there are more and more of these whois scraping/caching sites; it's getting on my nerves.

zeus

11:18 pm on Nov 5, 2008 (gmt 0)

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Let's make a list of all those #*$!rapers.

Right now I'm blocking:

<Limit GET HEAD POST>
order allow,deny
deny from 216.145.16.0/24
deny from 66.249.16.*
deny from 66.249.17.*
deny from 64.246.165.*
deny from 64.79.192.0/19
deny from 64.246.165.128/25
deny from 66.249.0.0/19
deny from 209.59.192.0/19
deny from 216.145.16.0/24
deny from 83.168.240.5
deny from 66.249.0.0/19
allow from all
</Limit>

But maybe we should put a name next to each one too. And is there an option to block these on the whole server, instead of changing .htaccess on each domain?

GaryK

11:49 pm on Nov 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One of the things I like about IIS is the ability to block an IP address or range across all domains. Surely there's a way to do that with Apache.
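
There is: with Apache the Deny rules can live in httpd.conf rather than in each site's .htaccess. A rough sketch, assuming all of the sites are served from under /var/www (substitute your own document roots) and reusing a couple of the IPs from the list above:

# In httpd.conf -- covers every site under this tree,
# so no per-domain .htaccess editing is needed
<Directory /var/www>
# With Order Deny,Allow, anything not matching a Deny is allowed by default
Order Deny,Allow
Deny from 83.168.240.5
Deny from 66.249.0.0/19
</Directory>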

zeus

11:55 pm on Nov 5, 2008 (gmt 0)

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yes, I thought so. I think I have done that some time ago, but I'm no server dude. I used some free software that showed my server like Windows folders, but I can't remember the software or which folder I made the changes in.

Also, let's have a list of bad #*$!raper IPs.

I do have Apache.

jdMorgan

1:29 am on Nov 6, 2008 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'd suggest something like this.

I removed IP address range duplicates and overlaps.

I also removed the <Limit> container, since using <Limit> as shown would have allowed other methods, such as PUT and DELETE, without any restrictions.

I show a safer, friendlier Allow/Deny construct:

All IP addresses can access robots.txt (it is only fair to warn them that they are Disallowed, and many simple-minded robots will treat any error fetching robots.txt as permission to spider the entire site).

All IP addresses are also allowed to fetch the custom 403 error page, so that a 403 error when an IP address is blocked does not result in another 403 as the client attempts to fetch the custom 403 error document, and another 403 because that fails, and another, and another... (This avoids a looping self-inflicted Denial-of-Service attack.)

I also show an example of the syntax for blocking the "example.com" domain.


# Set EnVar to allow all IP addresses to access robots.txt & custom 403 error page
SetEnvIf Request_URI ^/robots\.txt$ allowit
SetEnvIf Request_URI ^/path-to-your-custom-403-error-page\.html$ allowit
#
# Configure so that Allow is the default state, and Allows can override Denys
Order Deny,Allow
#
# Not sure who this denies
Deny from 64.79.192.0/19
# Name-Intelligence, Whois.ess-cee, Do-main-Tools at Compass
Deny from 64.246.165.0/24
# Name-Intelligence at Spry Hosting
Deny from 66.249.0.0/19
# Domaincrawler at Chrystone AB
Deny from 83.168.240.5
# More Spry Hosting
Deny from 209.59.192.0/19
# Name-Intelligence at Compass
Deny from 216.145.16.0/24
#
# An example of hostname blocking
Deny from example.com
#
Allow from env=allowit
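
For the custom 403 page referenced above to actually be served, it also needs to be declared with Apache's ErrorDocument directive; the filename here is just the placeholder used in the example:

ErrorDocument 403 /path-to-your-custom-403-error-page.html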

Note that adding this "Deny from example.com" directive will put the server into "reverse-DNS lookup mode." Unless your server logging is configured properly, you may find that your server now logs hostnames instead of IP addresses, which can be quite annoying. The only way to fix that is either to change the logging format (see mod_log_config) or to remove all Allows and Denys that use the "Deny from hostname" notation.
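
As a sketch of that logging fix: mod_log_config's %a token always logs the client's IP address, whereas %h logs the remote host, which turns into a hostname once the hostname-based directives trigger lookups. The format below is just the Common Log Format with that one substitution:

LogFormat "%a %l %u %t \"%r\" %>s %b" common_ip
CustomLog logs/access_log common_ip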

Jim

zeus

9:30 pm on Nov 26, 2008 (gmt 0)

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Just got a reply to my email. I told them to stop placing all the whois content on their site just to make money; they replied with: OK, but we are allowed to do that.

Here's a small snippet of GoDaddy's privacy policy:
"We will share your information in order to comply with ICANN's rules, regulations and policies."

ICANN's rules say that all whois information must be made public. If you don't want it to be public, it's possible to get around that by purchasing "Whois Privacy Protection" from GoDaddy.

And their own domain? They also have a privacy shield on it.