toCrawl/UrlDispatcher

another one

         

sanuk

6:10 am on Jun 12, 2003 (gmt 0)

10+ Year Member



Hi,

First, thanks for the help I got from this forum yesterday.

This morning I checked the logs again and found this one:
toCrawl/UrlDispatcher
from IP: 195.6.220.167

I did a search here but nothing came up.
Does anyone know what this is?

Regards,
Sanuk

sanuk

2:27 pm on Jun 12, 2003 (gmt 0)

10+ Year Member



Hi,

I have disallowed this robot for now,
because it is eating too many pages, one after the other.
But I still don't know what it is.

Did I do the wrong thing, and what is it?

Regards,
Sanuk

wilderness

3:03 pm on Jun 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sanuk,
If a bot's UA does not provide a description, or a URL where you can read about its intended use of your material/content, then IMO you, I, or any other webmaster has no alternative except to deny access.

Please note, in regard to a parallel thread, that this also applies to bots which present themselves with normal browser UAs.
This thread is a good example:
[webmasterworld.com...]

Don

sanuk

3:22 pm on Jun 12, 2003 (gmt 0)

10+ Year Member



hi,

Thanks for your reply, Wilderness.
I will keep it disallowed.

Best Regards,
Sanuk

sanuk

3:55 pm on Jun 12, 2003 (gmt 0)

10+ Year Member



Hi,

I found some more information by searching Google.
Maybe it can help someone here at the forum.
It doesn't give me a clue, as coltfrance.com does not have a website.

Unknown robots from 195.68.98.xx (coltfrance.com)
toCrawl/UrlDispatcher
Textilus/toCrawl/UrlDispatcher
CEIS/toCrawl/UrlDispatcher

Best Regards everyone,
Sanuk

sanuk

8:34 am on Jun 13, 2003 (gmt 0)

10+ Year Member



hi,

This robot is not giving up!
Since I put it in my .htaccess file 36 hours ago:

RewriteEngine on
....more lines here...
RewriteCond %{HTTP_USER_AGENT} ^toCrawl [OR]
....more lines here...
RewriteRule /*$ [site-you-are-sending-the-bot-to.com...] [L,R]

it still keeps coming back every half minute, for 2 days now, even trying to access files that are in my CGI-BIN.

This is very good proof that it does not obey robots.txt, as I have a disallow for the CGI-BIN for all agents in my robots.txt.

Of course, now it gets a 302 redirect response every time and uses about 300 bytes for this, but I still get a line in my log file from this one every 30 seconds.

What more can I do? It's there every 20-30 seconds!

Are you sure no one knows or has heard about this one before?

Regards,
Sanuk

wilderness

11:12 am on Jun 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sanuk

If you change the last line of your rewrite to the following:

RewriteRule ^.*$ - [F]

Eliminating that redirect page in the process, the resulting KB use will be minimal: the bot gets only a short 403 Forbidden response and has nothing to follow.
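
Putting the pieces from this thread together, a minimal sketch of the whole block might look like the following. It assumes mod_rewrite is enabled and that no other conditions are chained in front of it. Dropping the ^ anchor from the condition is a suggested change, not part of the original rule: an anchored ^toCrawl would probably miss the Textilus/toCrawl/UrlDispatcher and CEIS/toCrawl/UrlDispatcher variants reported earlier in the thread.

```apache
RewriteEngine On
# Match "toCrawl" anywhere in the User-Agent, case-insensitively.
# (Assumption: unanchored so the "Textilus/toCrawl/..." and
# "CEIS/toCrawl/..." variants are caught as well.)
RewriteCond %{HTTP_USER_AGENT} toCrawl [NC]
# "-" means no substitution; [F] answers 403 Forbidden and stops rewriting.
RewriteRule ^.*$ - [F]
```

The bot will still appear in the access log, only with a 403 status. A rule in .htaccess can refuse the requests but cannot stop them from arriving.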

sanuk

11:55 am on Jun 13, 2003 (gmt 0)

10+ Year Member



Hi,

Thanks for the answer, Wilderness.
I will do as you say.

Regards,
Sanuk