Forum Moderators: DixonJones

Who can ID these 2 crawlers?

Looking for the sources of 2 spiders so I can filter them


JayCee

11:50 pm on Feb 18, 2002 (gmt 0)

10+ Year Member



Hi Gang!

My first post at WebMasterWorld.
Have already learned a great deal in the couple of hours I've browsed here.
Many thanks :)

I want to filter out "alexa" and "crawler.de", but don't know any more than just those names.

Whose agents are they?
What are their official IDs?

Thanks!

Air

2:46 am on Feb 19, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Alexa's crawler carries a User Agent of ia_archiver.

As for crawler.de, the best I can do is say that it is related to abacho.co.uk (for the English version), but I don't know the User Agent for it.

Rugles

9:49 pm on Feb 20, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ban Alexa if you can.

They will use up a huge amount of bandwidth every month.

SE_Enigma

10:02 pm on Feb 20, 2002 (gmt 0)



Can you place tags within your HTML that will prevent bots like Alexa from indexing your page, and also prevent them from chewing up your bandwidth?

Thanks,
SE_Enigma

JayCee

11:00 pm on Feb 20, 2002 (gmt 0)

10+ Year Member



Thanks for the info, Air!

Rugles, other than the bandwidth hogging, what is Alexa actually doing that might be valuable to me?

Are they the web archiver folk? Or an SE that I might want to be listed with?

Anyone know if they respect the ROBOTS.TXT file?

Son_House

3:33 am on Feb 21, 2002 (gmt 0)

10+ Year Member



>Are they the web archiver folk?

Yes

>Anyone know if they respect the ROBOTS.TXT file?

It took them about a month but they finally stopped requesting pages from our site.
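
For anyone who wants to check what a robots.txt rule for Alexa would do before waiting a month, here is a minimal sketch in Python using the standard library's robots.txt parser. The robots.txt contents and the helper name `allowed` are assumptions for illustration, not our actual file; the only detail taken from this thread is the `ia_archiver` User Agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out only Alexa's crawler (ia_archiver).
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(allowed("ia_archiver", "/index.html"))
    print(allowed("SomeOtherBot", "/index.html"))
```

This only shows how a well-behaved robot should read the file. Whether a given crawler honors it, and how quickly, is up to the crawler.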

HandwovenRug

6:18 pm on Feb 23, 2002 (gmt 0)

10+ Year Member



>I want to filter out "crawler.de"

The user-agent is: Crawler admin@crawler.de
or: Crawler V 0.2.x admin@crawler.de
(The x means some version number)
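
If you filter in a script rather than in the server config, a rough sketch in Python of matching these strings might look like the following. The function name `is_blocked_agent` and the pattern list are my own assumptions; the only details taken from this thread are the `ia_archiver` and `Crawler ... admin@crawler.de` User Agent strings.

```python
import re

# Patterns built from the User Agent strings quoted in this thread.
# The crawler.de pattern allows an optional version part such as "V 0.2.x"
# between "Crawler" and the contact address.
BLOCKED_AGENT_PATTERNS = [
    re.compile(r"ia_archiver", re.IGNORECASE),
    re.compile(r"^Crawler\b.*admin@crawler\.de", re.IGNORECASE),
]


def is_blocked_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header matches any blocked pattern."""
    return any(p.search(user_agent) for p in BLOCKED_AGENT_PATTERNS)


if __name__ == "__main__":
    for ua in ("Crawler admin@crawler.de", "ia_archiver", "Mozilla/4.0"):
        print(ua, is_blocked_agent(ua))
```

A robot can send any User Agent it likes, so treat this as a filter for the crawlers that identify themselves honestly.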

>crawler.de the best i can do is say that it is related to abacho.co.uk

I don't think so, Abacho has its own robot:
AbachoBOT

bird

6:31 pm on Feb 23, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>crawler.de the best i can do is say that it is related to abacho.co.uk

>I don't think so, Abacho has its own robot: AbachoBOT

It may be that they use both crawler IDs, but it's still the same organization. They operate from Germany, and both crawler.de and abacho.de point to exactly the same site.

HandwovenRug

8:13 pm on Feb 23, 2002 (gmt 0)

10+ Year Member



Bird:
OK, both engines are indeed the same.

JayCee

9:35 pm on Feb 23, 2002 (gmt 0)

10+ Year Member



Thanks guyz!

And you can pat yourselves on the back too, 'cause I've asked this same question in 2 other SEO forums and got no response.

Nor have I seen much about Google's new PPC on those other forums, though I consider it major news.

Guess I'll have to become a regular around here!