ia_archiver?

         

Kamikaze

3:37 pm on Oct 2, 2000 (gmt 0)

10+ Year Member



IP: 209.247.40.103
UA: ia_archiver

Grabbed my robots.txt file. Hit my site pretty aggressively.
Anyone have a clue who they are?

oilman

3:45 pm on Oct 2, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Looks like alexa.com's spider.

oilman

4:07 pm on Oct 2, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can find more info about their crawling methodology here:
[alexa.com...]

Ulrike

3:20 pm on Oct 22, 2000 (gmt 0)



That has answered my question. Today I looked at my raw log files for the first time and noticed that "ia_archiver" hit half of my content from '99.

Will this have impact on my site? Traffic etc.

Mike_Mackin

3:35 pm on Oct 22, 2000 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Will this have impact on my site? Traffic etc.

It could if the surfer knows what to do.

Netscape is loaded with alexa.com ability - see "What's related" at top right.
Test it on webmasterworld.com - interesting :)

Last I heard, Bill Gates' IE shipped it in its folder, BUT you had to install it.

Anyone know if the AOL browser uses it?

Brett_Tabke

4:45 pm on Oct 22, 2000 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



They are the only spider I've banned from all our sites without exception. Send the URL of your site to archive.org and they will remove you from the list.

Ulrike

5:57 pm on Oct 22, 2000 (gmt 0)



Thanks for the advice.

Ahem, what is "NG/1.0" ?

littleman

9:58 pm on Oct 22, 2000 (gmt 0)



Ulrike - Do you have an IP?

Ulrike

10:02 pm on Oct 22, 2000 (gmt 0)



194.214.109.89 - - [21/Oct/2000:03:25:07 +0000] "GET / HTTP/1.0" 200 6078 "-" "NG/1.0"
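For anyone who wants to pull the IP and user agent out of lines like the one above, here is a minimal Python sketch. It assumes the log is in the Apache "combined" format that this line appears to follow; the names `LOG_PATTERN`, `parse_line` and `hits_by_agent` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re

# Assumed layout: Apache "combined" log format, matching the sample line above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def hits_by_agent(lines):
    """Count requests per user-agent string, skipping unparseable lines."""
    counts = {}
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            agent = fields["agent"]
            counts[agent] = counts.get(agent, 0) + 1
    return counts

sample = ('194.214.109.89 - - [21/Oct/2000:03:25:07 +0000] '
          '"GET / HTTP/1.0" 200 6078 "-" "NG/1.0"')
fields = parse_line(sample)
print(fields["ip"], fields["agent"])
```

Feeding a whole log file through `hits_by_agent` gives a quick per-agent hit count, which makes an aggressive spider easy to spot.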

littleman

10:44 pm on Oct 22, 2000 (gmt 0)



I couldn't do a lookup on that IP, but I have some RIPE info for you [ripe.net].
The address looks like it belongs to www.renater.fr [renater.fr].

haroldlp

3:06 am on Dec 13, 2000 (gmt 0)



Alexa.com is the worst excuse for a company we have ever seen. Their ia_archiver was turned loose on our online store over the weekend and hit our servers so many times that it brought them down. At this time we are looking at a Denial of Service suit against them. This is the 3rd time in 2 weeks they have hit us, and this after being warned by us and our ISP to cease. If anyone else has had similar problems with them, let me know at harold@passionpro.com

Machiavelli

11:19 am on Dec 13, 2000 (gmt 0)



But they're a dot org - they must be perfect.*

Have you seen their 'sculpture' [archive.org]? Curiously, there is a mini picture of a Greek-style temple in the top left corner; perhaps this is to give the Praxiteles stamp of approval to it.

*not

cirelle

1:20 pm on Dec 15, 2000 (gmt 0)



I've just recently cut the cord on the ia_archiver, as it has been abusive at best. My understanding is that this thing is used to update the browser plugin that shows pseudo site demographics when the visitor goes to that site. This is typically located at the bottom of the browser. I basically send it to a page that needs hits and not to the target site.

I was looking at hundreds of entries per day from it.

c

generator

8:37 am on Apr 9, 2002 (gmt 0)



Watch this one: don't download anything from Alexa. It will deposit several hundred pieces of spyware onto your computer. It looks harmless, but handle with care.

keyplyr

9:15 am on Apr 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you have banned them and someone attempts to view your site with their Wayback Machine, they give you up:

We're sorry, access to [my.site.com...] has been blocked by the site owner via robots.txt.
Read more about robots.txt
See the site's robots.txt file.
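To check what a robots.txt file actually tells a given crawler, a short Python sketch using the standard library's `urllib.robotparser` can help. The robots.txt contents below are an assumed example of a file that bans only the `ia_archiver` user agent, not anyone's real file, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: ban ia_archiver everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed(ROBOTS_TXT, "ia_archiver", "http://my.site.com/"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://my.site.com/"))
```

This only tells you what a well-behaved crawler should do with the file; it says nothing about whether a particular spider honors it.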

volatilegx

11:04 pm on Apr 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wow, interesting, keyplyr!

toolman

11:10 pm on Apr 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>>they give you up:

That's why it's important to block them from the start with mod_rewrite.

I liken Alexa to Equifax or TRW...credit agencies. They collect information about you without your consent or knowledge.

On the surface this seems benign. But what if you had to answer questions under oath someday for whatever reason....why leave evidence out there that could be used against you when you don't have to?
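As a rough illustration of blocking by user agent at the server rather than through robots.txt, here is a Python sketch written as a WSGI wrapper. It is not mod_rewrite and not anyone's actual configuration, only an application-level analogue of the same idea: refuse the request when the User-Agent header matches. The block list and the names `is_blocked` and `block_agents` are assumptions made for the example.

```python
# Hypothetical block list; substrings are matched case-insensitively.
BLOCKED_AGENT_SUBSTRINGS = ("ia_archiver",)

def is_blocked(user_agent):
    """Return True if the User-Agent header contains a blocked substring."""
    agent = (user_agent or "").lower()
    return any(token in agent for token in BLOCKED_AGENT_SUBSTRINGS)

def block_agents(app):
    """Wrap a WSGI app so that blocked user agents receive a 403 response."""
    def wrapper(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return wrapper

def hello_app(environ, start_response):
    """Tiny stand-in application used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]

guarded_app = block_agents(hello_app)
```

A spider can of course change its user-agent string, so a check like this only stops crawlers that identify themselves honestly.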

misosoph

6:49 am on May 9, 2002 (gmt 0)

10+ Year Member



213.174.84.195 - - [08/May/2002:22:54:58 -0400] "GET / HTTP/1.0" 200 7588 "-" "NG/1.0"

The IP belongs to Sei-Mitsu Solutions Ltd (London, UK).

The only files requested were pages listed in the Yahoo directory and robots.txt.

Has anyone figured out what NG/ is? Is it sweetness and light or .....?

wilderness

3:33 pm on May 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>Has anyone figured out what NG/ is?>

a possibility?
[oasis-open.org...]