alexa crawler spider crawling fast

   
6:21 am on Aug 20, 2001 (gmt 0)

10+ Year Member



My sites have been getting crawled by Alexa very aggressively in the last few weeks. Has anyone else noticed this?

Does anyone know what they are doing with the data? The search in their toolbar just delivers GoTo listings. Maybe they want to add their own results to the GoTo listings they display. Does anyone have an idea of how many searches they get?

2:06 pm on Aug 24, 2001 (gmt 0)

WebmasterWorld Administrator ianturner is a WebmasterWorld Top Contributor of All Time 10+ Year Member



It could be that you have downloaded their toolbar. This may cause them to recognise a site and crawl it.

All the best
Ian

4:32 pm on Aug 24, 2001 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I have never downloaded Alexa anything, and they crawl my site more regularly than almost anyone else... (except maybe AltaVista these days) and they've never delivered anything for traffic.

1:12 am on Aug 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Alexa has been far more aggressive lately. The other day, I was working on some not-for-public content. After uploading and testing some pages, I went to lunch.

When I came back, I glanced at my log files and saw that Alexa had spidered every single page I had just published, and they crawled them in the exact order I viewed the pages. (I have their toolbar installed on this particular computer.)

Now I've always been aware of the fact that data transmitted through the toolbar was used for crawls, but I don't ever recall noticing that they follow individual users so closely. (Keep in mind that I'm the only human being to ever look at the pages Alexa has been hammering).

It's a little freaky when you think about it.

1:31 am on Aug 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>It's a little freaky when you think about it

Right on target WG.

Alexa is to the web what TRW and Equifax are to the financial world... they collect information about you without your consent. If you've ever seen the profiles Alexa has for each web site they crawl, you would know what I mean.

The difference is that this one little line can help your privacy:

deny from 209.247.40.

1:37 am on Aug 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, they definitely require an htaccess ban. I put them in the robots file, but they keep on comin'.

5:27 pm on Aug 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>deny from 209.247.40.

deny from 209.247.41
deny from 206.132.186
deny from 64.41.180

I don't know which of those ranges they are actually using at the moment, but I have caught (and automatically banned) ia_archiver in all of them over the years.
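
For anyone who wants the ranges quoted in this thread in one place, here is a rough sketch of what the .htaccess block might look like. It assumes the older Apache mod_access syntax (Order/Allow/Deny, as used in Apache 1.3 and 2.0; Apache 2.4 moved to Require directives) and that the server permits these overrides for the directory (AllowOverride Limit for the Deny lines, FileInfo for SetEnvIfNoCase). The SetEnvIfNoCase line and the block_alexa variable name are illustrative additions, not something anyone here posted: they are meant to catch the ia_archiver user agent if it shows up from a range that is not listed. Whether any of these ranges is still in use is not confirmed.

```apache
# Flag any request whose User-Agent header contains "ia_archiver"
# (block_alexa is a hypothetical variable name chosen for this sketch)
SetEnvIfNoCase User-Agent "ia_archiver" block_alexa

# Evaluate Allow rules first, then Deny rules; a matching Deny wins
Order Allow,Deny
Allow from all

# Partial IP addresses match every host in that range
Deny from 209.247.40
Deny from 209.247.41
Deny from 206.132.186
Deny from 64.41.180

# Also refuse flagged requests coming from any other address
Deny from env=block_alexa
```

A denied request should show up in the access log with a 403 status, which is an easy way to check that the rules are being picked up.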

Alexa is the most dubious of all search engine enterprises that I am aware of. Remember how they planned to build a "continuous archive of the complete www"?

11:39 pm on Aug 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does anyone actually get traffic from Alexa? I don't think I've ever gotten a single hit from them.

2:57 pm on Aug 26, 2001 (gmt 0)

WebmasterWorld Senior Member jeremy_goodrich is a WebmasterWorld Top Contributor of All Time 10+ Year Member


See http://www.archive.org/, which is a collection of everything they spider (I think). I don't know a whole lot about Alexa, either.

1:09 am on Aug 27, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, now that I know more about ia_archiver, I think I'll ban it as well.

I've just looked at the homepage of alexa.com and seen that they've been a little bit naughty and have been taken to court. Anyone in the US who has downloaded and used the Alexa toolbar (pre v5) may be able to submit a claim. See the link on the homepage about court action.