Exalead - massive spidering


Bewenched

2:43 am on Apr 28, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



83.167.62.163

Nearly 7000 pages scarfed up today.

I know this one has been around for a while, but I cannot remember if they are good or bad.

Sgt_Kickaxe

3:47 am on Apr 28, 2011 (gmt 0)



They bill themselves as "A Search Engine at the Heart of the Semantic Web" as well as a big player in social search. Their site suggests they just turned on Asian and European language capabilities. Perhaps they now see Google realtime as a competitor and they unleashed the bots? Perhaps they are freshening up their content now that they finished working on two new languages (for them)? They don't say.

All the more reason to whitelist user agents, and not to add new agents until they start bringing in real traffic to a smaller site or blog of yours that uses traditional robots.txt. In the end it's about traffic: if a bot's parent company brings none, the bot hasn't earned its welcome, imo.
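A minimal sketch of what a whitelist-style robots.txt could look like, checked here with Python's standard `urllib.robotparser`. The list of welcome crawlers and the `is_welcome` helper are assumptions for illustration, not anyone's actual configuration, and robots.txt only helps with bots that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist: named crawlers may fetch everything,
# every other user agent is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow:

User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_welcome(agent_token, path="/"):
    """Return True if the given robots.txt agent token may fetch `path`."""
    return parser.can_fetch(agent_token, path)


print(is_welcome("Googlebot"))  # whitelisted crawler
print(is_welcome("Exabot"))     # not on the list, falls through to "*"
```

Pass the short agent token (for example `Exabot`) rather than a full User-Agent header; the standard-library parser compares against the part of the string before the first `/`.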

Bewenched

4:02 am on Apr 28, 2011 (gmt 0)


I agree. That many pages, unless your name is Google, Yahoo or Bing/MSN? Forget it.

We're ecommerce and I'm so freakin sick of the scrapers and such.

lucy24

7:26 am on Apr 28, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oh, hey, I know them. They show up every 4-5 days, pick up robots.txt and some other single page selected entirely at random, and go away again. If anyone ever figures out what they're looking for I'd like to hear it. Possibly they would like to hear it too, if they're based in Paris but go around calling themselves "AS Confederation of Neotelecoms, euNetworks AG and Upstreamnet gmbh". Maybe I'm mistaken in reading AS as A/S, but they've still got a bit of linguistic confusion there.

dstiles

7:53 pm on Apr 28, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can't say I've ever had a problem with exabot but I block exabot-thumbnails. I'll keep an eye open.
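For anyone wanting to do the same, here is a rough sketch of blocking one variant by User-Agent while letting the main crawler through. The post doesn't say how the block is actually done, so the `BLOCKED_TOKENS` list, the `should_block` helper, and the sample User-Agent strings are all assumptions for illustration.

```python
# Hypothetical blocklist of lowercase substrings to look for in the
# User-Agent header. Only the thumbnail fetcher is listed here.
BLOCKED_TOKENS = ("exabot-thumbnails",)


def should_block(user_agent):
    """Return True if the User-Agent contains any blocked token."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


# Illustrative User-Agent strings, not copied from real logs.
print(should_block("Mozilla/5.0 (compatible; Exabot-Thumbnails)"))
print(should_block("Mozilla/5.0 (compatible; Exabot/3.0)"))
```

Matching on the longer token means the plain `Exabot` crawler is not caught by accident. A server would typically answer a blocked request with a 403.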

Frankly, anything that's a good competitor for google is fine with me right now. :)

Pity they use google-ads on the site.

Bewenched

3:25 pm on Apr 29, 2011 (gmt 0)


I hear ya, but thousands of pages in a short amount of time throws up a red flag for me.
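One way to spot that kind of burst is to count requests per IP in the access log. This is only a sketch: it assumes a log format where the client IP is the first whitespace-separated field (as in the common Apache/nginx formats), and the sample lines, the `heavy_hitters` helper, and the threshold are made up for illustration.

```python
from collections import Counter


def hits_per_ip(log_lines):
    """Count requests per client IP, taking the IP as the first field."""
    counts = Counter()
    for line in log_lines:
        parts = line.split(None, 1)
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    return counts


def heavy_hitters(log_lines, threshold):
    """Return (ip, count) pairs with at least `threshold` requests."""
    counts = hits_per_ip(log_lines)
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]


# Made-up sample lines; real logs would be read from a file.
sample = [
    '83.167.62.163 - - [28/Apr/2011:02:40:01 +0000] "GET /a HTTP/1.1" 200 512',
    '83.167.62.163 - - [28/Apr/2011:02:40:02 +0000] "GET /b HTTP/1.1" 200 512',
    '192.0.2.10 - - [28/Apr/2011:02:40:03 +0000] "GET /a HTTP/1.1" 200 512',
    '83.167.62.163 - - [28/Apr/2011:02:40:04 +0000] "GET /c HTTP/1.1" 200 512',
]

print(heavy_hitters(sample, threshold=3))
```

On a real site the threshold would be in the thousands per day, and the flagged IPs can then be looked up to see who is behind them before deciding whether to block.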