Firefly

         

WebJoe

7:53 pm on Jul 1, 2003 (gmt 0)




For the last two months I have noticed a bot hitting all my sites, usually with about 30 requests per day, pretty much every day. It identifies itself as Firefly/1.0, as in this log entry:

2003-06-10 03:24:05 193.7.255.242 - xxx.nn.yyy.mm 80 GET /default.asp - 302 Firefly/1.0+(compatible;+Mozilla+4.0;+MSIE+5.5) -

The 302 is there because I banned the bot after noticing on its very first hit that it ignored robots.txt.
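For anyone who wants to pull the same details out of their own logs, here is a minimal Python sketch that splits the entry above into named fields. The field names are an assumption based on the usual IIS W3C extended layout (the log's "#Fields:" header is not shown above), and `parse_line` and `is_firefly` are made-up helper names.

```python
# Assumed field order; check it against the "#Fields:" header of your own log.
FIELDS = [
    "date", "time", "c_ip", "cs_username", "s_ip", "s_port",
    "cs_method", "cs_uri_stem", "cs_uri_query", "sc_status",
    "cs_user_agent", "cs_referer",
]

# The log entry quoted in the post, unchanged.
LOG_LINE = (
    "2003-06-10 03:24:05 193.7.255.242 - xxx.nn.yyy.mm 80 GET /default.asp "
    "- 302 Firefly/1.0+(compatible;+Mozilla+4.0;+MSIE+5.5) -"
)


def parse_line(line):
    """Split one space-separated log entry into a dict keyed by FIELDS."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of fields: %d" % len(values))
    record = dict(zip(FIELDS, values))
    record["sc_status"] = int(record["sc_status"])
    # Spaces in the user agent are logged as "+", so turn them back.
    record["cs_user_agent"] = record["cs_user_agent"].replace("+", " ")
    return record


def is_firefly(record):
    """True if the request came from a client calling itself Firefly."""
    return record["cs_user_agent"].startswith("Firefly/")


if __name__ == "__main__":
    record = parse_line(LOG_LINE)
    print(record["c_ip"], record["sc_status"], record["cs_user_agent"])
```

Matching on the "Firefly/" prefix rather than the full string is a deliberate choice here, so the check would still catch a later version number.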

I have seen it coming from different IP addresses:
213.193.30.242: Fireball Hamburg, a Germany-based search engine. According to various web articles, Fireball created the bot; it is now owned (or at least run) by Lycos.
193.7.255.242: Gruner + Jahr AG & Co, one of the biggest publishing houses in Germany, apparently just using the same bot (for whatever purpose; they don't have any search service on their site).
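To see which addresses the bot comes from in your own logs, something like the following Python sketch could tally Firefly requests per client IP. The column positions (client IP in the third column, user agent in the eleventh) are assumptions taken from the single log entry quoted above, and `firefly_hits_by_ip` is a made-up name.

```python
from collections import Counter


def firefly_hits_by_ip(lines, ip_index=2, agent_index=10):
    """Count requests per client IP whose user agent starts with 'Firefly/'.

    ip_index and agent_index are assumed column positions; adjust them to
    match the "#Fields:" header of your own log files.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip W3C header lines such as "#Fields: ..."
        fields = line.split()
        if len(fields) <= max(ip_index, agent_index):
            continue  # skip blank or truncated lines
        if fields[agent_index].startswith("Firefly/"):
            counts[fields[ip_index]] += 1
    return counts
```

It accepts any iterable of lines, so an open log file can be passed in directly, and `counts.most_common()` then lists the busiest addresses first.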

As I found out from an online article, Firefly has had problems with robots.txt in the past, and it seems to me that their dev team isn't pushing too hard to get this fixed (see www.kso.co.uk/de/stats/fireball.html; sorry, the article is in German).

This is just FYI, as a search of WebmasterWorld didn't turn up anything useful, so I thought I'd record it for the benefit of anyone else who looks for it in the future.

WebJoe

11:59 am on Jul 2, 2003 (gmt 0)




UPDATE (02.07.2003):
I wrote an email to both the tech contact and webmaster@fireball (I couldn't find any other contact) yesterday, explaining my problem.
Today I received an email from the technical director of Lycos Europe Ltd. stating that they aren't aware of the problem, that they comply with the robots exclusion standard, and that they only spider pages allowed by robots.txt.

I wrote back with the log extract and my robots.txt file, proving them wrong. Let's see what happens...
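For anyone wanting to put together the same kind of evidence, here is a minimal Python sketch that checks requested paths against a robots.txt file using the standard library's `urllib.robotparser`. The robots.txt content below is hypothetical (my actual file is not reproduced here), and `disallowed_requests` is a made-up name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out all crawlers; substitute your own.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def disallowed_requests(robots_txt, user_agent, paths):
    """Return the requested paths that robots.txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        path for path in paths
        # A crawler is expected to fetch /robots.txt itself, so skip it.
        if path != "/robots.txt" and not parser.can_fetch(user_agent, path)
    ]


if __name__ == "__main__":
    requested = ["/robots.txt", "/default.asp"]
    print(disallowed_requests(HYPOTHETICAL_ROBOTS_TXT, "Firefly/1.0", requested))
```

Feeding it the paths taken from the bot's log entries gives a list of requests that should never have been made, which is easy to paste into an email.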