
semanticdiscovery/0.2

has problems with robots.txt...or a forged UA


WebJoe

7:40 am on Oct 13, 2003 (gmt 0)

10+ Year Member



The bot in question (full UA: semanticdiscovery/0.2(http://www.semanticdiscovery.com/sd/robot.html)) belongs to a service provider. On the page linked in its UA they state that it will obey robots.txt, but today it hit my trap.

Extract from my robots.txt:

User-agent: *
Disallow: /bots/

and what my logs recorded:

2003-10-13 07:07:38 68.166.53.158 - xxx.yyy.xxx.zzz 80 GET /robots.txt - 206 semanticdiscovery/0.2(http://www.semanticdiscovery.com/sd/robot.html) -

Five minutes and a couple of GETs later:

2003-10-13 07:12:10 68.166.53.158 - xxx.yyy.xxx.zzz 80 GET /bots/trap.asp - 200 semanticdiscovery/0.2(http://www.semanticdiscovery.com/sd/robot.html) -
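For anyone who wants to double-check the rule itself, the disallow logic can be reproduced with Python's standard urllib.robotparser. A minimal sketch, using the robots.txt extract and the trap path from the log lines above:

```python
from urllib import robotparser

# The robots.txt extract quoted above
ROBOTS_TXT = """\
User-agent: *
Disallow: /bots/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The path the bot fetched five minutes after reading robots.txt
print(rp.can_fetch("semanticdiscovery/0.2", "/bots/trap.asp"))  # False -> disallowed
print(rp.can_fetch("semanticdiscovery/0.2", "/index.html"))     # True  -> allowed
```

Any compliant crawler should reach the same conclusion, so hitting the trap means the bot either skipped the check or ignored the result.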

I wasn't able to verify the IP: it resolves to a US west-coast-based ISP, while semanticdiscovery.com itself sits on a NY-based ISP's IP.

So my question is: Is the UA forged or not? Has anybody seen this before?
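For what it's worth, a common sanity check on a crawler's IP is a reverse DNS lookup followed by a forward lookup of the resulting hostname; if the forward lookup doesn't return the original IP, the claimed identity is suspect. A minimal sketch using Python's stdlib (results naturally depend on the resolver, and this alone can't prove a UA is forged):

```python
import socket

def verify_crawler_ip(ip):
    """Reverse-resolve the IP to a hostname, then forward-resolve
    that hostname; a consistent crawler IP appears in both results."""
    try:
        host = socket.gethostbyaddr(ip)[0]
    except OSError:
        return None, False  # no reverse DNS record at all
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]
    except OSError:
        return host, False  # hostname doesn't resolve back
    return host, ip in forward_ips

host, consistent = verify_crawler_ip("68.166.53.158")  # the IP from the logs
print(host, consistent)
```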

WebJoe

7:50 am on Oct 13, 2003 (gmt 0)




Stupid question on my part: see the older post 1732 [webmasterworld.com] and others.

So it's still the same problem: not fully complying with robots.txt.

WebJoe

8:04 pm on Oct 13, 2003 (gmt 0)




After posting here I sent an email to Semantic Discovery and got a promising reply:
Wow, *WebJoe*, I apologize for that! Something seems to have gone wrong with our robot's behavior -- thanks a lot for pointing this out to us so we can get it fixed. And yes, that IP is one of ours, so nobody has hijacked our robot.

Thanks,
Nancy *Qwerty*
Semantic Discovery

I assume that Nancy is sleddogcafe [webmasterworld.com]

But after reading all the threads here on that topic, I decided to keep it banned. Besides, experience tells me that problems like this aren't always real bugs but features, and they don't get fixed as quickly as promised.

[edit reason: anonymized (*)]