

boitho.com bot violating robots.txt

Specifically requested only forbidden files

         

jazzguy

8:08 pm on May 5, 2005 (gmt 0)

10+ Year Member



"boitho.com-dc/0.75 ( http*//www.boitho.com/dcbot.html )" came from 129.241.104.168. It specifically targeted files disallowed in robots.txt and ignored all other pages.

The info page says it's a distributed crawler, so, just like my policy for the chronic robots.txt violator Grub, I banned the user agent and the entire IP block associated with the offending IP.
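A claim like this can be checked mechanically once the ruleset and the requested paths are both on the table. The following is only a minimal sketch using Python's standard `urllib.robotparser`; the ruleset and the list of requested paths are hypothetical placeholders, not the poster's actual robots.txt or server log, which do not appear in this part of the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical ruleset: NOT the poster's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""

# User-agent token taken from the string quoted in the post.
USER_AGENT = "boitho.com-dc"

# Hypothetical paths a crawler requested, e.g. pulled from an access log.
requested_paths = ["/private/report.html", "/cgi-bin/form.pl", "/index.html"]


def disallowed_requests(robots_txt, user_agent, paths):
    """Return the requested paths that the ruleset forbids for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines and marks the ruleset as loaded.
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


violations = disallowed_requests(ROBOTS_TXT, USER_AGENT, requested_paths)
print(violations)
```

If every requested path comes back in the list, the log supports the "only forbidden files" claim; if the list is empty, the ruleset itself is the more likely culprit. Note that `urllib.robotparser` is one interpretation of the robots.txt conventions, so a crawler with a different parser may disagree on edge cases.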

jazzguy

7:23 pm on Jun 14, 2005 (gmt 0)

10+ Year Member



I never said that my bot is flawless; however, I can't verify a bug report if it lacks specifics.

Already covered multiple times in this thread. I offered more information. You rejected my offer. You kept on with the sarcasm and insults. I withdrew my offer.

Your attitude suggests to me that you were looking for some cheap bashing without any desire to back up your words with proof. Generally the people who act like that have no proof in the first place.

And your attitude suggests to me an arrogant jackass, which is why I withdrew my offer of assistance. I offered more data. You rejected the offer. Then you try to say that I have no proof because you rejected my offer to supply you more information.

This is all just repeating what has been covered multiple times in this thread.

jazzguy

7:25 pm on Jun 14, 2005 (gmt 0)

10+ Year Member



He violated your robots.txt? Could you post a copy of it here? Someone should be able to figure out if the mistake is with the spider or your ruleset.

Again, an obvious troll. That's already been covered multiple times in this thread.

Lord Majestic

7:26 pm on Jun 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And your attitude suggests to me an arrogant jackass

I will let the readers judge on who is who :)

Since you progressed to name calling, I take it that this discussion is pretty much over. I am still interested in your domain name: to ensure future compliance with your robots.txt, I am happy to add your site to a very short list of permanently banned sites. I promise not to actually look at your site, and in a few years I will tell you whether I noticed its absence from the index or not :)

[edited by: Lord_Majestic at 7:27 pm (utc) on June 14, 2005]

bcolflesh

7:26 pm on Jun 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...you try to say that I have no proof

Ah cool, I take it to mean you have proof - can you please post the robots.txt file for the forum members to examine?

jazzguy

7:34 pm on Jun 14, 2005 (gmt 0)

10+ Year Member



But from what I was reading, you are not being very cooperative with Lord Majestic: he is clearly asking for the information he needs to look into the problem, and you are not giving him the exact information he asked for.

That exact subject has already been covered multiple times in this thread, so rather than repeat it here, I suggest you try reading the thread more objectively so you won't gloss over statements you might have missed.

In fact, you could have stopped this 4 page fun read if you had just given him that information.

Or he might have stopped the thread by examining the information I offered rather than rejecting it as insufficient without even examining it.

In fact, I'm rather amazed at how long this thread has gone on.

You and me both.

jazzguy

7:36 pm on Jun 14, 2005 (gmt 0)

10+ Year Member



I take it to mean you have proof - can you please post the robots.txt file for the forum members to examine?

Still trolling? Already asked and answered multiple times in this thread.

Lord Majestic

7:37 pm on Jun 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Or he might have stopped the thread by examining the information I offered rather than rejecting it as insufficient without even examining it.

There was nothing to reject: no information that could reasonably have led to establishing whether something was wrong with my implementation of the robots.txt standard.

bcolflesh

7:40 pm on Jun 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...answered multiple times in this thread

I cannot locate the post with the robots.txt information in it - please post your robots.txt information so the forum can examine it and validate your claims about the member's spider.

jazzguy

7:43 pm on Jun 14, 2005 (gmt 0)

10+ Year Member



I take it that this discussion is pretty much over.

I thought it was over long ago since the same things just kept getting repeated over and over.

I am still interested in your domain name to ensure future compliance with your robots.txt

Already covered multiple times in this thread.

I won't quote the rest of your post since it's just more of the same sarcasm that fanned the flames of this thread to begin with.

rj87uk

7:43 pm on Jun 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Or he might have stopped the thread by examining the information I offered rather than rejecting it as insufficient without even examining it.

Don't worry, I did read it all, and again it all comes down to you posting your robots.txt. That has been covered so many times (you say) in this thread.

In fact (yes, I know, again): why don't you post it now to shut everyone up :) You know, to add a twist to this thread!

This 111-message thread spans 12 pages.