bnf.fr bot

Digital National Library of France

keyplyr

8:31 am on Apr 22, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just got thoroughly scraped. Took robots.txt using my site as the referrer:

194.199.7.** - - [21/Apr/2009:09:31:34 -0700] "GET www.example.com/robots.txt HTTP/1.0" 200 3856 "http://www.example.com/" "Mozilla/5.0 (compatible; bnf.fr_bot; +http://bibnum.bnf.fr/robot/bnf.html)"

Don't know why they even bother since they say they do not need to obey robots.txt:

The robot is not limited by the exclusions specified in the robots.txt file, in accordance with the law (Article 41): "The implementation of a code or a restriction of access by these people [the producers or publishers of sites covered by the law] cannot prevent the collection by the aforementioned depository bodies."
(translation by Google)

Samizdata

3:51 pm on Apr 22, 2009 (gmt 0)

Is your site registered or hosted in France?

On the assumption that this organisation is similar to the Library of Congress or the British Library, there may be a legal requirement to allow access in such circumstances (I wouldn't know), but presumably French law does not apply globally.

I would serve them a "quatre zero trois" (a 403 Forbidden) myself.

...

wilderness

4:26 pm on Apr 22, 2009 (gmt 0)

Although it's not a good idea to generalize, my experience with the French people I've communicated with is that they treat almost anything on the internet as "public domain"!

Attempting to persuade them otherwise has proved futile, at least with the folks I communicated (or failed to communicate) with.

The "public domain" attitude is not limited to France or Europe. We have one or two generations of North American computer users who have not taken the time to understand the difference, and as a result they act as though everything online were "public domain".

keyplyr

5:03 pm on Apr 22, 2009 (gmt 0)

Is your site registered or hosted in France?

No

I would serve them a "quatre zero trois" myself.

But of course.
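
For anyone wondering what serving the 403 could look like: the thread doesn't say how it was done, so the following is only a minimal sketch, assuming a Python WSGI site. The names `BLOCKED_AGENT_TOKENS`, `is_blocked` and `block_bots` are made up for illustration; only the `bnf.fr_bot` token comes from the log line above. Keep in mind that a User-Agent string is trivial to change, so this only stops a bot that identifies itself honestly.

```python
# Hypothetical list of User-Agent substrings to refuse.
# "bnf.fr_bot" is taken from the log entry quoted in this thread.
BLOCKED_AGENT_TOKENS = ("bnf.fr_bot",)


def is_blocked(user_agent):
    """Return True if the User-Agent header contains a blocked token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so that blocked user agents receive 403 Forbidden."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

Wrapping an existing application would then be a single call, e.g. `application = block_bots(application)`.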

blend27

7:12 pm on Apr 22, 2009 (gmt 0)

keyplyr,

You should stop (almost immediately) liking to your Robots.txt file from your homepage in the Big Blue-White-n-RED Letters, and from all the other sites that you run...

keyplyr

8:26 pm on Apr 22, 2009 (gmt 0)

I'm sorry blend27, I have absolutely no idea what you're attempting to say.

wilderness

8:38 pm on Apr 22, 2009 (gmt 0)

keyplyr,
He was just yanking your chain (teasing).

He was suggesting that your domain showed up as the referrer because your live pages link to robots.txt.

blend27

1:56 am on Apr 23, 2009 (gmt 0)

:) "liking to" should have been "linking to".

For a referrer string to show up in your logs, a visitor (MSN bots excepted) normally has to follow or click a link from somewhere. If a robot says it found robots.txt via your home page, the robot is full of flaming colours like Blue-White-n-RED.

Just an analogy: tastes like chicken.
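
To spot this pattern in logs without reading them by eye, something like the following might help. It is a rough sketch, assuming the log is in the combined format shown in the first post; the names `LOG_PATTERN`, `parse_line` and `is_suspicious_robots_fetch` are invented for this example, and the regular expression is only as tolerant as that one sample line requires.

```python
import re
from urllib.parse import urlsplit

# Combined log format, as in the entry quoted at the top of the thread.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_suspicious_robots_fetch(entry, site_host):
    """True when robots.txt was requested with a referrer on the site itself.

    Pages rarely link to robots.txt, so a same-site referrer on that
    request suggests the referrer was made up by the client.
    """
    if entry is None or not entry["target"].endswith("/robots.txt"):
        return False
    referrer_host = urlsplit(entry["referrer"]).hostname
    return referrer_host == site_host
```

Run over each line of an access log with `site_host` set to your own hostname, this flags requests like the one keyplyr posted. If your pages really do link to robots.txt, the check proves nothing.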

keyplyr

9:34 am on Apr 23, 2009 (gmt 0)

<dense> OK... </dense>