tangor

msg:4485381 | 9:01 am on Aug 16, 2012 (gmt 0) |
Another example of why whitelisting is the way to go. Pick your battles, and manage them before the "war" starts.
|
jmccormac

msg:4485383 | 9:12 am on Aug 16, 2012 (gmt 0) |
A British Library record for a website might be a very powerful thing if it came to taking a copyright action over an infringement, but the idea of the BL approaching the web as it does with print is interesting. Possibly one to whitelist. Regards...jmcc
|
g1smd

msg:4485471 | 1:35 pm on Aug 16, 2012 (gmt 0) |
I think I'd let that in for sites that have factual content and are going to be archived for long-term storage.
|
Samizdata

msg:4485527 | 4:26 pm on Aug 16, 2012 (gmt 0) |
| "This work is undertaken in anticipation of forthcoming Legal Deposit regulations" |
It would be interesting to see what the "forthcoming regulations" actually say - when I edited a magazine it was a legal requirement to send copies of each issue to the Legal Deposit libraries (no doubt web publishing will be treated differently). How the bot is programmed would also be interesting - it should have no business "harvesting" from non-UK servers, but content can be hosted anywhere.
| "to collect, preserve and provide long-term access to the UK’s online intellectual and cultural heritage" |
That seems to rule out the vast majority of UK websites. But they would still have to crawl all of them to decide what is worth preserving. And it only takes the stroke of a politician's pen to make access a legal requirement.
| My worry would be: who will ultimately be able to view the details and how; and can it be easily scraped from them. |
If the project goes ahead I would expect the content to be globally scrapable. But at least it isn't the Wayback Machine. ...
|
dstiles

msg:4485585 | 7:01 pm on Aug 16, 2012 (gmt 0) |
Nor is it in the BL database (at least, not at the moment).
|