Msg#: 4581185 posted 4:06 am on Jun 5, 2013 (gmt 0)
An ongoing head-scratcher:
A few months ago I made a /legal page because I was told-- probably by someone hereabouts-- that you're supposed to have one. Fine. Whatever makes the humans happy.
Q: What do robots want with this page?
I don't mean the ones that scoop up anything and everything that's linked directly from the front page. And I don't mean the search engines that home in on anything new. (In this case, they can crawl but not index.) I mean the ones that get the front page, the legal page-- and nothing else. Are they programmed to scan all links, and grab anything with the word "legal" or a likely synonym? It's in a subdirectory, so they're not just guessing; besides, I never got 404s for this name before the page existed.
What are they looking for? Whatever it is, do I want them to find it?
Msg#: 4581185 posted 4:18 am on Jun 7, 2013 (gmt 0)
Older versions of IE used to use your legal page to check whether you conformed to the P3P policy. This was a privacy-policy format that the W3C was trying to standardize for websites, but it appears to have died. Perhaps there are robots looking at your legal policy pages that are trying to classify your policy in some similar way?
Msg#: 4581185 posted 5:40 am on Jun 7, 2013 (gmt 0)
:: detour to search engine ::
:: further detour to w3 dot org ::
Criminy. I've never even heard of that.
Note: The P3P specification will likely change over the next few months. As a result, you may have to update the P3P policy that you are creating now.
last revised $Date: 2002/01/31 10:39:19
Hm. The page is sooo old, its 8859-1 encoding seems to have slammed into a sitewide UTF-8 setting, with unfortunate results. (Normally you don't have to manually change your browser ... to the encoding the page already specifies. At least not in any browser newer than MSIE 5 :))
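A rough illustration of that kind of collision, as a sketch only: the sample word and the function name are mine, not anything from the w3 page. Text saved as ISO-8859-1 and then read as UTF-8 loses every non-ASCII character, because a lone 8859-1 byte such as 0xE9 is not a valid UTF-8 sequence.

```python
def read_latin1_bytes_as_utf8(text):
    """Encode text as ISO-8859-1, then decode those bytes as if they were UTF-8."""
    raw = text.encode("iso-8859-1")
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8,
    # which is roughly what a browser shows as a question-mark glyph.
    return raw.decode("utf-8", errors="replace")


garbled = read_latin1_bytes_as_utf8("café")
plain = read_latin1_bytes_as_utf8("legal")
print(garbled, plain)
```

Pure ASCII text comes through untouched, which is why a mostly-English page can look fine until the first accented letter or curly quote.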
Msg#: 4581185 posted 8:00 am on Jun 7, 2013 (gmt 0)
When I said it was old I wasn't kidding. ;)
Msg#: 4581185 posted 9:43 am on Jun 7, 2013 (gmt 0)
Ah ha! Mine isn't in the root, but it is indeed called legal.html. And the contact page is called contact.html. There is a time and a place for originality. Maybe some robots have got legacy code dating from 2002 that kicks in when they see the name.
:: vague mental association with online form I once filled out to auto-generate a suitably nasty but wholly law-abiding letter to former landlord ::
There used to be sites
The w3 page has four links. One times out, one leads to a parked domain, one exists but no longer seems to have what the link promises ... and the fourth runs through a batch of questions that I thought were none of their business and made me want to sic my legal department on 'em, so I left.
Well, I hope they enjoy my page. It starts out with a lengthy quotation from a 1787 student newspaper. Surely this can't help but be beneficial to the robot.