lucy24 - 8:32 am on Oct 22, 2012 (gmt 0)
Some web editing programs create these attributes automatically, and therefore they aren’t very reliable when trying to determine the language of a webpage.
Yeah: they put in lang="en". So if it says lang="something else", shouldn't that be taken as a pretty strong indicator that the page is in some other language?
All the more so when you've got lang attributes on small discrete sections of the content. I've grumbled elsewhere about g###'s translation of the single line "grazie a tutti" into Italian, happily ignoring the lang="it" attribute and therefore making, let us not put too fine a point upon it, utter fools of themselves.
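For anyone who wants to see what a page actually declares, here is a minimal sketch in Python using the standard library's html.parser. It only lists the lang attributes found in the markup; the names LangCollector and declared_languages and the sample markup are made up for illustration, and nothing here is meant to describe how any search engine really handles these attributes.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect every lang attribute declared in an HTML document."""

    def __init__(self):
        super().__init__()
        self.langs = []  # (tag, lang) pairs in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name == "lang" and value:
                self.langs.append((tag, value.lower()))


def declared_languages(html_text):
    """Return the (tag, lang) pairs found in html_text (hypothetical helper)."""
    parser = LangCollector()
    parser.feed(html_text)
    parser.close()
    return parser.langs


# Hypothetical page: English overall, with one Italian line marked up.
sample = (
    '<html lang="en"><body>'
    '<p>Thanks, everyone.</p>'
    '<p lang="it">grazie a tutti</p>'
    '</body></html>'
)

print(declared_languages(sample))  # [('html', 'en'), ('p', 'it')]
```

A page-level value that differs from the editor default, or a value on a single element as in the second paragraph of the sample, is the kind of explicit signal being argued for above.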
Not long ago, I found a log entry telling me that Google had attempted to translate a particular page into Italian. Problem is, the page in question is already in Italian.
Sample. I assure you I am not making this up.
"Original English [sic] Text":* Questa pagina ha sempre avuto un insolito numero di visitatori provenienti dall'Italia.
Google translation: This Pagina Semper ha avuto delle Nazioni Unite insolito Numero di Visitatori provenienti DALL'ITALIA.
Clearly Google's definition of "obvious" is different from yours and mine.
* Created by a multi-stage process: Run the English text through :: cough-cough :: Google translate. Go over it myself and fix the blatant errors. Find a kind Italian to fix my fixes. (Being several thousand miles away, I could not hear her laughing hysterically.) Check some obscure technical terms and run it past the Italian again.