HTML Forum

Language identifier "&lang=" in URL being interpreted as "<="
Both Google and Yahoo are caching &LANG= in URL as <=

 2:56 pm on Mar 30, 2009 (gmt 0)

I haven't run into this before, but perhaps someone else has. I'm consulting on a multilingual site that uses the variable name "LANG" in the URL.

This results in URLs like


Both Google and Yahoo are caching these URLs to read


Now, I understand that &lang; is the HTML entity for the left angle bracket (which looks much like the less-than sign that is appearing), but the reference in the URL is not terminated with a semicolon, so I find it more than a little strange. The fact that both Y and G are misinterpreting it is even stranger.

I suppose I could recommend changing the language variable name, but is there a more elegant solution that wouldn't require a back-end change and the complications that come with it?




 3:17 pm on Mar 30, 2009 (gmt 0)

Unfortunately, the search engines are entirely correct - the semi-colon is not obligatory to make an entity reference. You get the same issue when you use the variable &copy, which gets turned into a copyright sign.
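The &copy case can be reproduced with a short sketch. It uses Python's standard `html.unescape`, which recognises a fixed set of legacy entity names (including `copy`) without the trailing semi-colon. The URL is a made-up example, and this only illustrates lenient entity decoding in general; it is not a claim about how any particular search engine parses links.

```python
from html import unescape

# Hypothetical href value, copied unescaped into the page source.
raw_href = "page.php?id=1&copy=2"

# A lenient decoder treats "&copy" as an entity reference even though
# no semi-colon follows it, so the variable name is lost.
decoded = unescape(raw_href)
print(decoded)  # page.php?id=1©=2

# The same link with the ampersand escaped decodes back to the URL
# that was actually intended.
safe_href = "page.php?id=1&amp;copy=2"
print(unescape(safe_href))  # page.php?id=1&copy=2
```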

This is why the ampersands separating variables in URL links must be encoded as &amp; - in most cases the browser (or SE bot) can handle unescaped ampersands, but not always.

So you need to modify your code to use &amp; in on-page links (i.e. in the HTML) at all times:
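As a minimal sketch of what that can look like when links are generated server-side, the snippet below builds the query string first and escapes it only when writing it into the HTML. The helper name `link`, the path `/products.php` and the parameter values are all hypothetical; the site in question may build its links quite differently.

```python
from html import escape
from urllib.parse import urlencode

def link(base, params, text):
    """Return an <a> element whose href is safely escaped for HTML."""
    # urlencode joins the parameters with a plain "&", which is correct
    # for the URL itself (e.g. in a redirect or an HTTP request).
    url = base + "?" + urlencode(params)
    # escape() turns "&" into "&amp;" so the URL is valid inside markup.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))

html_link = link("/products.php", {"id": "42", "lang": "fr"}, "French version")
print(html_link)
# <a href="/products.php?id=42&amp;lang=fr">French version</a>
```

The escaping applies only to the markup. The browser or bot decodes &amp; back to & before requesting the page, so the back-end still receives the variable as "lang".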


The W3C validator will show the unescaped ampersands as errors if you validate the generated page. This is actually a good example which shows that search engine bots really do respect standards and prefer valid HTML. :)


 3:39 pm on Mar 30, 2009 (gmt 0)

Thanks for the clarification encyclo - thorough as usual. Looks like no quick fix for this site, though admittedly this isn't causing huge problems at the moment. Cost/Benefits time.


 4:28 pm on Mar 30, 2009 (gmt 0)

Indeed an excellent example of why validation should be done.

Fiver: why not do a global substitution in URLs? Every "&" that is not followed by "amp;" gets replaced by "&amp;".
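One way to sketch that substitution is a regular expression with a negative lookahead. The function name `escape_bare_ampersands` is made up for illustration, and the sketch assumes it is applied to URL strings only, not to whole pages.

```python
import re

# Match "&" only when it is NOT immediately followed by "amp;".
BARE_AMP = re.compile(r"&(?!amp;)")

def escape_bare_ampersands(url):
    """Replace every unescaped "&" in a URL with "&amp;"."""
    return BARE_AMP.sub("&amp;", url)

print(escape_bare_ampersands("page.php?id=1&lang=en"))
# page.php?id=1&amp;lang=en

# Ampersands that are already escaped are left alone, so running the
# substitution twice gives the same result as running it once.
print(escape_bare_ampersands("page.php?id=1&amp;lang=en"))
# page.php?id=1&amp;lang=en
```

Because only "amp;" is excluded, any other entity reference in the input (for example &lt;) would have its ampersand escaped too, which is why this is better restricted to URLs than run over arbitrary HTML.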

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved