Should I encode ampersands in my canonical tags?

     

metamax

5:29 pm on Feb 22, 2013 (gmt 0)



Hi Everyone, hope this isn't the wrong place to post this.

I looked but can't find a straight answer on this (closest I found is [webmasterworld.com...] but seems incomplete)

Should my canonical tags have encoded ampersands in them?

I thought we were supposed to, like we're supposed to in xml sitemaps, but on Google's support page itself, they don't:

<link rel="canonical" href="http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35653" />

not
<link rel="canonical" href="http://www.example.com/blah?hl=en&amp;answer=35653" />

Would using encoded characters be wrong? Would it cause problems?

swa66

9:48 pm on Feb 22, 2013 (gmt 0)

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Just because Google does it that way doesn't mean it is correct ;-)

Seriously: you must encode & as &amp; in html (and in xml).

lucy24

11:07 pm on Feb 22, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



In this specific case it will work either way. Try it:

(raw &) [support.google.com...]

(encoded &amp;) [support.google.com...]

Encode it anyway. "The right way" or "a good habit" does not always translate to "the only way that will work".
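
For anyone generating the tag from code rather than typing it by hand, here is a minimal sketch of that habit in Python. The helper name canonical_tag is made up for illustration; html.escape is the standard-library call that does the actual encoding.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build a canonical <link> element (hypothetical helper, not from any library)."""
    # escape() rewrites & as &amp;, and with quote=True also " as &quot;,
    # so the URL is safe to drop into a double-quoted attribute value.
    return '<link rel="canonical" href="{}" />'.format(escape(url, quote=True))


raw_url = "http://www.example.com/blah?hl=en&answer=35653"
print(canonical_tag(raw_url))
```

The URL itself is stored and passed around with a plain &; the encoding only happens at the moment it is written into HTML.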

swa66

12:36 am on Feb 23, 2013 (gmt 0)

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



In all html (also in xhtml5) except for HTML5 you *have* to encode a content &.
Standards ...

HTML5 allows the author to write worse code than it needs to allow. So you can get away with a raw & in that standard provided the characters following it do not look like an html entity. But since HTML5 is now a "living" standard, you do not know what html entities they will invent in the future, so you cannot guarantee that your text will not start to "look like" one later.

A useless thing for lazy authors IMHO. - But HTML5 is stuffed with that kind of thing.

So not encoding every content & in an html document as &amp; is a mistake IMHO - of equal proportion to using < or > in the content that's not encoded as &lt; or &gt; .

Regardless of standards, browsers can recover from the error in many cases but let's assume you write:
<a href="http://www.example.com/file?a=1&copy=2">
It's an error, but which did you intend:
<a href="http://www.example.com/file?a=1&copy;=2"> (insert a copyright sign, that's missing the semicolon ?)
or
<a href="http://www.example.com/file?a=1&amp;copy=2"> (an unescaped ampersand?)

Validators will hopefully continue to flag it as an error.
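
The ambiguity can be reproduced with Python's html.unescape, shown here only as a rough stand-in for an HTML parser. It applies the HTML5 character-reference rules for ordinary text and does not model any attribute-specific handling, so a real browser may treat an href differently.

```python
from html import unescape

# The two spellings from the example above.
ambiguous = "http://www.example.com/file?a=1&copy=2"
escaped = "http://www.example.com/file?a=1&amp;copy=2"

# "&copy" without a semicolon is still recognised as the copyright entity,
# so the parameter name is silently replaced by the © character.
print(unescape(ambiguous))

# With the ampersand encoded, decoding gives back the intended URL.
print(unescape(escaped))
```

Only the escaped form round-trips to the URL the author meant.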

Hoople

9:09 pm on Feb 23, 2013 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



swa66: thanks for posting that.

It's such knowledge sharing that keeps me coming back to read years after the surge of new SEO focused websites that appeared after WW opened its doors.

lucy24

12:29 am on Feb 24, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



insert a copyright sign, that's missing the semicolon ?

Matter of fact, that element of browser helpfulness has annoyed me for a long time. Is the trailing semicolon required or isn't it? If it isn't required, why use it? If it is required-- which makes far more sense because how else would you know when the entity is finished?-- then for pity's sake require it already :)

swa66

1:03 am on Feb 24, 2013 (gmt 0)

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Browser helpfulness only perpetuates lazy authors' errors. Unfortunately HTML5 has "approved" that as the right way instead of outlawing it once and for all.
(except for xhtml5)

That's one of the main reasons I aimed for valid xhtml1 in the past and aim for polyglot xhtml5 nowadays. The bigger reason is to have the xml toolset to automate things if/when I need them.
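
As a small illustration of what the xml toolset does with the two spellings, here is a sketch using Python's standard xml.etree.ElementTree on the canonical tag from the first post. The helper name href_or_error is made up for this example.

```python
import xml.etree.ElementTree as ET

raw = '<link rel="canonical" href="http://www.example.com/blah?hl=en&answer=35653" />'
encoded = '<link rel="canonical" href="http://www.example.com/blah?hl=en&amp;answer=35653" />'


def href_or_error(markup: str) -> str:
    """Return the decoded href, or a message if the markup is not well-formed XML."""
    try:
        return ET.fromstring(markup).get("href")
    except ET.ParseError as exc:
        return "not well-formed XML: {}".format(exc)


print(href_or_error(raw))      # the XML parser refuses the raw ampersand
print(href_or_error(encoded))  # the encoded form parses and yields the plain URL
```

An XML parser has no error recovery to fall back on, which is why the raw & that a browser tolerates stops the xml tools cold.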

metamax

5:37 pm on Feb 26, 2013 (gmt 0)



Thanks for the comprehensive replies everyone, very helpful.

Why is it that a browser handles the &amp; URLs correctly when they are clicked as a link, but not when they are pasted into an address bar? That's what caused the confusion in the first place.

I'm going to guess that it's because you're not meant to paste HTML code into an address bar... and expect anything :)

swa66

10:42 pm on Feb 26, 2013 (gmt 0)

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Your guess is actually quite right.

In HTML, you are supposed (exceptions aside) to encode any & as &amp; . So in an <a href=""> or similar, your encoding works because the browser knows it's reading html: it will decode the &amp; to & and then use it.

In your address bar: there's no html, so no decoding of htmlentities.
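
A rough sketch of the difference in Python, assuming a server that splits the query string on & only (a simplification; the helper query_params is made up for this example):

```python
from html import unescape
from urllib.parse import urlsplit


def query_params(url: str) -> dict:
    """Split a URL's query string on & only (hypothetical, simplified server behaviour)."""
    query = urlsplit(url).query
    return dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)


# The href exactly as it appears in the HTML source.
href_in_source = "http://www.example.com/blah?hl=en&amp;answer=35653"

# Clicked link: the HTML parser decodes &amp; to & before the URL is requested.
print(query_params(unescape(href_in_source)))

# Pasted into the address bar: nothing is decoded, so the second
# parameter ends up being named "amp;answer" instead of "answer".
print(query_params(href_in_source))
```

In the pasted case the server never sees an "answer" parameter at all, which is why the page does not come up as expected.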
 
