Characters in Meta Tags

no no's in the tags

         

mansfield smooth

10:24 pm on Sep 19, 2002 (gmt 0)

10+ Year Member



I am revisiting the meta tags for my site for the first time in a while and was wondering whether there are any characters that I should avoid in them.

Specifically I would like to shorten the tags and put the "&" character in the description.

Is this advisable? And are there any rules regarding characters in the tags?

Thanks in advance

andreasfriedrich

10:48 pm on Sep 19, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As a general rule meta tags are not considered to be important for SEO purposes.

The description property is probably the most important one since it is displayed by some search engines on the SERPs.

The content attribute value may contain CDATA [w3.org] both in HTML4 [w3.org] and XHTML [w3.org].

You should use '&amp;' instead of '&' in the content attribute. Other characters are ok as long as you specify the correct character encoding [w3.org].
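
A minimal sketch of that escaping, assuming the tag is generated with Python. `meta_description` is a hypothetical helper name used only for illustration; `html.escape` is the standard library function that replaces `&` with `&amp;` and also escapes quotes so the attribute value stays intact.

```python
from html import escape


def meta_description(text: str) -> str:
    """Build a meta description tag with an escaped content attribute.

    Hypothetical helper for illustration, not part of any library.
    """
    # escape() turns & into &amp;, < into &lt; and > into &gt;.
    # With quote=True it also escapes double and single quotes.
    safe = escape(text, quote=True)
    return '<meta name="description" content="{}">'.format(safe)


print(meta_description("Widgets & gadgets"))
# prints: <meta name="description" content="Widgets &amp; gadgets">
```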

mansfield smooth

8:39 am on Sep 20, 2002 (gmt 0)

10+ Year Member



Thanks Andreas.

I am not sure how I specify the correct character encoding. How do I do this?

Also I have had some success tweaking the title tag in the past, why is this not important?

jatar_k

8:47 am on Sep 20, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



The title tag is the most important but it is not, in essence, a META tag, it is a TITLE tag.

Splitting hairs but true.

For specifying character encoding andreas just gave you a great link. It tells all and much better than I could explain it.

mansfield smooth

9:01 am on Sep 20, 2002 (gmt 0)

10+ Year Member



sorry, forgot that title was not a meta, my mistake :)

I am finding the link a little above my head and have one question:

does a page have to have its "Content-Type" defined? if not does the browser default to one?

thanks

andreasfriedrich

2:03 pm on Sep 20, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



does a page have to have its "Content-Type" defined?

The Content-Type is a field in the HTTP header (14.17 Content-Type) specifying the media type (text/html, image/png) and in its charset parameter the character encoding. The HTTP protocol [ietf.org] does not require that field. In fact it specifies a default character encoding in the absence of the charset parameter.
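
To make the two parts of that header concrete, here is a small sketch that splits a Content-Type value into its media type and charset parameter. It assumes Python and borrows the standard library's `email.message.Message` class for the parsing; `parse_content_type` is a hypothetical name, and the header values are made-up examples.

```python
from email.message import Message


def parse_content_type(value: str):
    """Return (media type, charset or None) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() gives the lowercased media type.
    # get_content_charset() gives the lowercased charset parameter,
    # or None when the header carries no charset parameter.
    return msg.get_content_type(), msg.get_content_charset()


print(parse_content_type("text/html; charset=ISO-8859-1"))
print(parse_content_type("text/html"))
```

The second call returns `None` for the charset, which is the case the original question is about: the header is present but says nothing about the encoding.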

Whether the HTML spec requires a valid document to specify its character set and encoding is a totally different matter. It must be answered from the HTML spec:

To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
  • An HTTP "charset" parameter in a "Content-Type" field.
  • A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  • The charset attribute set on an element that designates an external resource.
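
The priority order quoted above can be sketched as a simple lookup. This is a simplification under stated assumptions: the three arguments are hypothetical values that a user agent would already have extracted from the HTTP header, the META declaration, and the linking element, and `pick_encoding` is an illustrative name, not a real API.

```python
from typing import Optional


def pick_encoding(http_charset: Optional[str],
                  meta_charset: Optional[str],
                  link_charset: Optional[str]) -> Optional[str]:
    """Return the first declared encoding in priority order, else None."""
    # Highest priority first: HTTP header, then META, then the
    # charset attribute on the element that linked to the resource.
    for candidate in (http_charset, meta_charset, link_charset):
        if candidate:
            return candidate.strip().lower()
    # Nothing was declared, so this sketch assumes no default.
    return None


print(pick_encoding(None, "ISO-8859-1", "utf-8"))
# prints: iso-8859-1
```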

In Using Character Encodings [htmlhelp.com] Liam Quinn writes that "[a]n HTML document must specify its character encoding." However, I did not find evidence to support that in the HTML4.01 Spec. In section 5.1 The Document Character Set [w3.org] it says that "SGML requires that each application (including HTML) specify its document character set." Character set and encoding are not the same (see the note at the bottom of this post), so I'm a bit baffled by that. To me it seems that you are not required to specify the character encoding, although I would suggest you do if you want your pages to display correctly. Perhaps someone more knowledgeable in these matters can shed some light on that.

if not does the browser default to one?

The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as a default character encoding when the "charset" parameter is absent from the "Content-Type" header field. In practice, this recommendation has proved useless because some servers don't allow a "charset" parameter to be sent, and others may not be configured to send the parameter. Therefore, user agents must not assume any default value for the "charset" parameter.

Fourth paragraph after the Specifying the character encoding [w3.org] heading.

There is also a section on Using national and special characters in HTML [cs.tut.fi] on Jukka Korpela's excellent IT and communication [cs.tut.fi] website.

Hope this helps

Andreas

------

At least in this context they are not. Evidence: same level section headings in the HTML spec and the following quote:

The document character set, however, does not suffice to allow user agents to correctly interpret HTML documents as they are typically exchanged -- encoded as a sequence of bytes in a file or during a network transmission. User agents must also know the specific character encoding that was used to transform the document character stream into a byte stream.

In RFC 2616 - HTTP Protocol [ietf.org] they are used interchangeably. See note in section 3.4 Character Sets.
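
A short sketch of the difference, assuming Python: the character set assigns a character its number (code point), while the encoding decides which bytes represent that number. `describe` is a hypothetical helper name, and the two encodings are just examples.

```python
def describe(ch: str) -> dict:
    """Show one character's code point and its bytes in two encodings."""
    return {
        "code_point": ord(ch),                # position in the character set
        "iso-8859-1": ch.encode("iso-8859-1"),  # one byte in this encoding
        "utf-8": ch.encode("utf-8"),          # two bytes in this encoding
    }


# "\u00e9" is the letter e with an acute accent.
print(describe("\u00e9"))
# prints: {'code_point': 233, 'iso-8859-1': b'\xe9', 'utf-8': b'\xc3\xa9'}
```

The code point is the same in both cases; only the byte sequence differs, which is why a user agent needs to know the encoding to read the bytes back correctly.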

mansfield smooth

8:52 am on Sep 21, 2002 (gmt 0)

10+ Year Member



Andreas

That's more than helpful!

Thank you very much indeed :)