

PHP Server Side Scripting Forum

Doctype and Encoding for php
What do you use?

 12:12 am on Aug 9, 2004 (gmt 0)

I'm on a site that validates your site and checks for errors... it's telling me I have no doctype or encoding used, so it uses Doctype HTML 4.01 Transitional, and that it's using encoding utf-8.

It later says my page is not a valid HTML 4.01 Transitional page...

I have a site that loads up a page, then loads up dynamic pages and a menu... what would you use as a doctype? And the encoding, what would you use for that? How do you possibly know!?



 4:45 am on Aug 9, 2004 (gmt 0)

The doctype depends on what your html looks like. It doesn't really matter whether the page is generated with PHP or not; what matters is what the final code sent to the browser looks like. HTML 4 Transitional is probably a good one to start with. The W3C maintains a list of valid doctypes [w3.org] that you can choose from.

As for character encoding, the key thing is that it matches the actual character encoding. If you are saving your files with software that uses Windows-1252 (you probably aren't unless you're using fairly old software), you need to use that for your character set declaration. There is an excellent "Ask the W3C" article on the subject [webstandards.org] from the Web Standards Project as well as a useful note from the W3C itself [w3.org] that should help you straighten it out.

For more extensive reading, check out the W3C Tutorial on character sets and encoding [w3.org]
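
To make the "declaration must match the actual encoding" point concrete, here is a minimal sketch. It is in Python only because that makes the bytes easy to inspect; the sample word "café" is my own, not from your site.

```python
# A sketch only: "café" is an arbitrary sample string, not from the thread.
text = "café"

# The bytes an editor would write when saving the file as UTF-8.
utf8_bytes = text.encode("utf-8")

# Declared encoding matches the real one: the text survives intact.
matched = utf8_bytes.decode("utf-8")

# Declared encoding (Windows-1252) does not match the real one:
# the two bytes of "é" are shown as two unrelated characters.
mismatched = utf8_bytes.decode("cp1252")

print(matched)     # café
print(mismatched)  # cafÃ©
```

The file itself never changed; only the label the reader applied to the bytes did.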



 3:56 pm on Aug 9, 2004 (gmt 0)

I use xhtml transitional since php already tries to produce xhtml like markup in its html functions like nl2br() - and for a number of other reasons as well (more accurate web design - html transitional often sends browsers into 'quirks mode', which makes things render differently). Most people who are really interested in validation either go xhtml transitional or html strict. You really don't want to use an xhtml strict dtd since a lot of browsers can't handle that very well.

No huge differences between xhtml and html, no big learning curve - read more about it at w3schools.

And if in doubt, just copy the header of ****, try to write valid code, and see if it validates.

Ergophobe: thanks for the wasp article on encodings!


 7:27 pm on Aug 9, 2004 (gmt 0)

The validator I use is the W3 one, but I can't work out which doctype suits my site. I've picked every doctype in the list and it says I'm not valid for that. I've even tried the xhtml ones and it still says not valid for my page.

As for my character encoding I use notepad for any HTML and content, and the css file makes all the characters Times New Roman if that makes a difference on which encoding i need.


 10:21 pm on Aug 9, 2004 (gmt 0)

You might be misunderstanding what it means when it tells you the page is not valid for that doctype. You have to pick the doctype you want to validate against, put that doctype into your document, and then *fix* whatever errors the validator spits back at you until you are told your page is valid.


 1:02 am on Aug 10, 2004 (gmt 0)

all the characters Times New Roman if that makes a difference.

No difference at all. The character encoding and the font are independent beasts. The font declaration (be it in html or in an MS Word document) is basically a suggestion to the OS on how a given character should look. The encoding is the code that tells the OS what the character is.

Not to get too specific without knowing the details of what you're doing, but I'll guess that you are using just the Western European character set and you are working on Windows. In that case the most likely encodings are UTF-8, Windows-1252 or ISO-8859-1.

There is an easy way to tell whether or not you are using Windows-1252 or ISO-8859-1. Create an html page with some of the following characters
- an "oe" ligature
- some curly "quotes"

Now look at it in your browser. Set the character encoding to Windows-1252 in your browser. If it looks wrong, you are probably using ISO-8859-1 encoding. If it looks right, you're probably using Windows-1252.
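
Why those particular test characters? A rough sketch in Python (the helper name `can_encode` is made up for this example): the "oe" ligature and curly quotes have byte values in Windows-1252 but do not exist in ISO-8859-1 at all.

```python
# Sample text: an "oe" ligature and a pair of curly quotes.
sample = "\u0153 \u201cquotes\u201d"

def can_encode(text: str, encoding: str) -> bool:
    """Hypothetical helper: True if every character exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Windows-1252 stores these characters in the 0x80-0x9F byte range.
cp1252_bytes = sample.encode("cp1252")

print(can_encode(sample, "cp1252"))      # True
print(can_encode(sample, "iso-8859-1"))  # False
print(cp1252_bytes)                      # b'\x9c \x93quotes\x94'
```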

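Telling ISO-8859-1 from UTF-8 by eye is only impossible when the page is plain ASCII, since both encode those 128 characters with identical bytes. ISO-8859-1 is not a byte-level subset of UTF-8: its 256 characters share their code point numbers with the first 256 of Unicode, but anything above 127 (an accented letter, say) is one byte in ISO-8859-1 and two bytes in UTF-8, so it will look wrong if the browser is set to the other encoding.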
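
A quick way to see where ISO-8859-1 and UTF-8 agree byte for byte and where they do not, sketched in Python with sample strings of my own choosing:

```python
# Sample strings chosen for illustration only.
ascii_text = "doctype"
accented = "é"

# Plain ASCII text produces identical bytes in both encodings.
same_for_ascii = ascii_text.encode("iso-8859-1") == ascii_text.encode("utf-8")

# A character above code point 127 does not.
latin1_bytes = accented.encode("iso-8859-1")  # one byte
utf8_bytes = accented.encode("utf-8")         # two bytes

print(same_for_ascii)  # True
print(latin1_bytes)    # b'\xe9'
print(utf8_bytes)      # b'\xc3\xa9'
```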

I hope that's helpful rather than just confusing.



 2:09 am on Aug 10, 2004 (gmt 0)

Is there some way for you to set the default for your site to "not validate" so that you can get a better handle on what's really happening?

As with so many "host-offered" functions, this one may cause more troubles than it prevents....


 8:34 am on Aug 10, 2004 (gmt 0)

Actually the character set explanation did help. Now I see what you're saying; I've been staring at utf-8 and I think I'll use that because it seems to be the most commonly used. As for the doctype, I chose HTML Strict, and what Sonjay said made sense: whatever it spits out at me I'll fix, so I'll look for the one that my page looks right in and has the least errors. Only problem is W3 doesn't explain a few things very clearly -.-

It says for character encoding in html to use the meta tag, so I'm assuming that's correct, yet in some examples they put it where the doctype is defined. And with the doctype, when you add that code, is THAT the regular HTML tag, or do you put in the required doctype and then add the HTML tag below it?

By the way thanks for taking the time to explain and help :)


 9:04 am on Aug 10, 2004 (gmt 0)

The doctype is the first thing on the page - then you put in the HTML tag, like this:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

Why the linebreaks? There's a bug in Netscape 6 that this cures.

Note also that if you put anything above the doctype, IE6 will drop from 'Strict Mode' to 'Quirks Mode'. In Strict Mode, it displays code according to the standards, more or less, but Quirks Mode is for old HTML pages that are full of unclosed tags and such. It allows for many 'quirks' that still render the page how people were used to seeing in older browsers.

Try to use Strict Mode at all times. Alas this means that if you also use XHTML, you cannot add the XML declaration on the top line, because of IE6.

Be careful saving files from Outlook Express too - it can add a comment at the top of your files that throws IE6 into Quirks Mode! That happened to me once - it took me ages to figure out why my layout was suddenly not displaying as it should.
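
If you want to catch this before it bites you, a small check along these lines could be run over the generated output. This is only a sketch in Python: the function name is made up, and it tolerates leading whitespace, which is my assumption about what is harmless rather than something tested in IE6.

```python
def doctype_comes_first(html: str) -> bool:
    """Hypothetical check: True if nothing but whitespace precedes the doctype."""
    # Assumption: leading whitespace is harmless; comments and XML declarations are not.
    return html.lstrip().lower().startswith("<!doctype")

good = '<!DOCTYPE html\nPUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"'
with_xml_declaration = '<?xml version="1.0" encoding="utf-8"?>\n' + good
with_comment = "<!-- saved from somewhere -->\n" + good

print(doctype_comes_first(good))                  # True
print(doctype_comes_first(with_xml_declaration))  # False
print(doctype_comes_first(with_comment))          # False
```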


 6:06 pm on Aug 10, 2004 (gmt 0)

wait wait wait... i'm using html and xhtml? I don't use XML I use php, or is it really the same thing -.-

And if I had both wouldn't that create an error since I have two types? Or is it really how it should be haha.

So that's gonna go up top then make a head tag and put in the meta tag with the encoding (using notepad for all this) correct?

That was the most detailed i've seen yet ^.^


 12:21 pm on Aug 11, 2004 (gmt 0)

Sorry, my example was from an XHTML file. Use a similar format but with whichever doctype you've chosen.


 5:21 pm on Aug 11, 2004 (gmt 0)

So that's gonna go up top then make a head tag and put in the meta tag with the encoding (using notepad for all this) correct?


Best bet for the least pain is this

- use an html4 transitional doctype

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

- use a simple <html> tag without xml or lang attributes
- put the encoding in a meta tag. This should be the first tag inside <head> because, once encountered, if the encoding is different from what the browser was expecting, it has to start parsing all over again.

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
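
Putting those pieces in order, here is a rough sketch of how a script might assemble the top of the page. It is written in Python purely for illustration; the `page_top` helper and the sample title are hypothetical, and the meta tag is placed as the first element inside <head>.

```python
import html

# Full HTML 4.01 Transitional doctype, including the DTD URL.
DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
    '"http://www.w3.org/TR/html4/loose.dtd">'
)
CHARSET_META = '<meta http-equiv="Content-Type" content="text/html;charset=utf-8">'

def page_top(title: str) -> str:
    """Hypothetical helper: doctype first, then <html>, then the charset meta tag."""
    lines = [
        DOCTYPE,
        "<html>",
        "<head>",
        CHARSET_META,  # before <title>, so the browser knows the encoding early
        f"<title>{html.escape(title)}</title>",
        "</head>",
    ]
    return "\n".join(lines)

top = page_top("My site")
print(top)
```

Whatever generates the page, the script file itself also has to be saved in the encoding the meta tag declares.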

Then run it through the validator and see how it goes. If you get errors, try to follow the suggestions offered by the validator. If you can't work it out, post questions about those errors in a new thread over in the HTML forums, since these are really HTML questions.

Good luck!



 8:36 pm on Aug 11, 2004 (gmt 0)

Oh this is html >.< oops, but i was wondering if there was a PHP form, but i will do that, now i have to post a whole new one for variable lol.


 9:33 pm on Aug 11, 2004 (gmt 0)

i was wondering if there was a PHP form

I don't think I understand what you mean?

- doctype, <html> tags and encoding hints are for the benefit of the browser
- PHP is on the server so by definition there is no PHP equivalent.



 10:19 pm on Aug 11, 2004 (gmt 0)

Lol just like XHTML and HTML i was wondering if a phtml (php) doctype was out there ^.^


 8:24 am on Aug 12, 2004 (gmt 0)

No need - all PHP does is create HTML. The user doesn't see the PHP, just the HTML it outputs.

