Forum Moderators: coopster

PHP-based HTML validator?

something to say "your HTML is awful"

         

httpwebwitch

3:30 am on Apr 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does anyone know of a good PHP-based HTML validator? Something that resembles what they have at the W3C, or maybe something that can point out obvious syntax errors and tag typos? Preferably open source?

Or, any suggestions on how to build one from scratch? (I could translate something that already exists in Perl, Java, ASP, etc.)

httpwebwitch

3:32 am on Apr 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe it would load an HTML file, build an XML DOM, and validate each node according to its type using a library of valid tags and attributes...
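
A minimal sketch of that idea, written in Python rather than PHP so it can be translated like any other existing implementation. It swaps the DOM for a streaming parser (the standard library's `html.parser`) and checks each tag and attribute against a whitelist. The `ALLOWED` table, `VOID_TAGS`, and the `check_html` helper are hypothetical names invented for this illustration, and the whitelist is far too small for real use. This only catches typos and unbalanced tags; it is not DTD validation.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny whitelist: tag name -> allowed attribute names.
ALLOWED = {
    "html": set(),
    "head": set(),
    "title": set(),
    "body": set(),
    "p": {"class", "id"},
    "a": {"href", "title", "class", "id"},
    "img": {"src", "alt", "width", "height"},
    "br": set(),
}
VOID_TAGS = {"br", "img"}  # elements that never take a closing tag


class WhitelistChecker(HTMLParser):
    """Collects messages about unknown tags, unknown attributes and bad nesting."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()
        if tag not in ALLOWED:
            self.problems.append(f"line {line}: unknown tag <{tag}>")
        else:
            for name, _value in attrs:
                if name not in ALLOWED[tag]:
                    self.problems.append(
                        f"line {line}: unknown attribute '{name}' on <{tag}>"
                    )
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # also reached for XHTML-style <br />, nothing to close
        line, _column = self.getpos()
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"line {line}: unexpected </{tag}>")


def check_html(source):
    """Return a list of problem messages; an empty list means nothing was flagged."""
    checker = WhitelistChecker()
    checker.feed(source)
    checker.close()
    for tag in checker.open_tags:
        checker.problems.append(f"unclosed <{tag}>")
    return checker.problems


if __name__ == "__main__":
    for message in check_html('<p class="x">Hi <blnk>there</blnk></p>'):
        print(message)
```

The parser lowercases tag and attribute names before the handlers see them, so the whitelist only needs lowercase entries.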

ergophobe

7:35 pm on Apr 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could use SP, which is the engine that the W3C validator uses, and just access it from PHP as a system call.

httpwebwitch

1:15 am on Apr 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



um OK it seems like you're saying something smart but I don't understand it at all

could you elaborate?

encyclo

1:22 am on Apr 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can get everything here: [validator.w3.org...] - but the W3C validator is written in Perl, not PHP. You could be brave and rewrite the thing (don't forget to release the code if you do!), but if you've got Perl available, it might not be worth the hassle.

The W3C validator has gone through years of development, and is going to be much better than anything built from scratch in a hurry.

An alternative (also written in Perl) is the WDG validator: [htmlhelp.com...]

ergophobe

3:10 pm on Apr 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sorry, I was off for a few days. SP is the SGML parser that the W3C validator used to be based on, if I'm not mistaken (it used to be linked from their page about the validator, which is how I found it). Anyway, you can compile it as a binary and run it as a system call. A valid document produces no output, so an empty result gives you an easy pass/fail test.

[jclark.com...]
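
A rough sketch of the "system call, empty output means valid" approach, in Python for illustration (in PHP the same pattern would go through one of its process-execution functions). It assumes the SP parser is installed as `nsgmls` and that its `-s` flag suppresses normal output so that only error messages are printed. The `is_valid` helper and its `command` parameter are hypothetical names made up for this example, and any DTD or catalog setup SP needs for your document type is not shown.

```python
import subprocess


def is_valid(path, command=("nsgmls", "-s")):
    """Run an external checker on `path`.

    Returns (ok, messages). `ok` is True when the checker printed nothing,
    following the "a valid document returns no output" rule described above.
    The default command is an assumption about how SP is installed locally.
    """
    result = subprocess.run(
        [*command, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # errors normally arrive on stderr, so merge it
        text=True,
    )
    messages = result.stdout.strip()
    return messages == "", messages


if __name__ == "__main__":
    ok, messages = is_valid("page.html")  # hypothetical file name
    print("valid" if ok else messages)
```

If the checker binary is not on the path, `subprocess.run` raises `FileNotFoundError`, which a real script would want to catch and report separately from a validation failure.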

I use it locally on my computer all the time and have not noticed any differences from the W3C validator in terms of what is valid (in terms of how results are interpreted and presented, yes). That said, I think encyclo is right and you will probably do better with the W3C code, since there is probably no validator that has been more extensively tested.

Tom

httpwebwitch

3:41 pm on Apr 30, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the tip - I checked out the W3C Perl source (yikes) - I don't think I'll be translating that to PHP any time soon. I'd probably just get it finished by the time W3C comes up with HTML version 12.

I might find a way to send my code to W3C and grab the output with a GET request or somethin'. This thing won't be used very often, it's not worth investing months of development time.
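
For what it's worth, the GET idea could look something like the sketch below (Python here, but it is only a few lines in any language). Everything specific in it is an assumption: the endpoint is a placeholder, the `uri` query parameter name is a guess at how a validator service might accept the address of the page to check, and `build_check_url` and `fetch_report` are hypothetical helpers. Check the validator's own documentation for the real interface. Note that a GET request can only carry the address of a publicly reachable page, not the markup itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_check_url(endpoint, page_url):
    """Build the request URL; the `uri` parameter name is an assumption."""
    return endpoint + "?" + urlencode({"uri": page_url})


def fetch_report(endpoint, page_url, timeout=30):
    """Fetch the validator's response body as text (needs network access)."""
    with urlopen(build_check_url(endpoint, page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder endpoint, not a real service.
    print(build_check_url("https://validator.example/check",
                          "http://example.com/page.html"))
```

Since the tool would only be used occasionally, a real version should also go easy on the remote service: cache results and avoid checking pages in a tight loop.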

Thanks again! httpwebwitch

phpGOD

4:39 pm on May 11, 2004 (gmt 0)



>> You could use SP, which is the engine that the W3C validator uses, and just access it from PHP as a system call.

I have downloaded SP and I have no idea how to use it. I am usually fairly good at learning Unix commands, but not this one.

I have some XHTML 1.0 Transitional documents that I would like to validate. How on earth do I do it?

Eventually I would be doing exactly what you have said: a system call. And I am not going to build a fully fledged validator, just make a note of the pages that don't validate and use the W3C's validator to get more info.

I have been unable to find any examples on the net...