Validating an entire site

HTML-Tidy, what else?

         

Mohamed_E

8:43 pm on Aug 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How do people validate an entire site?

I use HTML-Tidy with some shell scripts. It works well enough, but I wonder what else is out there. I have a slow connection and would much rather do the validation locally.
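For anyone who wants to try the local approach, here is a minimal sketch of a batch run over a directory of files, written in Python rather than shell. It assumes a `tidy` executable is on the PATH and that the site is a tree of `.html`/`.htm` files. The function names and the `TIDY_CMD` constant are made up for illustration and are not the scripts mentioned above.

```python
import subprocess
from pathlib import Path

# Assumed Tidy invocation: -q suppresses nonessential output, -e reports
# errors and warnings only (no cleaned-up markup is written).
TIDY_CMD = ["tidy", "-q", "-e"]

HTML_SUFFIXES = {".html", ".htm"}


def find_html_files(root):
    """Return every .html/.htm file under root, sorted for stable output."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in HTML_SUFFIXES
    )


def check_file(path, cmd=TIDY_CMD):
    """Run the checker on one file and return (exit_code, diagnostics)."""
    result = subprocess.run([*cmd, str(path)], capture_output=True, text=True)
    # Tidy writes its diagnostics to stderr; its exit status is
    # 0 for clean, 1 for warnings, 2 for errors.
    return result.returncode, result.stderr


def check_site(root, cmd=TIDY_CMD):
    """Check every HTML file under root; return only the files with problems."""
    problems = {}
    for path in find_html_files(root):
        code, report = check_file(path, cmd)
        if code != 0:
            problems[path] = report
    return problems
```

Calling `check_site("public_html")` (a hypothetical directory name) would return a dictionary mapping each problem file to Tidy's report for it. Note that this only sees files as they sit on disk, so anything assembled by the server at request time is not checked.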

NickH

4:12 pm on Aug 2, 2003 (gmt 0)

10+ Year Member



I use A Real Validator. It is a shareware HTML validator for Windows, an offline version of the WDG HTML Validator.

Nick

MrSpeed

4:33 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use CSE Validator, which has a batch mode that I use to validate entire sites.

The only issue is that you need to manually paste in the URLs you want validated. It won't crawl/spider the site.

NickH

6:07 pm on Aug 4, 2003 (gmt 0)

10+ Year Member



CSE Validator can validate an entire directory in batch mode, so it is suitable for validating a site locally.

It's worth noting that, strictly speaking, CSE Validator is not actually a validator. Once you've configured it as desired, though, it is very useful, as it offers advice regarding style and accessibility.

From the CSE help file:

"NOTE: CSE HTML Validator cannot check for a completely syntactically correct document, although it can find many syntactic errors and offer useful advice and assistance in creating your documents.

NOTE: CSE HTML Validator is not a "real validator" in the strict technical definition of "validator". Instead, it is based on a powerful engine that is custom designed for HTML, XHTML, and CSS. Because of this, it is capable of finding many problems that a "real" validator cannot."

Nick

moltar

6:25 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is there a validator that can crawl the site?

I have a really dynamic site which uses lots of SSI, and I can't use A Real Validator, because it can only validate files. I also have many pages and don't want to paste them all in.

pageoneresults

6:38 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmmm, I've been looking for this functionality in an online tool and have not found it yet. The closest I've come is the WDG Validator [htmlhelp.com]. It will spider and validate 100 pages at a time. I've not figured out how to get it to move on to the next 100 pages.

Make sure to check "Validate entire site". I also uncheck "Show input" to minimize the amount of WDG's resources used.

Mohamed_E

7:22 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NickH, MrSpeed,

Thanks for the pointers. I was hoping to find a tool smart enough to crawl the entire site. How much extra code would that involve, compared to the validation code?

pageoneresults,

Thanks for the link to WDG. Funny that their online validator can crawl a site but their standalone version cannot!

With Tidy, and a UNIX-like shell, I can do a lot of things that I simply cannot do with a pure GUI product, so I guess I will stick to Tidy.

pageoneresults

7:37 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mohamed_E, you may want to browse the WDG HTML Validator Source Code [htmlhelp.com].

HTMLLinkExtractor.pm
This Perl module extracts possible HTML links from an HTML document when spidering a site.

I'm not a programmer, but I would think the above is what works hand in hand with spidering the entire site. If it is available from that page, maybe it is something you would have to incorporate yourself to get the same spidering locally. I'm guessing at this point...
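To give a rough idea of how link extraction and spidering fit together, here is a small sketch using only the Python standard library. It is not the WDG module and makes no claim about how WDG does it: the `LinkParser` class, the `crawl` function, and the 100-page default limit are all illustrative choices. It follows only `<a href>` links on the same host as the start page and skips any page that fails to download.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and decode it leniently (UTF-8 is a guess)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetcher=fetch, limit=100):
    """Breadth-first crawl; return the same-host pages that were fetched."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = []
    while queue and len(pages) < limit:
        url = queue.popleft()
        try:
            markup = fetcher(url)
        except OSError:
            continue  # unreachable page: skip it, keep crawling
        pages.append(url)
        parser = LinkParser()
        parser.feed(markup)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

The list returned by `crawl` can then be handed to whatever checker you prefer. The crawling part is short, but a real spider would also want to respect robots.txt, skip non-HTML responses, and pause between requests.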

Mohamed_E

8:24 pm on Aug 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the link to the source code!

For really hard-core programmers, they supply the source code to an offline validator [htmlhelp.com]. But that looks like a major undertaking, and in any case I am still operating in a WIN2K environment.

To get the list of files I want to validate with HTML-Tidy, I use the Xenu link checker: I save its output and edit it down to a list of the files it found by crawling. I then feed these files, one by one, to Tidy. I do have to run Xenu manually, but all the rest can be automated.
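The automated half of that workflow might look something like the sketch below. It assumes the edited list is a plain text file with one URL per line (the layout of Xenu's own output is not assumed here), that `tidy` is on the PATH, and that each page is fetched over HTTP and piped to Tidy on standard input. All function names are hypothetical.

```python
import subprocess
from urllib.request import urlopen

# Assumed Tidy invocation: quiet, errors and warnings only. With no file
# argument, Tidy reads the markup from standard input.
TIDY_CMD = ["tidy", "-q", "-e"]


def read_url_list(path):
    """Return the non-blank lines of the list file, stripped of whitespace."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def download(url):
    """Fetch a page and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def check_markup(markup, cmd=TIDY_CMD):
    """Pipe raw page bytes to the checker; return (exit_code, diagnostics)."""
    result = subprocess.run(cmd, input=markup, capture_output=True)
    return result.returncode, result.stderr.decode("utf-8", errors="replace")


def check_urls(list_path, downloader=download, cmd=TIDY_CMD):
    """Check every listed URL; return a report for each page with problems."""
    reports = {}
    for url in read_url_list(list_path):
        code, report = check_markup(downloader(url), cmd)
        if code != 0:
            reports[url] = report
    return reports
```

Because the pages are fetched from the server rather than read from disk, this checks the markup as delivered, which matters for sites that use SSI or other server-side assembly.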