How to block W3 validator from accessing your site?

         

raumgleiter

8:39 am on Jan 12, 2009 (gmt 0)

10+ Year Member



It was a bit difficult to put this in the right category but I guess HTML was the best choice:

I am trying to figure out how you can block the W3C validator from accessing your site. As an example I looked at <a particular website>.

Go to http://validator.w3.org and enter the URL there and you will see what I mean. It shows up as "couldn't access page...... 403 forbidden".

How did they do this? At first I thought it had to do with robots.txt, but it is apparently something else: I tried copying their robots.txt to a test site, and the validator could still access that site afterwards.

Thanks for any help.

[edited by: tedster at 9:14 am (utc) on Jan. 12, 2009]
[edit reason] no specific websites, please [/edit]

tedster

9:21 am on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is easily done on any web server - for instance, on an Apache server you might use an .htaccess rule. See this thread in our Apache forum, for one discussion: How to block IP Address? [webmasterworld.com]

You can find more threads on the topic with Site Search [webmasterworld.com]

Wlauzon

12:10 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I guess my question, as always with these types of questions, is: why would you want to?

phranque

1:18 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Some webmasters forbid any user agent that doesn't look like a browser.

Wlauzon

6:23 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I can see blocking as a group, but there are far too many to just try and block each one individually.

Samizdata

7:34 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you really want to block it, the current user-agent is:

W3C_Validator/1.606

Presumably the version number changes occasionally.
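Since the version number may change, it is probably safer to match on the product name than on the full string. Here is a rough sketch in Python of what such a check could look like, assuming the site sits behind a WSGI application. The names `is_w3c_validator` and `block_validator` are made up for illustration, and this is not necessarily how the site in question does it.

```python
import re

# Matches the product token regardless of version, e.g. "W3C_Validator/1.606".
VALIDATOR_UA = re.compile(r"W3C_Validator/\d", re.IGNORECASE)


def is_w3c_validator(user_agent):
    """Return True if the User-Agent header names the W3C validator."""
    return bool(user_agent) and VALIDATOR_UA.search(user_agent) is not None


def block_validator(app):
    """Hypothetical WSGI middleware: answer 403 to the validator, pass everything else through."""
    def middleware(environ, start_response):
        if is_w3c_validator(environ.get("HTTP_USER_AGENT", "")):
            body = b"403 Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Any other client is handed to the wrapped application unchanged.
        return app(environ, start_response)
    return middleware
```

Keep in mind that a user-agent string is trivial to fake, so this only stops clients that identify themselves honestly.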

...

encyclo

8:30 pm on Jan 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The W3C IP addresses are very stable, so you can block the validators that way.
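As a rough illustration of the IP-based approach, here is a small Python sketch using the standard `ipaddress` module. The network below is a placeholder from the reserved documentation range, not a real W3C address; you would have to look up the actual addresses yourself. The names `BLOCKED_NETWORKS` and `is_blocked` are made up for this example.

```python
import ipaddress

# Placeholder only: 192.0.2.0/24 is a reserved documentation range (TEST-NET-1),
# NOT an actual W3C address. Replace it with the addresses you want to block.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(remote_addr):
    """Return True if the client address falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address; treat it as not blocked.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)
```

A request handler would call `is_blocked()` with the client's address and answer 403 when it returns True.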

"why would you want to?"

There are several options on the validators to do recursive checks on a site (e.g. the link checker), so there can be a bandwidth impact.