
Forum Moderators: goodroi


Why is the WebmasterWorld robots validator giving back HTML?



10:50 am on Mar 10, 2007 (gmt 0)

5+ Year Member

2 sections in my robots.txt file:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/ (rest of section just goes on with other dirs)

# allow google image bot to search all images
User-agent: Googlebot-Image
Allow: /*.gif$
Allow: /*.png$
Allow: /*.jpeg$
Allow: /*.jpg$
Allow: /*.ico$
Allow: /*.jpg$
Allow: /images/

nothing more in file

When I use the WebmasterWorld robots text validator I get line after line like these:

ERROR Invalid line:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
3 ERROR Invalid fieldname:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 ERROR Invalid line:

It is reading the HTML text of my index page. What is going on?

This is in the head section of the page:

<meta name="robots" content="index, follow" />

Can't think of any more info that might be relevant.

Thanks much.


2:32 pm on Mar 10, 2007 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Are you sure you're specifying the full path to the robots.txt file?


Also check that you're not using capitals in the filename (e.g. Robots.txt), or you will get a 404 Not Found error.
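
For anyone who wants to check this before pasting into a validator, here is a rough Python sketch of the two points above. The helper names `robots_url` and `looks_like_html` are made up for illustration, and the `http://` default is an assumption; this is not how the WebmasterWorld validator itself works. It builds the full lowercase /robots.txt URL from whatever was typed in, and flags text that starts like an HTML page rather than a robots.txt file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(site: str) -> str:
    """Return the full robots.txt URL for a bare domain or any URL on the site.

    Hypothetical helper: assumes http:// when no scheme is given.
    """
    if "://" not in site:
        site = "http://" + site
    parts = urlsplit(site)
    # robots.txt sits at the site root and the filename is all lowercase.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def looks_like_html(body: str) -> bool:
    """Heuristic only: does the fetched text start like an HTML page?"""
    head = body.lstrip().lower()
    return head.startswith("<!doctype") or head.startswith("<html")


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(robots_url("example.com"))
    print(looks_like_html('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">'))
```

If `looks_like_html` returns True for what the validator fetched, the URL most likely pointed at a page (such as the index page) and not at the robots.txt file.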


7:11 pm on Mar 10, 2007 (gmt 0)

5+ Year Member

Duh.
I wasn't giving the full URL. For hours I'd been typing in just the domain name, both for my own site and for that of a plagiarizer I'm in the process of reporting, and I just kept doing it.

Sorry to have dirtied up the list with an idiot post!

But thanks.

