Why is the WebmasterWorld robots.txt validator giving back HTML?

Clair

10:50 am on Mar 10, 2007 (gmt 0)

10+ Year Member



2 sections in my robots.txt file:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
# (rest of section just goes on with other dirs)

# allow google image bot to search all images
User-agent: Googlebot-Image
Disallow:
Allow: /*.gif$
Allow: /*.png$
Allow: /*.jpeg$
Allow: /*.jpg$
Allow: /*.ico$
Allow: /*.jpg$
Allow: /images/

nothing more in file

When I use the WebmasterWorld robots text validator I get line after line like these:

ERROR Invalid line:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
3 ERROR Invalid fieldname:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 ERROR Invalid line:
<head>

It is reading the HTML text of my index page. What is going on?

This is in the head section of the page:

<meta name="robots" content="index, follow" />

Can't think of any more info that might be relevant.

Thanks much.

encyclo

2:32 pm on Mar 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are you sure you're specifying the full path to the robots.txt file?

http://www.example.com/robots.txt

Also check that you're not using capitals in the file name (e.g. Robots.txt), or you will get a 404 Not Found error.
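
If it helps, here is a rough Python sketch of the check I mean. It is only an illustration, not how the validator itself works: the helper names robots_url and looks_like_html are made up for this post, and it assumes http:// when no scheme is given. The first helper turns a bare domain (or any URL on the site) into the full lowercase /robots.txt URL; the second is a crude test for whether what came back starts like an HTML page rather than a robots.txt file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(site: str) -> str:
    """Return the full robots.txt URL for a bare domain or any URL on the site.

    Hypothetical helper for illustration only.
    """
    # Assumption: default to http:// when the scheme is missing,
    # as with a bare domain such as "www.example.com".
    if "://" not in site:
        site = "http://" + site
    parts = urlsplit(site)
    # The file sits at the root of the host and its name is all lowercase.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def looks_like_html(body: str) -> bool:
    """Rough check: does the fetched text start like an HTML page?"""
    start = body.lstrip().lower()
    return start.startswith("<!doctype") or start.startswith("<html")


if __name__ == "__main__":
    # A bare domain and a wrongly capitalised path both map to the same URL.
    print(robots_url("www.example.com"))
    print(robots_url("http://www.example.com/Robots.txt"))
```

If looks_like_html returns True for the text a validator is complaining about, the validator was most likely pointed at a page (for example the site's index page) and not at the robots.txt file.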

Clair

7:11 pm on Mar 10, 2007 (gmt 0)

10+ Year Member



duh --
I wasn't giving the full URL. For hours I'd been typing in just my domain name (and that of a plagiarizer I'm in the process of informing on), and I just kept doing it.

Sorry to have dirtied up the list with an idiot post!

But thanks.