
Forum Moderators: goodroi


Why is the WebmasterWorld robots.txt validator giving back HTML?

     
10:50 am on Mar 10, 2007 (gmt 0)

Junior Member

5+ Year Member

joined:Jan 29, 2007
posts:123
votes: 0


There are two sections in my robots.txt file:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/ (rest of section just goes on with other dirs)

# allow google image bot to search all images
User-agent: Googlebot-Image
Disallow:
Allow: /*.gif$
Allow: /*.png$
Allow: /*.jpeg$
Allow: /*.jpg$
Allow: /*.ico$
Allow: /*.jpg$
Allow: /images/

nothing more in file
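
For readers who want to check how a file like this is interpreted, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is a trimmed copy of the file above, and the agent name SomeOtherBot and the test paths are made up for illustration. The wildcard Allow lines are left out because this parser only does plain prefix matching, so this is not how Googlebot itself evaluates them.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the file above. The wildcard Allow lines are omitted
# because urllib.robotparser only does plain prefix matching.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/

User-agent: Googlebot-Image
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if this parser lets `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


# "SomeOtherBot" and the paths are hypothetical examples.
print(allowed("Googlebot-Image", "/cgi-bin/logo.gif"))
print(allowed("SomeOtherBot", "/cgi-bin/script.pl"))
print(allowed("SomeOtherBot", "/index.html"))
```

The empty Disallow line in the Googlebot-Image section already permits everything for that bot, so under this parser the Allow lines would not change the result.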

When I use the WebmasterWorld robots text validator I get line after line like these:

ERROR Invalid line:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
3 ERROR Invalid fieldname:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 ERROR Invalid line:
<head>

It is reading the HTML text of my index page. What is going on?

This is in the head section of the page:

<meta name="robots" content="index, follow" />

Can't think of any more info that might be relevant.

Thanks much.

2:32 pm on Mar 10, 2007 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9068
votes: 4


Are you sure you're specifying the full path to the robots.txt file?

http://www.example.com/robots.txt

Also check that you're not using capitals (e.g. Robots.txt), or you will get a 404 Not Found error.
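
As a rough illustration of the check above, the sketch below builds the full robots.txt URL from whatever was typed in and tests whether a fetched body looks like an HTML page instead of a robots.txt file. The function names robots_txt_url and looks_like_html are made up for this example, and defaulting to http:// is an assumption. No network request is made here.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(site):
    """Return the robots.txt URL for a bare domain or any URL on the site."""
    # Assumption: a bare domain such as "www.example.com" means http://.
    if "://" not in site:
        site = "http://" + site
    parts = urlsplit(site)
    # Always use the lowercase path at the site root, dropping any
    # other path, query string, or fragment that was typed in.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def looks_like_html(body):
    """Rough check: does the fetched text start like an HTML document?"""
    head = body.lstrip().lower()
    return head.startswith("<!doctype") or head.startswith("<html")


print(robots_txt_url("www.example.com"))
print(looks_like_html('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">'))
```

If looks_like_html returns True for the text handed to a validator, the URL most likely pointed at a page such as the site's index rather than at /robots.txt.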

7:11 pm on Mar 10, 2007 (gmt 0)

Junior Member

5+ Year Member

joined:Jan 29, 2007
posts:123
votes: 0


duh --
I wasn't giving the full URL. For hours I'd been typing in just my domain name (and that of a plagiarizer I'm in the process of reporting), and I just kept doing it.

Sorry to have dirtied up the list with an idiot post!

But thanks.