
Sitemaps, Meta Data, and robots.txt Forum

    
Why is the WebmasterWorld robots.txt validator giving back HTML?
Clair




msg:3277449
 10:50 am on Mar 10, 2007 (gmt 0)

2 sections in my robots.txt file:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
(the rest of this section just goes on with other directories)

# allow google image bot to search all images
User-agent: Googlebot-Image
Disallow:
Allow: /*.gif$
Allow: /*.png$
Allow: /*.jpeg$
Allow: /*.jpg$
Allow: /*.ico$
Allow: /*.jpg$
Allow: /images/

nothing more in file
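
As a rough illustration of how a parser might read these two sections, here is a minimal sketch using Python's standard `urllib.robotparser`. It is an assumption that this parser behaves like the crawlers in question: it matches plain path prefixes only and does not implement the `*` and `$` wildcards, so the wildcard `Allow` lines are left out. The shortened rule list, the `allowed` helper, the `SomeBot` name, and the example.com URLs are all illustrative.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above. The wildcard Allow lines are omitted
# because this parser only does prefix matching (an assumption of the sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/

User-agent: Googlebot-Image
Disallow:
Allow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` on the hypothetical example.com."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("SomeBot", "/cgi-bin/test.cgi"))
print(allowed("SomeBot", "/index.html"))
print(allowed("Googlebot-Image", "/tmp/photo.jpg"))
```

In this parser an empty `Disallow:` line already permits everything for Googlebot-Image, so the `Allow` lines after it are not what grants access. Other parsers may treat the file differently.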

When I use the WebmasterWorld robots text validator I get line after line like these:

ERROR Invalid line:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
3 ERROR Invalid fieldname:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 ERROR Invalid line:
<head>

It is reading the HTML text of my index page. What is going on?

This is in the head section of the page:

<meta name="robots" content="index, follow" />

Can't think of any more info that might be relevant.

Thanks much.

 

encyclo




msg:3277556
 2:32 pm on Mar 10, 2007 (gmt 0)

Are you sure you're specifying the full path to the robots.txt file?

http://www.example.com/robots.txt

Also check that you're not using capitals (e.g. Robots.txt), or you will get a 404 Not Found error.
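
The two checks above can be sketched in a few lines of Python: turn whatever was typed (a bare domain or any URL on the site) into the full, lowercase robots.txt URL. The function name `robots_url` and the choice to default to `http://` when no scheme is given are assumptions made for this example, not something the validator itself is known to do.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(site):
    """Build the robots.txt URL for a bare domain or any URL on the site."""
    if "://" not in site:
        site = "http://" + site  # assumption: plain HTTP when no scheme is given
    parts = urlsplit(site)
    # robots.txt lives at the site root, and the file name is all lowercase.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("www.example.com"))
print(robots_url("http://www.example.com/Robots.txt"))
```

Both calls produce http://www.example.com/robots.txt, which is the form to paste into a validator.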

Clair




msg:3277766
 7:11 pm on Mar 10, 2007 (gmt 0)

Duh.
I wasn't giving the full URL. For hours I'd been typing in just my domain name (and that of a plagiarizer I'm in the process of reporting), and I just kept doing it.

Sorry to have dirtied up the list with an idiot post!

But thanks.
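
For anyone who hits the same symptom, a quick sanity check on the fetched text can tell an HTML page apart from a robots.txt file before validating it. This is only a heuristic sketch: the function names and the list of known field names are assumptions for illustration, not how the WebmasterWorld validator works.

```python
# Field names commonly seen in robots.txt files (not an exhaustive list).
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def looks_like_html(text):
    """Heuristic: does the body start like an HTML page?"""
    head = text.lstrip().lower()
    return head.startswith("<!doctype") or head.startswith("<html")


def unknown_lines(text):
    """Return (line number, line) pairs that are not blank, comments, or known fields."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        if not stripped:
            continue
        field = stripped.split(":", 1)[0].strip().lower()
        if ":" not in stripped or field not in KNOWN_FIELDS:
            bad.append((number, line))
    return bad


page = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">\n<html>\n<head>'
print(looks_like_html(page))
print(len(unknown_lines(page)))
print(unknown_lines("User-agent: *\nDisallow: /tmp/\n"))
```

If `looks_like_html` returns True, the URL given to the validator probably points at a page rather than at /robots.txt.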

