Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Meta Tags & Robots.txt

10+ Year Member

Msg#: 383 posted 1:03 am on May 13, 2004 (gmt 0)

So, if I use a robots.txt file, do I still need to include the

<meta name="robots" content="whatever">

statement in my HTML pages?



WebmasterWorld Senior Member 10+ Year Member

Msg#: 383 posted 1:21 am on May 13, 2004 (gmt 0)

Depends what you want the metatag to do.

If it's simply mirroring a "don't crawl this file" command that you are already handling via robots.txt, then you can remove it. However, if you are using the meta tag to issue a more advanced command that isn't available from robots.txt (noarchive, nofollow, etc.), then you'll need to keep it, because you won't be able to achieve that functionality otherwise.

- Tony
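
To make the distinction concrete, here is a minimal Python sketch that pulls the directives out of a page's robots meta tag. The class name RobotsMetaParser, the helper robots_directives, and the sample page are made up for illustration; only the standard-library html.parser module is assumed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page using directives that robots.txt cannot express.
page = '<html><head><meta name="robots" content="noarchive, nofollow"></head></html>'
print(sorted(robots_directives(page)))  # ['noarchive', 'nofollow']
```

Directives such as noarchive or nofollow only exist on the page itself, which is why the tag has to stay if you rely on them.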


10+ Year Member

Msg#: 383 posted 1:37 am on May 13, 2004 (gmt 0)


I'm actually using most of the robots.txt file from webmasterworld.com: it disallows the bad spiders, allows the good ones by placing a * for the User-agent and a blank Disallow after all the bad spiders, and then disallows certain directories of my site.

In my HTML pages, all but one have the robots meta tag set to index, follow. One page has noindex, nofollow.

The meta tags have been there a while, and I have only just added the robots.txt. So with what I have in the .txt file, are the meta tags still necessary?
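
A setup along these lines can be checked with Python's standard urllib.robotparser. This is only a sketch: the bot names (BadBot, GoodBot) and the directories are placeholders, not the contents of the actual WebmasterWorld file. The blank Disallow line is left out of the catch-all record here, since Python's parser applies the first matching rule and a blank Disallow matches every path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one banned spider, then a catch-all record
# that keeps everyone else out of two directories.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)
print(rules.can_fetch("BadBot", "/index.html"))       # False
print(rules.can_fetch("GoodBot", "/index.html"))      # True
print(rules.can_fetch("GoodBot", "/private/a.html"))  # False
```

Pages that come back True here are the only ones where a robot gets as far as reading the meta tags.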


WebmasterWorld Senior Member jdmorgan (US), WebmasterWorld Top Contributor of All Time, 10+ Year Member

Msg#: 383 posted 1:57 am on May 13, 2004 (gmt 0)

robots.txt takes precedence over on-page meta robots tags, because if a page is disallowed in robots.txt, a good robot won't fetch the page, and so can't read the meta tags.

Be aware that a page disallowed in robots.txt may still appear in some search results if a search engine finds a link to it. It will appear as a URL-only listing, with no title and no description. This is not true of all engines, but Google, Yahoo, and Ask Jeeves have been observed with this behaviour. They don't fetch the page, they just show the link they found in the results.

Further, I noticed recently that Yahoo is now showing such links using whatever link text they found on the link as the title for their result.

The solution to the problem (if it concerns you) is to 'Allow' that page in robots.txt (that is, don't Disallow it), and use the on-page noindex tag to keep it from being listed in search results. This costs you bandwidth, since you have to let the robot read the page.

Anyway, things will be clearer if you remember that the on-page meta robots tags can't be read if the page is disallowed in robots.txt.
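
The order of operations described above can be modelled in a few lines of Python. This is a simplified sketch of how a well-behaved robot might treat a single page, not the behaviour of any particular search engine; the function listing_outcome, the agent name ExampleBot, and the sample files are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobots(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def listing_outcome(robots_txt, agent, path, html):
    """Hypothetical model: robots.txt is consulted before the page is read."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, path):
        # The page is never fetched, so any meta tag in `html` goes unread.
        return "not fetched; URL-only listing possible"
    meta = MetaRobots()
    meta.feed(html)
    if "noindex" in meta.directives:
        return "fetched; kept out of the index"
    return "fetched; may be indexed"


PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
BLOCKED = "User-agent: *\nDisallow: /secret.html\n"
OPEN = "User-agent: *\nDisallow:\n"

print(listing_outcome(BLOCKED, "ExampleBot", "/secret.html", PAGE))
# not fetched; URL-only listing possible
print(listing_outcome(OPEN, "ExampleBot", "/secret.html", PAGE))
# fetched; kept out of the index
```

The same page gives two different outcomes depending only on robots.txt, which is the trade-off: blocking saves bandwidth but leaves the noindex tag unread.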



All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved