Meta Tags & Robots.txt

1:03 am on May 13, 2004 (gmt 0)

New User

10+ Year Member

joined:Dec 26, 2003
posts:23
votes: 0


So, if I use a robots.txt file, then do I need to include the:

<meta name="robots" content="whatever">

statement in my html page?

1:21 am on May 13, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 25, 2002
posts:872
votes: 0


Depends what you want the meta tag to do.

If it simply mirrors a "don't crawl this file" instruction that you already handle via robots.txt, you can remove it. However, if you are using the meta tag to issue a directive that robots.txt can't express (noarchive, nofollow, etc.), you'll need to keep it, because you won't be able to get that behaviour any other way.

- Tony

1:37 am on May 13, 2004 (gmt 0)

New User

10+ Year Member

joined:Dec 26, 2003
posts:23
votes: 0


Tony,

I'm actually using most of the robots.txt file from webmasterworld.com: I disallow the bad spiders, allow the good ones by placing a * for the user-agent and a blank Disallow after all the bad spiders, and then disallow certain directories of my site.
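For illustration only, here is a minimal Python sketch of that kind of layout, checked with the standard library's urllib.robotparser. The bot name BadBot, the /private/ directory and www.example.com are placeholders, not taken from the actual file. The catch-all record in this sketch lists the directory rule directly rather than a blank Disallow followed by more rules, because Python's parser applies the first matching rule in a record and reads a blank Disallow as "allow everything"; other parsers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bad spider is shut out completely,
# everyone else is kept out of a single directory only.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` on the placeholder host."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("BadBot", "/index.html"))         # named bad spider
print(allowed("GoodBot", "/index.html"))        # falls through to the * record
print(allowed("GoodBot", "/private/page.html")) # disallowed directory
```

Only robots that choose to obey robots.txt are affected by any of this; it is a request, not an access control.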

In my HTML pages, all but one have the robots meta tag set to index, follow. One page has noindex, nofollow.

The meta tags have been there a while; I only just added the robots.txt. So with what I have in the .txt file, are the meta tags still necessary?

1:57 am on May 13, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


robots.txt takes precedence over on-page meta robots tags, because if a page is disallowed in robots.txt, a good robot won't fetch the page, and so can't read the meta tags.

Be aware that a page disallowed in robots.txt may still appear in some search results if a search engine finds a link to it. It will appear as a URL-only listing, with no title and no description. This is not true of all engines, but Google, Yahoo, and Ask Jeeves have been observed with this behaviour. They don't fetch the page, they just show the link they found in the results.

Further, I noticed recently that Yahoo is now showing such links using the anchor text of the link they found as the title for their result.

The solution to the problem (if it concerns you) is to 'Allow' that page in robots.txt (that is, don't disallow it), and use the on-page noindex tag to keep it from being listed in search results. This costs you bandwidth, since you have to let the robot fetch the page.

Anyway, things will be clearer if you remember that the on-page meta robots tags can't be read if the page is disallowed in robots.txt.
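The ordering can be sketched in a few lines of Python. This is only a rough model of how a well-behaved robot might decide, not how any particular search engine works; the function name crawl_decision, the class MetaRobotsParser, the agent name ExampleBot and the example.com URL are all made up for the illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def crawl_decision(robots_txt, agent, url, html):
    """Return what a polite robot could conclude about one URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        # The page is never fetched, so its meta robots tag is never read.
        return "not fetched; may still be listed URL-only"
    meta = MetaRobotsParser()
    meta.feed(html)
    if "noindex" in meta.directives:
        return "fetched; kept out of the index"
    return "fetched; indexable"


PAGE = ('<html><head><meta name="robots" content="noindex, nofollow">'
        '</head><body>secret</body></html>')
URL = "http://www.example.com/secret.html"

blocked = "User-agent: *\nDisallow: /secret.html\n"
open_rules = "User-agent: *\nDisallow:\n"

print(crawl_decision(blocked, "ExampleBot", URL, PAGE))
print(crawl_decision(open_rules, "ExampleBot", URL, PAGE))
```

With the page disallowed, the noindex in PAGE is never seen; with the blank Disallow, the robot fetches the page and can honour the tag.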

Jim