Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Robots.txt quick question

10+ Year Member

Msg#: 4251022 posted 4:59 pm on Jan 9, 2011 (gmt 0)


We have a bunch of 404 errors from Webmaster Tools that sit in a directory that has thousands of other valid URLs.

We want to disallow these 404 URLs without affecting the others in the directory like so:

404 error URL: /xyz/keyword/abc/
Valid URL: /xyz/keyword 1/abc/

If I disallow using:

Disallow: /xyz/keyword/abc/

will it disallow only the URL above and let crawlers continue with all the other URLs in /xyz/, or will it end up disallowing all of them?

Thanks in advance for your help.
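For reference, under the original robots exclusion standard a Disallow rule is a plain path-prefix match, so a rule like the one above should only block URLs that begin with that exact prefix (this assumes the major crawlers' standard prefix behavior; the example paths are the ones from the question):

```
User-agent: *
# Blocks /xyz/keyword/abc/ and anything beneath that path
Disallow: /xyz/keyword/abc/

# /xyz/keyword 1/abc/ does NOT start with the disallowed
# prefix (note the space in "keyword 1"), so it stays crawlable
```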



WebmasterWorld Senior Member tangor (US), a WebmasterWorld Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month

Msg#: 4251022 posted 5:12 am on Jan 10, 2011 (gmt 0)

This sounds like something better handled via .htaccess. Robots.txt is not going to help with those SEs that have already found the URIs, as they will keep hitting their already-collected info to see if the link still exists. Feed them a 410 (Gone), a 404 (Not Found, i.e., do nothing), or a 301 to the page you want the SE to find.

This assumes that the bad URIs are NOT THERE. If they are, then WHY?
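As a sketch of the .htaccess approach suggested above (the path is the one from the question; this assumes Apache with mod_alias, which is enabled by default), a single directive can return 410 Gone for the dead URL while leaving everything else in /xyz/ untouched:

```apache
# .htaccess at the site root
# Serve 410 Gone for the dead path only; all other
# URLs under /xyz/ are unaffected
Redirect gone /xyz/keyword/abc/
```

With many such dead URLs, RedirectMatch gone with a regular expression covering the bad pattern would avoid listing each path individually.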

