

robots.txt: Disallow: /links.htm

I want to disallow "links.htm"

   
1:06 pm on Jan 26, 2010 (gmt 0)

5+ Year Member



I want to disallow "links.htm" but allow "links.html".

With "Disallow: /links.htm" in robots.txt both "links.htm" and "links.html" are disallowed. How should my robots.txt look like ?

2:30 pm on Jan 26, 2010 (gmt 0)

jdmorgan (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)



You can add an "Allow" for "/links.html" after the "Disallow" for "/links.htm", but this will only work for the major search engines which support the "Allow:" extension to the Standard for Robot Exclusion. Many search engines don't support this extension.

If you want a solution that works for all robots, then change the name of one of these pages so that a prefix-match no longer results in a "collision" between the two names.

The prefix-matching behavior of robots.txt must be taken into account when naming resources and directories -- along with access control, cache-control, HTTP protocol requirements (naming restrictions), maintainer privilege levels (Who in your organization has access to maintain which directories?), server performance, site organization, and SEO considerations. Picking a good "name" (a URL) for a resource is not something that should or can be done instantly -- it requires some careful consideration.

Jim

10:34 am on Feb 2, 2010 (gmt 0)

5+ Year Member



Alternatively, you could put a 'noindex,nofollow' meta tag in the links.htm file.
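For instance, inside the <head> of links.htm it would look like this:

  <meta name="robots" content="noindex,nofollow">

Keep in mind that a robot has to be able to fetch the page to see the tag, so this only works if links.htm is not also blocked in robots.txt.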
 
