Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt and htm - html
Is a disallow for .htm killing pages that are .html?
chewy




msg:1527301
 10:27 pm on Feb 27, 2006 (gmt 0)

I've got a system where we have all .htm pages marked with noindex / nofollow, and just to make sure, we also used a disallow:

User-agent: *
Disallow: /*.htm

The reason is that we use .htm to track PPC traffic etc, and all .html pages are organic.

In a pinch, this works. Been doing it for years. I don't recommend it if you have a better way, but for us, this works.

Is the disallow potentially going to keep the spiders away from the .html pages too?

-c

 

Lord Majestic




msg:1527302
 10:35 pm on Feb 27, 2006 (gmt 0)

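Your disallow statement is not correct: wildcards are not part of the robots.txt standard, though Googlebot supports them.

If you list each .htm file explicitly, you will automatically disallow the matching .html ones too, because a Disallow value is matched as a prefix of the URL path (a rule for /page.htm also matches /page.html).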
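
To illustrate the prefix matching described above, here is a minimal sketch using Python's standard urllib.robotparser, which treats Disallow values as plain path prefixes and does not interpret wildcards. The file names (/landing.htm, /about.html) are hypothetical, not taken from the original poster's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule that lists one .htm file explicitly (no wildcard).
rules = [
    "User-agent: *",
    "Disallow: /landing.htm",
]

parser = RobotFileParser()
parser.parse(rules)


def allowed(path):
    """Return True if a generic crawler may fetch the given path."""
    return parser.can_fetch("*", path)


# Compare the listed .htm file, its .html sibling, and an unrelated page.
for path in ("/landing.htm", "/landing.html", "/about.html"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Under this parser, /landing.html is blocked along with /landing.htm because its path starts with the disallowed string, while /about.html stays crawlable. Real crawlers may differ in the details.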

chewy




msg:1527303
 6:22 am on Feb 28, 2006 (gmt 0)

hoo boy -- what would you suggest?

robots.txt removed.

perhaps this explains why things are looking so flooey...

cabowabo




msg:1527304
 4:51 pm on Mar 7, 2006 (gmt 0)

For best results, put the disallow in your .htaccess file instead. You have more control and no one can see what you are doing.

Cheers,

CaboWabo

Madx




msg:1527305
 6:37 pm on Mar 8, 2006 (gmt 0)

Both Google and MSN accept:

Disallow: /*.htm$

Using this should block the .htm files on your site, but not the .html ones, since the $ anchors the match to the end of the URL.

I had just read their guidelines because I'm moving from .asp to .aspx, so I haven't tested it.
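
As a rough sketch of how the wildcard and $ extensions behave, the helper below translates a Disallow pattern into a regular expression. The function name rule_matches and the sample paths are made up for illustration, and this approximates the documented behavior rather than any search engine's actual code.

```python
import re


def rule_matches(pattern, path):
    """Approximate wildcard-style Disallow matching.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the URL path. Without '$', the pattern only
    has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        return re.fullmatch(regex, path) is not None
    return re.match(regex, path) is not None


# Hypothetical paths: one .htm tracking page and its .html counterpart.
for pattern in ("/*.htm", "/*.htm$"):
    for path in ("/promo.htm", "/promo.html"):
        state = "blocked" if rule_matches(pattern, path) else "allowed"
        print(pattern, path, state)
```

In this sketch the unanchored /*.htm blocks both paths, while /*.htm$ blocks only /promo.htm. Note that with the anchor, a URL such as /promo.htm?src=ppc would no longer match, which matters if the .htm pages carry tracking parameters.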

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved