|Robots.txt and htm - html|
Is a disallow for .htm killing pages that are .html?
I've got a system where we have all .htm pages marked with noindex / nofollow, and just to make sure, we also used a disallow:
The reason is that we use .htm to track PPC traffic etc, and all .html pages are organic.
In a pinch, this works. Been doing it for years. I don't recommend it if you have a better way, but for us, this works.
Is the disallow potentially going to kill the spiders for html pages?
Your disallow statement is not correct: wildcards are not part of the original robots.txt standard, though Googlebot supports them.
If you list all .htm files then you will automatically disallow the .html ones too, because robots.txt rules match by URL prefix: a rule for /page.htm also matches /page.html.
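To see the prefix effect for yourself, here is a minimal sketch using Python's standard urllib.robotparser, which follows the original prefix-matching behaviour. The file name /landing.htm, the host example.com and the agent name ExampleBot are made up for illustration; they are not from the original poster's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one .htm file listed by name, no wildcards.
RULES = """\
User-agent: *
Disallow: /landing.htm
"""


def is_allowed(rules: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # example.com is a placeholder host; only the path matters for matching.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ("/landing.htm", "/landing.html", "/about.html"):
        print(path, "allowed" if is_allowed(RULES, path) else "blocked")
```

With these rules /landing.htm and /landing.html both come back blocked, while /about.html is allowed, because the rule only has to match the start of the path.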
hoo boy -- what would you suggest?
perhaps this explains why things are looking so flooey...
For best results, put the disallow in your .htaccess file instead. You have more control, and no one can see what you are doing.
Both G and MSN accept
Using this should put an end to the .htm files on your site, but not the .html ones.
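The exact rule from the post above is not preserved here, so what follows is only a rough sketch of how Googlebot-style wildcard matching is documented to behave: * matches any run of characters and a trailing $ anchors the rule to the end of the URL. The pattern /*.htm$ and the function name google_style_match are assumptions for illustration, not the poster's rule or any crawler's actual code.

```python
import re


def google_style_match(pattern: str, path: str) -> bool:
    """Approximate wildcard matching: '*' is any run of characters,
    a trailing '$' means the path must end there. Without '$' the
    pattern behaves as a prefix match, like a classic robots.txt rule."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    # Hypothetical pattern, assumed for this example only.
    for path in ("/landing.htm", "/landing.html", "/ppc/offer.htm"):
        print(path, google_style_match("/*.htm$", path))
```

The trailing $ is what keeps .html out: without it, /*.htm would still match /landing.html as a prefix. Note that an anchored rule like this would not match a .htm URL that carries a query string.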
I'd just read their guidelines, because I'm going from asp to aspx (so I haven't tested it)
© Webmaster World 1996-2014 all rights reserved