Google won't index my .html pages

htm vs. html

         

webwoman

3:51 pm on May 3, 2003 (gmt 0)

10+ Year Member



I took over a site that had some .htm pages indexed and was basically nowhere in the SERPs. I have added quite a few pages to the site - all .html, but left the indexed .htm pages up so as not to confuse the bots. Then I made .html versions of the .htm pages and all the links are going to the .html pages. All the new pages are .html. At the first Google update (last month), I lost all (the few there were) backlinks on the site. Not sure if there is any relation here...and now I see that a few of the backlinks are back - but only the .htm pages. Any pages that show up in the SERPs are .htm only.

I had the deepbot in last month (16th) and the freshbot shows up fairly regularly. Any advice? Did I screw up? Should I even care if the site is .htm or .html?

dazz

4:11 pm on May 3, 2003 (gmt 0)

10+ Year Member



If you have been 'deepbotted' you should be fine and your new pages will show up on the next update.

dmjw01

4:56 pm on May 3, 2003 (gmt 0)

10+ Year Member



Sounds like what you really want to do is set up a permanent redirect. This returns an HTTP 301 response, which tells Googlebot that the old page has been permanently replaced by the new one. Googlebot seems to handle this fine, with no loss of PR or SERP positioning.

If your server is running Apache you can do this using the ".htaccess" file. You need to add the following line for each file (all on one line)...

RedirectPermanent /widget.htm [yoursite.com...]

Note that the first argument to "RedirectPermanent" is the 'old' name of the file, and the second argument is the complete URL of the new file. (Obviously, I've just invented the URL above - you need to use your own site's address.)
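To make the shape of that line concrete, here is a sketch with both arguments filled in. The domain www.example.com and the file name widget.html are placeholders assumed purely for illustration, not the real site:

```apache
# Hypothetical example: www.example.com stands in for your own domain.
# First argument: the old URL path. Second argument: the full URL of the new page.
RedirectPermanent /widget.htm http://www.example.com/widget.html
```

You would repeat this line once per renamed file, changing both arguments each time.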

Once you've done this (and checked that it works!), you should REMOVE the old files completely. Leaving them there is actually confusing Googlebot - it's probably ignoring the new files because they're just duplicates.

Actually, it might be possible to use wildcards so that ALL .htm files are redirected to .html - but I'm not enough of an .htaccess expert to be sure of this.
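For what it's worth, Apache's mod_alias also provides a RedirectMatch directive that takes a regular expression instead of a fixed path, which looks like it covers the wildcard case. A rough sketch, assuming mod_alias is enabled, that every .htm page has an .html twin with the same name, and using www.example.com as a placeholder domain:

```apache
# Hypothetical example: www.example.com stands in for your own domain.
# Match any URL path ending in ".htm" (the $ anchor keeps ".html" from matching)
# and send a 301 to the same path with ".html" instead.
RedirectMatch permanent ^/(.*)\.htm$ http://www.example.com/$1.html
```

Try it on one or two pages before relying on it site-wide, since any .htm file without an .html counterpart would then redirect to a page that doesn't exist.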

Hope this helps...

webwoman

5:00 pm on May 3, 2003 (gmt 0)

10+ Year Member



It sure does, dmjw, thanks very much. And to dazz - yes, I have my fingers crossed that this will be the case.

Does it actually matter whether an entire site is htm or html? I wanted to convert the whole thing to html when I did this, but not sure that I had a good reason for it.

rfgdxm1

6:00 pm on May 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



.htm or .html should make no difference. However, if you serve the same content under both extensions, this can cause obvious duplicate-content problems.