301 redirecting index.html to /

         

keyplyr

6:36 pm on Sep 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Any ill effects if I use this?

Redirect 301 /index.html http*//www.my_site.com/

(* used to de-link)

nancyb

6:56 pm on Sep 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have only one site to check this against but, gbot wasn't crawling my subdirectory index pages very often. After changing some of the links to mydomain/sub/index.htm gbot is visiting them several times a month now.

I also changed most of the mydomain/ links to mydomain/index.htm and that is visited much more since the change.

keyplyr

7:40 pm on Sep 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




Thanks Nancyb. My interest in redirecting index.html is for different reasons, and I only use subdirectory/index.html as a stop-page to prevent directory browsing.

I want to know if this mod redirect will come back to bite me on the @ss in some other mysterious way.

jdMorgan

7:47 pm on Sep 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



keyplyr,

> Any ill effects if I use this?
> Redirect 301 /index.html http*//www.my_site.com/

At worst, you could put your server into a loop. If index.html is requested, you'll redirect from index.html to "/". A 301 response will be sent back to the client saying "request that from '/'." So, the client will then request "/", and then DirectoryIndex (Apache mod_dir [httpd.apache.org]) kicks in and internally redirects that to index.html. If there are any subrequests associated with fetching index.html, then the Redirect 301 (mod_alias) [httpd.apache.org] will kick in again, and the whole process starts over. Then you've got a loop.

You can get around this by renaming index.html to anything else, like home.html, default.html, or index.htm. It just has to be different from the existing filename. If you stray from the standard "home page" names defined in your server configuration, you may have to add a DirectoryIndex directive:


DirectoryIndex index.html index.htm whatever.html

This defines the possible index page names, and the order to check for them. Make sure the filename you are trying to redirect from is not in that list, and that the file you want served in response to requests for "/" *is* in the list.

Jim
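
A minimal sketch of the rename approach described above, assuming the home page file has been renamed to home.html and the site answers at www.example.com (both are placeholders, not taken from the thread):

```apache
# Hypothetical setup: the real home page file is now home.html,
# so index.html is no longer one of the index page names.
DirectoryIndex home.html

# Requests for the old name get a permanent redirect to the root URL.
# www.example.com is a placeholder for your own hostname.
Redirect 301 /index.html http://www.example.com/
```

With this arrangement a request for "/" is served from home.html, so mod_dir never asks for index.html and the redirect should not fire a second time.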

keyplyr

8:50 pm on Sep 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Jim, thanks.

My interest in this is that Googlebot is always requesting:

http*//my-domain.com/index.htmlrobots.txt

This obviously causes errors. Nowhere on my website is there a link to (or even a mention of) index.html, so I assume Googlebot is following remote links to index.html. Similarly, I have also seen errors that look as if users are typing page names after index.html:

http*//my-domain.com/index.htmlpage_name

I'm thinking that if all requests for http*//my-domain.com/index.html would redirect to http*//my-domain.com/ then this would solve the problem.

But, as you say, I risk the danger of the dreaded 'loop' (I shudder to even think of it!)
I wonder if there is a logical solution to this ongoing issue?

jdMorgan

2:34 am on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



keyplyr,

You might want to go through your pages, and make sure you don't have a misconfigured base href tag in the <head> section somewhere. This sounds like a really odd problem.

However, I think everything you need to avoid the loop is in my previous post. Besides, your case is even simpler: you would not be redirecting "index.html" to "/"; you would be redirecting ^index\.html(.+)$ to [yourdomain.com...], which is not going to be loop-prone.

Jim

<added>Use RedirectMatch 301 so you can do the backreference.</added>

keyplyr

3:37 am on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks Jim, but shouldn't I add a preceding slash to the code so subdirectory index pages do not redirect also? Example:

RedirectMatch 301 ^/index\.html(.+)$ [domain.com...]

<added> I don't use base href </added>

<added><added> Tried both ways - neither one does anything at all. Must be the server config, which is most likely the culprit anyway! </added></added>

plumsauce

6:23 am on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




> I have only one site to check this against but, gbot wasn't crawling my subdirectory index pages very often. After changing some of the links to mydomain/sub/index.htm gbot is visiting them several times a month now.
> I also changed most of the mydomain/ links to mydomain/index.htm and that is visited much more since the change.

anyone else seeing similar behaviour?

/ seems so much cleaner, hate to switch back to .htm

++++

nancyb

7:01 am on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



plumsauce,

I mentioned the / vs /index.htm because I misunderstood keyplyr's reason.

I would also be interested if anyone else has noticed this, but it would probably get more response in the google news forum.

maybe a mod could relocate your question there?

keyplyr

7:47 am on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



plumsauce, nancyb

Actually, there are a couple threads from times past about this subject. You might try the site search utility at the top of the page.

As I remember, most webmasters felt that Google did its own redirecting with default pages, and that it really didn't matter whether the mark-up used relative (page.html) or full (mydomain.com/page.html) URLs, since once the robot is in your domain, it pulls all the files in the same manner.

However, I think a few webmasters preferred full URLs for various scenarios.

I know that I've used both methods in the past and I personally do not think it matters either way, so I opt for the shorter, more succinct method.

nancyb

1:42 pm on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



thanks keyplyr,

I have read those posts and I also opted for the shorter method last year. But when I saw gbot visiting the sub index pages a lot less frequently after changing to the short version (although there didn't seem to be a change in the frequency of the other subdirectory pages), I changed some of them back again to the full URL.

It's just been a little over a month, so can't really judge if it is coincidence or a change of gbot behavior due to my changes - and - of course, I know I won't really know anyway :( but it is still nice to see gbot visiting those pages more often - I think.

jdMorgan

2:01 pm on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



keyplyr,

> but shouldn't I add a preceding slash to the code so subdirectory index pages do not redirect also?

Yes, you should. What I posted is mod_rewrite syntax for use in .htaccess - force of habit. :)

Jim
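
For reference, the two forms discussed above might look like this side by side. The target URL is elided in the posts, so the http://www.example.com/$1 target below is an assumption about the intent (send /index.htmlpage_name to /page_name), and the hostname is a placeholder:

```apache
# mod_alias, in httpd.conf or .htaccess: the pattern is matched against
# the full URL-path, so it needs the leading slash.
# Target URL is an assumed placeholder, not taken from the thread.
RedirectMatch 301 ^/index\.html(.+)$ http://www.example.com/$1

# mod_rewrite equivalent in the root .htaccess: the leading slash is
# stripped before matching, so the pattern starts at "index".
# Use one form or the other, not both.
RewriteEngine On
RewriteRule ^index\.html(.+)$ http://www.example.com/$1 [R=301,L]
```

Because (.+) requires at least one character after index.html, neither rule matches a plain request for /index.html, which is why this case is not loop-prone.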

plumsauce

10:48 pm on Sep 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




nancyb,

i asked about your specific observations because
it parallels a situation that i am looking at.

as for moving the post to google news, there are
already a couple of threads running there on this
topic which i am tracking or have posted to.

++++