Forum Moderators: phranque
[edited by: phranque at 11:13 am (utc) on Jun 18, 2013]
[edit reason] Please Use Example.com [webmasterworld.com] [/edit]
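# Redirect index in any directory to root of that directory
# THE_REQUEST is the raw request line, so this only matches when the client itself asked for .../index or .../index.ext
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
# The trailing "?" drops any query string from the redirect target
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]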
...and it seems to be working on the whole. Have I done this correctly? I'd like a little reassurance here! :)
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L]
[edited by: Dideved at 3:48 pm (utc) on Jun 18, 2013]
Do I have to include 'RewriteEngine on'?
Do I need to add anything else afterwards, to switch it off?!
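For reference, a minimal sketch of how this usually looks in an .htaccess file, assuming mod_rewrite is available and the host allows overrides. RewriteEngine On is needed once per file (conventionally at the top), and nothing has to be added afterwards to switch it off. The rule shown is the one quoted above.

```apache
# Enable mod_rewrite once for this .htaccess file; no matching "off" is needed.
RewriteEngine On

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]
```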
However, it doesn't work for the subdomains with index.html (including the redirected subdomains). It fails in exactly the same way as before, i.e. it adds their folders to the end of the URL, which then returns an error.
ErrorDocument 400 http://www.example.com/errors/error.html
ErrorDocument 401 http://www.example.com/errors/error.html
ErrorDocument 403 http://www.example.com/errors/error.html
ErrorDocument 404 http://www.example.com/errors/error.html
ErrorDocument 410 http://www.example.com/errors/error.html
ErrorDocument 500 http://www.example.com/errors/error.html
ErrorDocument 400 /errors/error.html
...
etc.
Note that when you specify an ErrorDocument that points to a remote URL (ie. anything with a method such as http in front of it), Apache HTTP Server will send a redirect to the client to tell it where to find the document, even if the document ends up being on the same server. This has several implications, the most important being that the client will not receive the original error status code, but instead will receive a redirect status code. This in turn can confuse web robots and other clients which try to determine if a URL is valid using the status code.
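Applied to the six status codes listed earlier, the local-path form might look like the sketch below. It assumes /errors/error.html exists under the document root of the same site. With a local path, Apache serves the error page internally and the client still receives the original status code. The Apache documentation also says that an ErrorDocument 401 directive must refer to a local document.

```apache
# Local paths (starting with "/") are served internally, so the
# original 4xx/5xx status code reaches the client unchanged.
ErrorDocument 400 /errors/error.html
ErrorDocument 401 /errors/error.html
ErrorDocument 403 /errors/error.html
ErrorDocument 404 /errors/error.html
ErrorDocument 410 /errors/error.html
ErrorDocument 500 /errors/error.html
```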
You'll also get a range of opinions on whether the replacement URL should include the full host name. The reason why you might want to include the host name is for performance. If someone visits an index.html page through a non-canonical host name, then they will likely go through two redirect hops, one to fix the host name and one to strip off index.html. But if you include the full host name in the index.html replacement, then you can accomplish both at once.
However, the reason why you might *not* want to include the host name in the replacement is for ease of maintenance. The canonical host name need only be defined once, plus this rule above can work correctly across multiple host names. And since it sounds like you have multiple host names in play, this may be a significant factor for you.
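The two options can be put side by side as a sketch. The RewriteCond is an assumption borrowed from the rule posted at the top of the thread, adapted to index.html, so that an internal DirectoryIndex lookup is not redirected back again. Only one variant should be active at a time.

```apache
RewriteEngine On

# Variant A: host-relative target. Works unchanged on any host name,
# but a non-canonical host may then need a second redirect to fix the host.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.html?[^\ ]*\ HTTP/
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L]

# Variant B: full canonical host in the target. Fixes host and index.html
# in one hop, but hard-codes the host name into this rule.
# RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.html?[^\ ]*\ HTTP/
# RewriteRule ^(.*/)?index\.html?$ http://www.example.com/$1 [R=301,L]
```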
This now happens only when index.html is added to the end of the subdomain URL before it redirects.
basically you will lose about 15% of PR through a single redirect and closer to 28% if you use 2 redirects to get there.
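The two figures are consistent with each other if the loss is assumed to compound per hop. That is an assumption about how the numbers were derived, not something stated in the post. With a 15% loss per redirect, two redirects give:

```latex
\[
1 - (1 - 0.15)^2 = 1 - 0.85^2 = 1 - 0.7225 = 0.2775 \approx 28\%
\]
```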
This is an example of the .htaccess file for the old subdomains, e.g. [bath.example.com...] :
-------------------
# Permanent URL redirect
Redirect 301 / http://www.example.com/europe/england/somerset/bath/
when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.
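One way to avoid depending on that ordering is to express the subdomain redirect with mod_rewrite as well, so that every redirect in the file is handled by one module in the order written. The sketch below is a guess at how the old bath subdomain's .htaccess could look, not a tested fix for this setup. The host name bath.example.com and the index-stripping rule are assumptions for illustration.

```apache
RewriteEngine On

# Only act on requests for the old subdomain host (assumed host name).
RewriteCond %{HTTP_HOST} ^bath\.example\.com [NC]
# Drop a trailing index.html while redirecting, so only one hop is needed.
RewriteRule ^(.*/)?index\.html?$ http://www.example.com/europe/england/somerset/bath/$1 [R=301,L]

# Everything else on the old subdomain goes to the same path under the new location.
RewriteCond %{HTTP_HOST} ^bath\.example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/europe/england/somerset/bath/$1 [R=301,L]
```

The host condition is there so the rules do nothing if the same folder is ever served under www.example.com.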
This would be a valid concern, but I think we may have oversimplified what he said in the video. Matt starts off saying that 10-15% of PR is lost just by following a normal link. He then mentions how 301s used to lose less PR compared to normal links, then later they used to lose more PR. Today, he says, the amount of PR that dissipates through a 301 is identical to the amount of PR that dissipates through a link. He then explicitly says that it neither hurts nor helps to use a 301, and that we should use whatever is best for our purposes.
However, it doesn't work for the subdomains with index.html (including the redirected subdomains). It fails in exactly the same way as before, i.e. it adds their folders to the end of the URL, which then returns an error.
http://www.example.com/europe/england/somerset/bath/index.html becomes
http://www.example.com/europe/england/somerset/bath/pages/bath/
(/pages/bath/ is the location of the old redirected subdomain in this example)
http://www.blog.example.com/index.html becomes:
http://www.example.com/blog/blogpages/
What should I do to prevent this?
when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.
So what do you suggest that I do in this instance, so that I can always drop the index.html , regardless of folder structure and subdomains?
what is the response status chain for those requests?
it looks like you are either getting multiple redirects or an internal rewrite is being exposed by a subsequent external redirect.
RewriteRule ^(.*)$ http://www\.example\.com/$1 [R=301,L]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
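The two rules match the same requests. The substitution is a plain string rather than a regular expression, so the dots in the host name need no escaping there. A sketch of the second form used as a canonical-host redirect follows; the RewriteCond is an assumption, added so that requests already on www.example.com are not redirected again.

```apache
RewriteEngine On

# Redirect any non-canonical host to www.example.com, keeping the path.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

If some subdomains are meant to stay reachable under their own names, the condition would need to exclude them as well.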