htaccess slightly misunderstood

cleaning up errors

         

chewy

8:29 pm on Jun 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK so somewhere back in time, in my htaccess file I accidentally redirected a call to index.html to the non-www root/index.htm page.

All other pages are normally redirected to www.domain.com correctly.

This presents duplicate page problems, right?

The SERPS show the home page as non-www; with all other pages correctly in www format.

So I've corrected the htaccess file to now read

Redirect permanent /index.html [domain.com...]

I suspect I need to do more, right?

Back in a prior thread (http://www.webmasterworld.com/apache/3009363.htm)

this looks to be the thing that will both resolve the www - non-www redirection, and redirect calls to index.html back to the root.

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.example\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://www.example.com%1/%2 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(([^/]+/)*)index\.html\ HTTP/
RewriteRule index\.html$ http://www.example.com/%1 [R=301,L]\

Am I headed somewhat in the right direction here?

Patching this into my .htaccess file causes some things to work, others not.

jdMorgan

9:10 pm on Jun 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Put your rules in most-specific to least-specific order, so as not to cause multiple redirects (for example, in the case where http://example.com/index.html is requested).

RewriteEngine on
#
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(([^/]+/)*)index\.html\ HTTP/
RewriteRule index\.html$ http://www.example.com/%1 [R=301,L]
#
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://www.example.com%1/%2 [R=301,L]
#
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

I don't see any obvious problems and you didn't provide any details, so I'm not sure why it might "cause some things to work, others not" except for the rule order problem as noted.
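
To see why the order matters, here is a rough Python sketch (not Apache code) that imitates the three rules and counts how many 301 hops a request takes. The function names are made up for illustration. The index rule is also simplified: the real condition tests THE_REQUEST (the original request line) rather than the current path.

```python
import re

CANONICAL = "http://www.example.com"

def index_rule(host, path):
    # Most specific: strip a trailing index.html from the requested path.
    m = re.match(r"^/(([^/]+/)*)index\.html$", path)
    return (CANONICAL + "/" + m.group(1)) if m else None

def double_slash_rule(host, path):
    # Collapse one "//" in the path (greedy, like the RewriteCond pattern).
    m = re.match(r"^(.*)//(.*)$", path)
    return (CANONICAL + m.group(1) + "/" + m.group(2)) if m else None

def host_rule(host, path):
    # Least specific: anything not on the canonical hostname.
    if not re.match(r"^www\.example\.com$", host):
        return CANONICAL + path
    return None

def count_redirects(url, rules, limit=10):
    """Follow simulated 301s and return (hops, final_url)."""
    hops = 0
    while hops < limit:
        m = re.match(r"^http://([^/]+)(/.*)$", url)
        host, path = m.group(1), m.group(2)
        # The first rule that returns a target wins, like the [L] flag.
        target = next((t for t in (r(host, path) for r in rules) if t), None)
        if target is None:
            break
        url, hops = target, hops + 1
    return hops, url

specific_first = [index_rule, double_slash_rule, host_rule]
general_first = [host_rule, double_slash_rule, index_rule]
print(count_redirects("http://example.com/index.html", specific_first))
print(count_redirects("http://example.com/index.html", general_first))
```

In this simulation, with the specific rule first a request for http://example.com/index.html reaches http://www.example.com/ in one hop. With the hostname rule first it takes two.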

Jim

[edited by: jdMorgan at 2:26 am (utc) on June 22, 2008]

chewy

1:56 am on Jun 22, 2008 (gmt 0)

I changed each obvious "example" to my own domain, put the code into .htaccess, and all I get is a 500 error on every attempt.

Remove it and all is well.

Keep in mind there are about a dozen permanent redirects that remain in the htaccess file below the "rewrite" stuff.

I'll try a few things more - but what are some of the general ways to go about troubleshooting this?

jdMorgan

2:25 am on Jun 22, 2008 (gmt 0)

You are being far too brief with your descriptions of problems. If you are using mod_alias directives for these "permanent redirects below the mod_rewrite code," then be aware that the order of the code makes no difference: mod_rewrite processes the file and executes all of the directives it understands, and then mod_alias processes the file, again executing all of the directives that it understands. Or maybe mod_alias will go first -- it depends on your server setup. But the order in which you type directives handled by different modules makes no difference, because the modules execute one at a time, in the order set by the server config.

Troubleshooting:
1) Comment-out lines or sections by prefixing a "#" to the lines to isolate problems.
2) If you get a 500-Server Error, then look at your server error log file -- It will often tell you exactly what is wrong.

If you still need help after that, then please post the entire set of rewrites and redirects -- mod_alias and mod_rewrite both.

Jim

[edited by: jdMorgan at 2:26 am (utc) on June 22, 2008]

chewy

2:57 am on Jun 22, 2008 (gmt 0)

I am being brief in part because this is a public forum, and also because I am pushing past my understanding (correction -- over my freaking head!) and don't know how to tell you what you need to know other than sending you the file via stickymail or something.

Here's the line that is repeated many times in the server error log:

/var/www/html/.htaccess: RewriteRule: bad flag delimiters

jdMorgan

3:13 pm on Jun 22, 2008 (gmt 0)

"Bad flag delimiters" usually indicates either an un-escaped space in the line, or something to do with the flags. The "flag delimiters" are the "[]" characters around the RewriteCond and RewriteRule flags, and the space character that precedes them. Note the extraneous trailing backslash on the last rule you posted above -- that could trigger this error.

In case it hasn't already become clear, the directive-line parsers in mod_rewrite (and other Apache modules) are written to be simple and therefore very fast. As a result, they are absolutely unforgiving of syntax errors, and can be cryptic in their complaints. :)
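
Since the parser's complaints are terse, a quick pre-upload check can catch the two slips mentioned above. The following is only a rough Python sketch with a made-up helper name. It looks for a trailing backslash (which Apache treats as a line continuation) and for a RewriteRule whose third argument is not wrapped in "[]". It splits on whitespace, so it is not a real parser and will miss plenty.

```python
def lint_rewrite_lines(text):
    """Return (line_number, message) pairs for two common slips."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.rstrip()
        # Skip blank lines and comments.
        if not line or line.lstrip().startswith("#"):
            continue
        if line.endswith("\\"):
            problems.append((number, "trailing backslash"))
            line = line[:-1].rstrip()
        parts = line.split()
        # RewriteRule takes a pattern, a substitution, and optional [flags].
        if parts and parts[0] == "RewriteRule" and len(parts) >= 4:
            flags = parts[3]
            if not (flags.startswith("[") and flags.endswith("]")):
                problems.append((number, "flags not wrapped in []"))
    return problems
```

Running it over the text of an .htaccess file returns an empty list when neither slip is found.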

Jim

chewy

5:41 pm on Jun 22, 2008 (gmt 0)

OK, excellent, there is progress!

Removing the trailing backslash from the last line in the first post did in fact resolve the "bad flag delimiters" error.

Typing in the non-www URL does indeed return www.example.com, as it now should (and never did before), so that is good progress.

What I still think I misunderstand (or may have miscommunicated) is the whole idea of trying to get the index.htm page to NOT show up in the url (thus avoiding duplicate page problems).

Like when I type in amazon.com, it loads amazon.com/, not visually redirecting to an index.htm page.

Virtually all of my inbound links are to the www.example.com/ not the index.htm page, so when I type in www.example.com, shouldn't this return www.example.com/ and not www.example.com/index.htm?

Perhaps there is something else going on?

Without these 2 lines, I get a 404 when I try to view the root or the index.html page (but of course, index.htm works fine):

Redirect permanent /index.html http://www.example.com/index.htm
Redirect permanent /index.php http://www.example.com/index.htm

thanks,

-C

jdMorgan

5:52 pm on Jun 22, 2008 (gmt 0)

Those two lines are "exposing" your "index.htm" in search engines, etc. An external redirect (as opposed to an internal rewrite) tells the client to change the URL. As a result, the index.htm URL will show in the address bar, and the search engines will discard index.html and index.php and show index.htm in the search results.

Remove them and replace with


DirectoryIndex index.htm

The listed index filepath should be the real file that exists as your index page, whether it is .htm, .html, .php, etc.

See Apache mod_dir for more info.

Then change the first rule I recently posted above to:


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(html?|php)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(html?|php)$ http://www.example.com/$1 [R=301,L]
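
For anyone unsure which requests the condition catches, the pattern can be tried out away from the server. This is a small Python sketch, assuming Python's regex engine treats this particular pattern the same way Apache's does. The function name is made up, and the alternatives are separated by a solid pipe.

```python
import re

# Same pattern as the RewriteCond, with a solid pipe between the alternatives.
REQUEST_LINE = re.compile(r"^[A-Z]{3,9}\ /([^/]+/)*index\.(html?|php)\ HTTP/")

def requests_index_file(request_line):
    """True when the client asked for index.htm, index.html or index.php directly."""
    return REQUEST_LINE.search(request_line) is not None

for line in (
    "GET /index.html HTTP/1.1",
    "GET /index.htm HTTP/1.1",
    "GET /blog/index.php HTTP/1.1",
    "GET / HTTP/1.1",
    "GET /myindex.html HTTP/1.1",
):
    print(line, "->", requests_index_file(line))
```

The first three request lines match and would be redirected. A plain request for "/" does not match, which is what keeps the DirectoryIndex lookup from triggering a redirect loop.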

Jim

chewy

6:18 pm on Jun 22, 2008 (gmt 0)

amazing and this is great.

This works - and I hope others can learn from my mistakes.

thanks a million!

-C

chewy

12:04 am on Jun 26, 2008 (gmt 0)

Jim,

Just now I learned about Rex Swain's HTTP Viewer and double-checked to see if the www and non-www were rendering correctly.

www renders a 200

non-www renders a 301

-

when I remove the .htaccess code, both render a 200 code.

do I understand now that I don't need the non-www to www redirect?

if that is correct, is there any damage done by leaving it as it is?

-C

jdMorgan

3:58 pm on Jun 28, 2008 (gmt 0)

A 200-OK response means that the server *did not* redirect the non-canonical domain to the canonical domain. This will result in duplicate content, since both hostnames return a 200-OK.

A 200 for www and a 301 for non-www is the correct server response scenario. See Hypertext Transfer Protocol -- HTTP/1.1 [w3.org]

You may also want to consider using the "Live HTTP Headers" add-on for Firefox/Mozilla browsers, since it's more convenient than using a third-party site.
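
The same check can also be scripted. Below is a minimal Python sketch using the standard http.client module, which does not follow redirects, so the raw status code and Location header are visible. The helper names and the www.example.com default are placeholders, not anything from a real setup.

```python
import http.client

def fetch_status(host, path="/", timeout=10):
    """Return (status, Location header) for one HEAD request; redirects are not followed."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def is_canonical_setup(www_status, bare_status, bare_location,
                       canonical="http://www.example.com/"):
    """True for the scenario described above: 200 on www, 301 from non-www to www."""
    return www_status == 200 and bare_status == 301 and bare_location == canonical

# Example use (makes real network requests, so it is left commented out):
# www_status, _ = fetch_status("www.example.com")
# bare_status, bare_location = fetch_status("example.com")
# print(is_canonical_setup(www_status, bare_status, bare_location))
```

A 200 on both hostnames would make is_canonical_setup return False, which corresponds to the duplicate-content case.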

Jim

chewy

4:10 pm on Jun 28, 2008 (gmt 0)

this answers my question and I stand corrected. Good.

of course - another question arises...

I find many sites where this appears to be needed.

In a week's time, there have been no noticeable changes on this current site, although ranking seems slightly better (that's hard to pin down). I anticipate that if there are any changes at all, they may take a bit longer.

With a site that has about a dozen pages and about 300 backlinks, what do you guess (or experience) might be normal in terms of traffic increases?

jdMorgan

7:53 pm on Jun 28, 2008 (gmt 0)

There may be none, or only a slight increase. The point is that you've now focused all backlinks on a single domain, and hardened your site against some tricks that competitors might try to use to dilute the ranking of your pages. This is no "magic" revenue-enhancer!

Also, while search results are returned instantly, everything else takes days, weeks, or months. Wait 30 days before checking again, and check again 60 days after that... Only very-high-PageRank sites see change within days.

Jim

g1smd

7:51 pm on Jun 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This redirect is quite important. It was a good step to take.

I would next run your site through Xenu LinkSleuth and make sure that all of your internal linking is 100% functional.

Oh, and another vote for the LiveHTTPHeaders extension for Mozilla Seamonkey and for Mozilla Firefox.

chewy

8:25 pm on Jun 29, 2008 (gmt 0)

I am a big fan of Xenu (dot exe), our little alien friend!

Thanks for the reminder - it caused me to see where I had missed a couple of pages.

(If only other tools were so simple and complete)

Feel free to suggest anything else I shouldn't leave out!

I've used Xenu and recommended it for years - I forget where I first heard of it but I wouldn't be surprised if it was here!

I expect to be using this htaccess file again and again!

I just hope I can see a difference in my stats as I don't want to suggest / recommend it without being able to know personally about what it can do.

chewy

6:31 pm on Jul 26, 2008 (gmt 0)

Using the same code discussed above, on a different Apache server, I'm seeing some odd results. Initially it seems to redirect non www to www correctly, and checks out in Rex's header checker tool.

However, after a few page views, Firefox and IE both indicate the page cannot load. It is as if it works fine and then stops working over the space of about 1 minute.

FF generates the following error message:

The page isn't redirecting properly.

Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

* This problem can sometimes be caused by disabling or refusing to accept cookies.

===

This is on a v-host which means I don't have access to logfiles.

Pls advise what are best next steps for troubleshooting.

-C

PS .htaccess in discussion is:

RewriteEngine on
#
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(html?|php)\ HTTP/
RewriteRule index\.html$ http://www.example.com/%1 [R=301,L]
#
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://www.example.com%1/%2 [R=301,L]
#
RewriteCond %{HTTP_HOST} !^www\.example.com\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
DirectoryIndex index.html

g1smd

8:08 pm on Jul 26, 2008 (gmt 0)

One thing, your first line tests for index.html and .htm and .php but the second line only tests for index.html. I guess that is an error.

jdMorgan

9:20 pm on Jul 26, 2008 (gmt 0)

If correcting the error spotted by g1smd does not cure the problem, then please provide a specific URL which triggers the redirection loop -- changing only the domain to "example.com" (note also that you've got two consecutive ".com" strings in your third rule's RewriteCond -- another error, but probably just a posting error).

Also, if this redirection loop is visible to and reported by Firefox, install the "Live HTTP Headers" add-on -- It gives a much more detailed HTTP Headers view, and the first two redirect responses it logs when your rules are looping will probably clearly illustrate what is going on.
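
If the doubled ".com" is in the live file and not just in the post, it would be enough on its own to cause an endless redirect. The Python sketch below (made-up function name, same patterns as in the thread) imitates the negated RewriteCond: with the typo, even the canonical hostname fails to match, so the rule redirects www.example.com to itself.

```python
import re

def needs_redirect(host, pattern):
    # A RewriteCond pattern with a leading "!" is true when the pattern does NOT match.
    return re.search(pattern, host) is None

typo = r"^www\.example.com\.com"   # pattern as posted, with ".com" twice
fixed = r"^www\.example\.com$"     # pattern from the earlier working rule

print(needs_redirect("www.example.com", typo))   # canonical host still redirected
print(needs_redirect("www.example.com", fixed))  # canonical host left alone
print(needs_redirect("example.com", fixed))      # non-www host redirected once
```

With the typo, the first call returns True, meaning every request to the canonical host is redirected again. With the corrected pattern, only the non-www host is redirected.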

Jim

chewy

10:07 pm on Jul 26, 2008 (gmt 0)

Err...

more like a classic newbie error than a posting error if you ask me.

a nice bike ride and some sage advice from the nice guys at WebmasterWorld made the difference.

all fixed.

enjoy your weekend!

-C

g1smd

10:29 pm on Jul 26, 2008 (gmt 0)

It's almost always a simple typo or misunderstanding.

Been there, done that. Many times.

jdMorgan

2:34 pm on Jul 27, 2008 (gmt 0)

When I got my first 500-Server error as the result of my first try at mod_rewrite, I thought, "Oh great -- Only 499 more errors to go 'til I figure this out." :)

Jim