Forum Moderators: Ocean10000 & phranque

404 error doc messed up

it's repeating domain name several times

     
9:05 pm on Sep 4, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 4, 2002
posts: 1908
votes: 3


I just noticed that the custom error document repeats the domain name whenever there is a typo in the page URL, and it also adds an extra "m" to the end of "htm".

e.g., https://www.example.com/www.example.com/www.example.com/www.example.com/www.example.com/missing.htmm

Every time I test it, the chain gets longer; it is currently at 16 repeats.

Here is the 404 line:

ErrorDocument 404 https:www.example.com/missing.htm 
AddHandler server-parsed .htm


The last major change to the site was setting up https. However, I verified the changes with whynopadlock, and the 404 missing page was working fine before that.

I assume this is some kind of loop so I have included the following:

Here is the index to root redirect:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ https://www.example.com/$1 [R=301,L]


Here is the code to force https and also canonical and www redirect:

RewriteCond %{HTTP_HOST} example\.com [NC]
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.com/$1 [R,L]


Can anyone see anything wrong?

[edited by: phranque at 9:43 pm (utc) on Sep 4, 2018]
[edit reason] fix quote codes [/edit]

9:40 pm on Sept 4, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


try this:
ErrorDocument 404 /missing.htm


edited after being reminded by whitespace's following post

[edited by: phranque at 9:45 pm (utc) on Sep 4, 2018]

9:43 pm on Sept 4, 2018 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts:328
votes: 24


ErrorDocument 404 https:www.example.com/missing.htm


If that is intended to be an absolute URL then you are missing a double slash after the scheme. However, you shouldn't be using an absolute URL here anyway, since that will trigger a 302 to the error document, not the desired 404. You should use a root-relative path. For example:

ErrorDocument 404 /missing.htm
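
A rough way to see where the repeated domain can come from is sketched below in Python. It assumes the server sends the malformed value unchanged in a Location header and that the client resolves it leniently: same scheme and no "//", so everything after the colon is treated as a relative path. Python's urljoin applies that lenient rule, so it stands in for the browser here. The helper name follow_redirects and the starting URL typo.htm are made up for illustration, and the sketch does not account for the extra "m".

```python
from urllib.parse import urljoin

# Value from the original directive, with the "//" missing after the scheme.
BROKEN_TARGET = "https:www.example.com/missing.htm"


def follow_redirects(start_url, location, hops):
    """Resolve the same Location value repeatedly against the current URL."""
    url = start_url
    trail = []
    for _ in range(hops):
        # With a matching scheme and no "//", urljoin treats the rest as a
        # relative path, so "www.example.com/" becomes one more directory.
        url = urljoin(url, location)
        trail.append(url)
    return trail


# Hypothetical mistyped page that triggers the first 404.
trail = follow_redirects("https://www.example.com/typo.htm", BROKEN_TARGET, 3)
for url in trail:
    print(url)

# A root-relative target always resolves to the same place, so nothing grows.
print(urljoin(trail[-1], "/missing.htm"))
```

Each hop lands on another path that does not exist, so the next 404 adds one more copy of the hostname, which would match the chain getting longer on every test.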


On an unrelated note, your "force https and also canonical and www redirect" rule block is also incorrect. The first condition will always be true (since you are checking for "example.com" anywhere in the hostname), and the rule only redirects requests on port 80, so it won't canonicalise a request for "https://example.com" (HTTPS and no www).

This should instead be something like:


RewriteCond %{HTTP_HOST} ^example\.com [NC,OR]
RewriteCond %{SERVER_PORT} 80
RewriteRule (.*) https://www.example.com/$1 [R,L]


This is also a 302 (temporary) redirect - change it to a 301 when you are done testing.
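
The difference between the two rule blocks can be checked with a small Python model. This is only a sketch: Python's re module stands in for Apache's regex engine (the patterns here are simple enough to behave the same), the function names are invented, and the port is passed in as a plain number.

```python
import re


def port_is_80(port):
    # RewriteCond %{SERVER_PORT} 80 is an unanchored regex test on the port.
    return re.search(r"80", str(port)) is not None


def original_rule(host, port):
    # Unanchored host pattern, conditions joined by the implicit AND.
    host_matches = re.search(r"example\.com", host, re.I) is not None
    return host_matches and port_is_80(port)


def suggested_rule(host, port):
    # Host must start with "example.com"; conditions joined by [OR].
    host_matches = re.search(r"^example\.com", host, re.I) is not None
    return host_matches or port_is_80(port)


requests = [
    ("example.com", 80),
    ("www.example.com", 80),
    ("example.com", 443),
    ("www.example.com", 443),
]
for host, port in requests:
    print(host, port, original_rule(host, port), suggested_rule(host, port))
```

In this model the original block leaves a request for example.com on port 443 alone, while the [OR] version redirects everything except www.example.com on port 443.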
9:48 pm on Sept 4, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


RewriteCond %{HTTP_HOST} example\.com [NC]
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.com/$1 [R,L]

i would suggest this instead:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC,OR]
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
11:42 pm on Sept 4, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4391
votes: 310


Apache defaults to a 302 response (temporarily moved), so without the R=301 part of the flag the redirect is not treated as a 301 (permanent) move.
I just thought that an explanation could be helpful.
7:06 pm on Sept 5, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 4, 2002
posts: 1908
votes: 3


I removed the full URL from the missing-file ErrorDocument line, and that is working OK now.

I tried the code that Phranque suggested for the canonical but it produced an error and wouldn't load the site.

I put the old code back in and the site is loading correctly.

I then added the [R=301,L] on the end and it's still ok.

Is there anything else I should change? It currently reads:

<quote>
RewriteCond %{HTTP_HOST} example\.com [NC]
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
</quote>
9:51 pm on Sept 5, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


"I tried the code that Phranque suggested for the canonical but it produced an error and wouldn't load the site."

what was the corresponding message in the server error log file?
11:51 pm on Sept 5, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 4, 2002
posts: 1908
votes: 3


@phranque sorry, I don't remember. I had put the old code back in and didn't want to redo what you sent. It was something about "can't load this page".
12:42 am on Sept 6, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


your error log file disappeared?
1:46 am on Sept 6, 2018 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24


phranque: i would suggest this instead:


RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC,OR]
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]



This should be OK, providing you don't use any other subdomains (or domains) that resolve to the same place. Since it redirects everything to "www.example.com", it doesn't simply canonicalise "example.com". This is not a problem with your original rule (or my suggestion above ;).

However, the first RewriteCond directive is a bit confusing/ambiguous-looking and should be simplified IMO. The first condition uses a negated pattern that is also entirely optional. (THEORETICAL BIT START...) If it's optional then it (potentially) matches an empty host, but the pattern is negated, so it's successful when the host is non-empty. For any legitimate request the host is always non-empty, so it's always successful, which would result in a redirect loop. (...THEORETICAL BIT END) However, it doesn't actually work like that. The regex always tries to make a positive match first, so it is tested against the actual hostname rather than "any non-empty hostname", and there is no redirect loop as it happens. Neither, however, will the condition succeed for an empty host header (which I don't think is the intention). That "theoretical bit" was really just to highlight the "confusing/ambiguous-looking" nature of that expression. (TBH, this looks like a case of having incorrectly applied elements you often see used in a "positive expression" to a "negated expression".)

Since a negated condition is being used, you should really just be checking that the host is not "www.example.com" (as written, all lowercase). End of. So, this should be written as:


RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]


And this is now successful when the host header is empty (HTTP 1.0 request). This contrasts with the opposite / positive expression, where you would make it optional and case-insensitive:


RewriteCond %{HTTP_HOST} ^(example\.com)?$ [NC,OR]


In this "positive expression", the host is either "example.com" (any case) or is empty... then redirect to "www.example.com".
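
The two forms can be compared side by side in Python. This is a sketch only: Python's re module stands in for Apache's regex engine, the function names are invented, and the leading "!" is modelled by testing for no match.

```python
import re


def negated_exact(host):
    # !^www\.example\.com$  (no [NC]): succeeds unless the host is exactly
    # the lowercase canonical name.
    return re.search(r"^www\.example\.com$", host) is None


def positive_optional(host):
    # ^(example\.com)?$ [NC]: succeeds when the host is "example.com" in
    # any case, or empty.
    return re.search(r"^(example\.com)?$", host, re.I) is not None


hosts = ["", "example.com", "EXAMPLE.COM", "www.example.com",
         "WWW.EXAMPLE.COM", "other.example.com"]
for host in hosts:
    print(repr(host), negated_exact(host), positive_optional(host))
```

The negated form also catches an uppercase WWW.EXAMPLE.COM and any other hostname, whereas the positive form only reacts to the bare domain or an empty host.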
2:31 am on Sept 6, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


"This should be OK, providing you don't use any other subdomains (or domains) that resolve to the same place. Since it redirects everything to "www.example.com", it doesn't simply canonicalise "example.com". This is not a problem with your original rule (or my suggestion above)."

i would have expected other hostnames in the configuration would be described in the problem statement as that would be an unusual condition.
the usual exceptions to this are the typical www subdomain and possibly wildcard subdomain configurations, both of which are properly handled by this:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC,OR]

"However, the first RewriteCond directive is a bit confusing/ambiguous-looking and should be simplified IMO. The first condition uses a negated pattern that is also entirely optional."

this condition succeeds if the Host HTTP Request header is neither blank nor exactly the canonical hostname (case insensitive).
you will find this suggested code snippet described in hundreds of threads in this forum btw.
with some hosting services the Host header value will always be lower cased, making the [NC] flag unnecessary but it won't hurt.
if you are on shared hosting the Host value will likely be populated even for HTTP 1.0 requests that don't send a Host HTTP Request header, so that makes the "optionalizing" of the pattern also unnecessary but again it won't hurt.
3:51 am on Sept 6, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15753
votes: 826


"For any legitimate request the host is always non-empty, so it's always successful"
whitespace, look again. The negation means “if the host is NOT (exactly suchandsuch OR exactly nothing)”. That’s why the opening and closing anchors are essential.

I should point out that if you are on shared hosting, or in any situation using a <VirtualHost> envelope, the “or nothing” option is almost certainly superfluous, simply because requests without a Host: header will never reach your site in the first place. So you could shave three bytes from your htaccess with no ill effects. The [NC] flag is a whole nother matter for fruitful discussion...
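
That reading of the negated pattern, and the role of the anchors, can be confirmed with a short Python sketch. Python's re module stands in for Apache's regex engine here, and the function names and the evil.test hostname are invented for illustration.

```python
import re

ANCHORED = re.compile(r"^(www\.example\.com)?$", re.I)
UNANCHORED = re.compile(r"(www\.example\.com)?", re.I)


def cond_succeeds(host):
    # The leading "!" negates the match: succeed only when the host is
    # neither empty nor exactly the canonical name.
    return ANCHORED.search(host) is None


def cond_without_anchors(host):
    # Without ^ and $ the optional group can match an empty string anywhere,
    # so the pattern always matches and the negated condition never succeeds.
    return UNANCHORED.search(host) is None


hosts = ["", "www.example.com", "WWW.EXAMPLE.COM", "example.com",
         "www.example.com.evil.test"]
for host in hosts:
    print(repr(host), cond_succeeds(host), cond_without_anchors(host))
```

With the anchors in place only the last two hosts trigger the redirect; without them nothing ever does.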
4:34 pm on Sept 6, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 4, 2002
posts: 1908
votes: 3


@phranque sorry, I misread your statement and didn't see "log file". I just downloaded the log file but don't have a log analyzer program to help me read it. I understand most of what is written there but have no idea which entry to look for now (2 days later).

This site does not have any other domains or subdomains. All URLs are in lower case. It is on shared hosting.

I'm really confused by the discussion above and have no idea which version to use now.

I tried this (as first line)
RewriteCond %{HTTP_HOST} ^(example\.com)?$ [NC]

and typed the URL into the location bar without the www, but it won't bring up the www version of the site.

If I use this:
RewriteCond %{HTTP_HOST} example\.com [NC]

It doesn't redirect to www either.

Nor does this:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]

Also, it doesn't bring up the www version if I remove everything in location but the domain name.

Can someone help me rewrite this line?