extensionless site with custom 404 or missing problem

getting a lot of 404s for missing page

         

Lorel

7:32 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a site where I set up extensionless URLs. The URLs with extensions are on all links, but I have a redirect in htaccess to switch to extensionless. This works fine.

However, I keep seeing a lot of 404s in Google Search Console involving the custom 404 missing page.

I set up the htaccess file like this:
ErrorDocument 404 https://example.com/missing.html

I also see 302's in access logs involving the missing page like this:

40.77.167.4 - - [10/Dec/2020:09:56:37 -0700] "GET /company-history/ HTTP/1.1" 302 219 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
40.77.167.4 - - [10/Dec/2020:09:56:37 -0700] "GET /missing.html HTTP/1.1" 301 238 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

Can someone tell me if I need to change something?

not2easy

8:03 pm on Dec 10, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The error document should not list the entire URL; it should only have the local path to the page, starting with a slash:
ErrorDocument 404 /missing.html


BTW - this is only for your .htaccess file, not related to css.

tangor

5:26 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Second above, leave the protocol off!

Lorel

5:38 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I left off the url like this:

ErrorDocument 404 missing.html

and tested it and a page came up with:

missing.html - nothing else on the page so I put it back the way it was.

not2easy

8:18 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Well, missing.html has to exist, and needs to be in the root (home) directory. You need to have a custom 404 page to use one.

Your use of the full URL will cause a bunch of 302s as you noted in the OP.

Lorel

8:51 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, missing.html is in the root and works properly as long as the full URL is used.

Can someone tell me what might be wrong?

not2easy

10:26 pm on Dec 11, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> missing.html - nothing else on the page so I put it back the way it was.

You do not say either how you tested it or what you have in your page that did not show. If you entered a fictional URL in your browser and it served your "missing.html" page and there was nothing on the page it is because you have not actually created a custom html page named missing.html. The server will serve the "missing.html" page for a 404 error, but the server does not create the page or add anything to whatever you have created.

Is this hosted on a typical Apache server, and do you have the ErrorDocument lines in your .htaccess file? Have you created an HTML page and named it "missing.html"? If that page is served but has no content, it is because you have not added any content to your page.

It is entirely possible that I am missing something in this question, but my response is based on what I am reading. You do not want to use a full URL for an error document, it will cause errors.

phranque

12:34 am on Dec 12, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



see the apache ErrorDocument Directive documentation here:
https://httpd.apache.org/docs/current/mod/core.html#errordocument


> I set up the htaccess file like this:
> ErrorDocument 404 https://example.com/missing.html
>
> I also see 302's in access logs involving the missing page like this:

when you specify an "external" url:
> Note that when you specify an ErrorDocument that points to a remote URL (ie. anything with a method such as http in front of it), Apache HTTP Server will send a redirect to the client to tell it where to find the document, even if the document ends up being on the same server. This has several implications, the most important being that the client will not receive the original error status code, but instead will receive a redirect status code. This in turn can confuse web robots and other clients which try to determine if a URL is valid using the status code.
(from Apache Core Features documentation)


> I left off the url like this:
>
> ErrorDocument 404 missing.html
>
> and tested it and a page came up with:
>
> missing.html - nothing else on the page so I put it back the way it was.

with your directive coded like this, it looks like you asked apache to:
> output a simple hardcoded error message
(from Apache Core Features documentation)

note the version suggested by not2easy:
ErrorDocument 404 /missing.html

this takes advantage of the proper syntax to specify a local web-path:
> URLs can begin with a slash (/) for local web-paths (relative to the DocumentRoot), or be a full URL which the client can resolve.
(from Apache Core Features documentation)
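
To put the three forms discussed in this thread side by side, here is a minimal .htaccess sketch. It assumes missing.html sits in the DocumentRoot and uses example.com as a placeholder domain. The comments describe the behavior reported above and in the quoted documentation; only one of these lines would be active in practice.

```apache
# Form 1: full URL. Apache sends the client a redirect (the 302s seen in
# the access log), so the original 404 status code is lost.
# ErrorDocument 404 https://example.com/missing.html

# Form 2: no scheme and no leading slash. Apache treats the argument as
# literal message text, so the browser shows only the words "missing.html".
# ErrorDocument 404 missing.html

# Form 3: local web-path, relative to the DocumentRoot. The page is served
# internally and the response keeps its 404 status code.
ErrorDocument 404 /missing.html
```

Form 3 is the one to keep if the goal is for search engines to see a real 404 for missing URLs.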

tangor

9:05 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



ErrorDocument 404 missing.html
should be
ErrorDocument 404 /missing.html

missing.html MUST be in root of your www or public or httpdocs or wherever your net facing files are located by default (where your example.com/index.html file is located)

for proof, edit the content of missing.html
to read:
<p>Heck yeah, I made this custom 404!</p>

Type http://example.com/xyz.123 into your browser

If you don't see
Heck yeah, I made this custom 404!

Then something is very wrong and you have more problems to address!

More importantly, if you have addressed this issue and got it working, please let us know! :) This also helps future members and visitors to this thread to know there was a resolution!

Lorel

7:01 pm on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks everyone. After removing the full url I had left off the "/". Now it looks like this and is working:

ErrorDocument 404 /missing.html

BTW, I have noticed that anytime there is a site with documents inside folders that the missing file doesn't work unless the full url is used.

Do I need to set up a separate htaccess for the custom missing page in each folder for it to work properly?

not2easy

2:24 am on Dec 14, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> Do I need to set up a separate htaccess for the custom missing page in each folder for it to work properly?

If the content of the folders is static html pages, the missing.html error document should be shown for 404 errors in all folders.

BUT #1. If the folders' content is created dynamically, such as on a site using a database (such as WordPress), then it will often generate its own 404 error page, which can be customized.

BUT #2. There may be settings in .htaccess that can prevent inheriting directives, especially where an additional .htaccess file is used in the folders below root.

So the answer is maybe.
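
As a rough illustration of the two BUTs, here are two separate sketches (they would not live in the same file). The first is similar to the front-controller rules WordPress writes to .htaccess; the second uses a made-up folder /blog/ and a made-up page blog-missing.html, so treat both names as assumptions.

```apache
# Sketch for BUT #1: front-controller rules in the site's .htaccess.
# Any request that is not an existing file or directory is handed to
# index.php, so the application builds the 404 page, not ErrorDocument.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# Sketch for BUT #2: a hypothetical file at /blog/.htaccess.
# An ErrorDocument set in a subfolder overrides the one from the root
# .htaccess for requests under /blog/. The path is still relative to the
# DocumentRoot, not to the subfolder.
ErrorDocument 404 /blog/blog-missing.html
```

If neither situation applies, a single ErrorDocument line in the root .htaccess should cover every folder.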

phranque

5:25 am on Dec 14, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> BTW, I have noticed that anytime there is a site with documents inside folders that the missing file doesn't work unless the full url is used.

please describe in technical terms what "missing file" and "doesn't work" and "full url" each mean.

> Do I need to set up a separate htaccess for the custom missing page in each folder for it to work properly?

i assume by "custom missing page" you mean the custom 404 error document.
if you specify a url that "begin(s) with a slash (/) for local web-paths (relative to the DocumentRoot)" then it should work for missing files from any folder.

> There may be settings in .htaccess that can prevent inheriting directives, especially where an additional .htaccess file is used in the folders below root.

or settings that cause the .htaccess file to be completely ignored.
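
One common setting of that kind is AllowOverride in the main server or virtual host config. A minimal sketch, assuming a DocumentRoot of /var/www/html (a placeholder path; yours will differ, and on shared hosting you may not be able to edit this file at all):

```apache
# Goes in the main server or virtual host config, not in .htaccess.
<Directory "/var/www/html">
    # With "None", .htaccess files under this directory are ignored
    # entirely, so an ErrorDocument line in them has no effect.
    # AllowOverride None

    # ErrorDocument belongs to the FileInfo override group, so FileInfo
    # must be allowed for the directive to work from .htaccess.
    AllowOverride FileInfo
</Directory>
```

If the custom 404 page never appears no matter what is in .htaccess, this is one of the first things worth asking the host about.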