My question is... missing.html is shown whenever the server is unable to find the requested page, so it acts as a sort of custom 404. When a user requests domain.com/nosuchpage.html, the server displays the missing page, which is a copy of the home page, but the nonexistent URL stays in the address bar. Since several such pages are indexed and showing up in search results, will Google see these pages and consider them duplicates? If so, will this cause PR problems for the rest of my site?
What about the possibility of creating a dynamic missing page, so that it would be different every time it's accessed?
Any info on the problems or benefits of using missing.html would be helpful.
Thanks.
I've found that because the page "missing.html" itself isn't linked to from any other page, Google just doesn't find it. It has been sitting in my account for about five months, has never been fetched by Googlebot and isn't listed anywhere. If you're really worried, you may also be able to block that page using robots.txt. I figure there's never any reason for search engines to find my 404 page anyway, as it's not real content, and keeping it out of the index removes any duplicate content problem.
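If you want to see what such a robots.txt rule would and wouldn't cover, here is a rough Python sketch using the standard library's robots.txt parser. The rule text, the `is_blocked` helper name, and the domain.com URLs are only illustrative assumptions, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks the error page itself.
ROBOTS_TXT = """\
User-agent: *
Disallow: /missing.html
"""


def is_blocked(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above forbid `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


# Example calls (placeholder URLs):
# is_blocked("http://domain.com/missing.html")     -> blocked
# is_blocked("http://domain.com/nosuchpage.html")  -> not blocked
```

Note that the rule only matches the literal /missing.html path. A request for some other nonexistent URL is not covered by it, so the response code the server sends for those still matters.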
You can check your server response for missing.html using the Server Header Checker here on WebmasterWorld. If it shows a 404 response code, then you'll be just fine with Google.
Look under Control Panel (at the top left of your screen), then click on the Server Headers link in the left-side nav bar on that page. Type in the URL to a non-existent page on your site, and check the resulting response code.
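If you'd rather run the same kind of check from your own machine, something along these lines should do it. This is only a sketch in Python: `fetch_status` is a made-up helper name and the example.com URL is a placeholder. It deliberately does not follow redirects, so if the server redirects to missing.html you will see the 301/302 rather than a 404.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the status code of the first response the server sends for `url`.

    Redirects are not followed, so a redirect to an error page is reported
    as the redirect code itself, not as the code of the page it points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        # We only care about the status line, so the body is never read.
        conn.close()


# Example call (placeholder URL, needs network access):
# print(fetch_status("http://www.example.com/nosuchpage.html"))
```

A result of 404 for a page that doesn't exist is what you want to see; a 200 would mean the copy of the home page is being served as if it were a real page.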
Jim
Thanks for the help. This makes me feel more comfortable, since the missing page does contain the same content as another actual page of my site.
For now, if Googlebot sees an
HTTP/1.1 404 Not Found
[or "HTTP/1.0 404 I Am Tired"; the "404" is the part that matters, not the text after it]
in the server response headers, it stops reading.
[At least, that is what I've understood from watching my logs.
But don't trust this too much.
Maybe tomorrow Googlebot will go on to read the page content as well ;)]
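To make the point about the number versus the wording concrete, here is a small Python sketch that pulls the code out of a status line and ignores the reason phrase. The function names are my own, and this says nothing about how Googlebot is actually implemented; it only illustrates the layout of the status line.

```python
def status_code(status_line: str) -> int:
    """Extract the numeric code from an HTTP status line.

    The line has the form "<version> <code> <reason phrase>". Only the
    code is machine-readable; the reason phrase is free text.
    """
    parts = status_line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def is_not_found(status_line: str) -> bool:
    """True when the status line carries a 404, whatever the wording."""
    return status_code(status_line) == 404
```

Both "HTTP/1.1 404 Not Found" and "HTTP/1.0 404 I Am Tired" come out as 404 here, while "HTTP/1.1 200 OK" does not.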
cminblues
My question is... the missing.html shows whenever the server is unable to find the requested page, it acts as a sort of custom 404.
I'm assuming you're running Apache; you haven't said, so forgive me if not.
You're quite right here. If you don't have an ErrorDocument statement in your .htaccess, then your host has probably set this up in the server config files.
If it makes you feel better, you can have a per-directory missing.html (actually, it doesn't have to be called missing.html; you can name it anything you want).
Just set up a separate .htaccess in each directory you want to customize (if they requested something in that directory, content from it might be more relevant).
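Example: yourdomain.com has folders named dir1 and dir2.
So in the .htaccess in /dir1:
ErrorDocument 404 /dir1/custommissing1.html
and in the .htaccess in /dir2:
ErrorDocument 404 /dir2/custommissing2.html
etc.
Give each path with a leading slash, relative to the document root; otherwise Apache may treat the argument as a literal text message rather than a file. The custommissing pages can also be given as full http:// URLs if you'd like, but Apache then answers with a redirect and the client no longer receives the 404 status, so local paths are the safer choice.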
The advantage of doing this is that the user lands 'closer' to the content he or she was looking for, especially if you have a large site. I'd always include a link back to the home page, then one to a site map (if you have one), and then links to the most relevant pages or starting points in the requested directory.