404 gets high rank!?

#4 is File Not Found for my keyphrase!

     
1:34 am on Mar 9, 2003 (gmt 0)

New User

10+ Year Member

joined:Feb 6, 2003
posts:15
votes: 0


How does a 404 page (File Not Found) end up with #4 for my keyphrase!? I'm #10 (#8 in www2!).

--

Just to clarify, the actual Google SERP has the text "Error - File Not Found" as the link for result #4, and the description is:
"Error - File Not Found The link you have followed may have been moved,
replaced, or removed. Please proceed back to the top level ... "

It's even cached as such.

[edited by: WileE at 1:37 am (utc) on Mar. 9, 2003]

1:36 am on Mar 9, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Jan 31, 2001
posts:4404
votes: 0


Very likely that page has lots of inbound links carrying the search phrase. When you look at the cached version, does it say: These terms are only in links to this page?
1:39 am on Mar 9, 2003 (gmt 0)

New User

10+ Year Member

joined:Feb 6, 2003
posts:15
votes: 0


Indeed. "These terms only appear in links pointing to this page".

Still, why wouldn't google want to completely strip such pages from their results?

1:45 am on Mar 9, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Jan 31, 2001
posts:4404
votes: 0


I guess they would want to do so, but hey, nobody is perfect.
In any case, that page occupying a top spot is good news for the competing pages - just imagine if a really good page were in that spot to compete against.
1:51 am on Mar 9, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 18, 2002
posts:126
votes: 0


"It's even cached as such."

I doubt that google would cache a true 404 page. I have seen pages that look like 404 pages (the page contents say "page not found" and such), but they are not actually a 404 page because the http status for the page is 200.

I wonder if the http status for the page in question is 200 (ok) instead of 404 (not found). You can check that with:

[searchengineworld.com...]
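
For anyone who would rather check this from a script, here is a minimal Python sketch of the same idea: fetch the page, look at the status code, and flag a page that says 200 while its text reads like an error page. The function names and the phrase list are made up for illustration and the phrase match is only a rough heuristic. Note that urlopen follows redirects, so the status reported is that of the final page.

```python
import urllib.error
import urllib.request

# Hypothetical phrases that tend to show up on error pages; adjust as needed.
ERROR_PHRASES = ("file not found", "page not found", "404")


def fetch_status(url):
    """Return (status_code, body_text); 4xx/5xx responses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the exception still carries code and body.
        return err.code, err.read().decode("utf-8", errors="replace")


def looks_like_soft_404(status, body):
    """True when the server answers 200 but the body reads like an error page."""
    text = body.lower()
    return status == 200 and any(phrase in text for phrase in ERROR_PHRASES)
```

Typical use would be something like looks_like_soft_404(*fetch_status(url)) on a URL you know does not exist on the site.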

1:56 am on Mar 9, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


A common reason for this is an error in the ErrorDocument directive on an Apache server.

If the webmaster uses the directive in the form:

ErrorDocument 404 http://www.example.com/my_error_page.html

the server will return a 302-Moved Temporarily status instead of a 404.
(See the warning about this in the Apache ErrorDocument documentation.)

The correct format is:

ErrorDocument 404 /my_error_page.html

This is the first thing to check if one of your 404 pages appears in the index.

Jim
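
If you have many config or .htaccess files to go through, a rough Python sketch like the one below can flag the risky form. It only tests whether the ErrorDocument target starts with a scheme such as http://, which is the case the Apache docs warn about. The function name is hypothetical and this is not a full Apache config parser.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_FULL_URL = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def error_document_redirects(line):
    """Classify one config line.

    Returns True if it is an ErrorDocument directive pointing at a full URL
    (Apache will redirect instead of sending the original error status),
    False if it points at a local path or message text,
    and None if the line is not an ErrorDocument directive at all.
    """
    parts = line.strip().split(None, 2)
    if len(parts) < 3 or parts[0].lower() != "errordocument":
        return None
    target = parts[2].strip().strip('"')
    return bool(_FULL_URL.match(target))
```

Run it over each line of the file and look at any line that comes back True.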

1:58 am on Mar 9, 2003 (gmt 0)

New User

10+ Year Member

joined:Feb 6, 2003
posts:15
votes: 0


You're right. It's returning a code of 200... which I suppose means that they're handling ALL pages at that site with some pre-processor or something, and if the page doesn't exist, they paint a 404ISH page. Odd.

I suppose it's better than an actual competitor site, but really I'd rather just be one higher, since that site doesn't exist!

BTW, answers within minutes. I love this place!

2:24 am on Mar 9, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 18, 2002
posts:126
votes: 0


Yes, it sounds like they have a general 404 processor, but neglected to set the http status properly. Most search engines will check for this condition and ban the site automatically within months. SE's generally don't like poorly implemented general 404 processors because they can so easily generate an infinite number of garbage pages and confuse the poor spiders.

The site will be banned even more quickly if they have no explicit robots.txt file. If that is the case, their general 404 processor will attempt to say 404 (not found), but instead will say 200 (ok) with a VERY invalid robots.txt.
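
To show what "setting the http status properly" means for a catch-all handler, here is a small sketch as a Python WSGI app. It is only an illustration, not what the site in question runs: the PAGES table and the error text are invented. The point is that the friendly error page goes out with a 404 status line, so a request for a missing /robots.txt gets 404 rather than 200 with HTML.

```python
# Hypothetical table of pages that really exist; anything else is "not found".
PAGES = {"/": "<h1>Home</h1>"}

NOT_FOUND_BODY = "<h1>Error - File Not Found</h1>"


def app(environ, start_response):
    """Catch-all handler that keeps the custom error page but sends a real 404."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        # Same friendly page for the visitor, but the status tells spiders the truth.
        status, body = "404 Not Found", NOT_FOUND_BODY
    data = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(data))),
    ]
    start_response(status, headers)
    return [data]
```

To try it locally, the standard library's wsgiref.simple_server.make_server can serve the app on a spare port.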

3:05 am on Mar 9, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 5, 2003
posts:266
votes: 0


I have one of my error pages listed in Google, despite it showing a true "404" on the SearchEngineWorld tool shown in the posts above.

What else may I be missing to allow these 404 pages into the index?

- canuck

3:09 am on Mar 9, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 5, 2003
posts:266
votes: 0


Upon further examination my 404 webpage doesn't have the usual "404 Not Found" in the page's title... I assume this is the problem?

- canuck

3:29 am on Mar 9, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


canuck,

No, they just look at the server response code. Yours, being correct, makes your situation pretty unique.

You may want to check for a "silent" redirect to your 404 page, for example, a mod_rewrite redirect without an [R] flag. If you're not on Apache, this is not applicable.

Jim

7:39 am on Mar 9, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 5, 2003
posts:266
votes: 0


jdMorgan,

Thanks, we do operate on Apache so I will definitely look into this.

- canuck

7:58 am on Mar 9, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 1, 2002
posts:1580
votes: 0


>I guess they would want to do so, but hey, nobody is perfect.

LOL....The amount of time it takes Google to realize a page no longer exists is totally ridiculous. I deleted several pages 6 months ago that are still indexed and were producing 404's. I recently put them back with a Javascript redirect to the home page.

404's must cause immense frustration to many people: they think the site is down, when in fact it is Google that is causing the problem.

Rule of thumb now: never delete any page. If it is out of date and totally useless, just put a Javascript redirect on it, or a server side permanent redirect if you want to go to that amount of effort and untidiness.