
Google News Archive Forum

    
404 gets high rank!?
#4 is File Not Found for my keyphrase!
WileE

10+ Year Member



 
Msg#: 10186 posted 1:34 am on Mar 9, 2003 (gmt 0)

How does a 404 page (File Not Found) end up with #4 for my keyphrase!? I'm #10 (#8 in www2!).

--

Just to clarify, the actual Google SERP has the text "Error - File Not Found" as the link for result #4, and the description is:
"Error - File Not Found The link you have followed may have been moved,
replaced, or removed. Please proceed back to the top level ... "

It's even cached as such.

[edited by: WileE at 1:37 am (utc) on Mar. 9, 2003]

 

heini

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 10186 posted 1:36 am on Mar 9, 2003 (gmt 0)

Very likely that page has lots of inbound links carrying the search phrase. When you look at the cached version, does it say: These terms are only in links to this page?

WileE

10+ Year Member



 
Msg#: 10186 posted 1:39 am on Mar 9, 2003 (gmt 0)

Indeed. "These terms only appear in links pointing to this page".

Still, why wouldn't google want to completely strip such pages from their results?

heini

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 10186 posted 1:45 am on Mar 9, 2003 (gmt 0)

I guess they would want to do so, but hey, nobody is perfect.
In any case, that page occupying a top spot is good news for the competing pages - just imagine if a really good page were in that spot to compete against.

Jack_Straw

10+ Year Member



 
Msg#: 10186 posted 1:51 am on Mar 9, 2003 (gmt 0)

"It's even cached as such."

I doubt that google would cache a true 404 page. I have seen pages that look like 404 pages (the page contents say "page not found" and such), but they are not actually a 404 page because the http status for the page is 200.

I wonder if the http status for the page in question is 200 (ok) instead of 404 (not found). You can check that with:

[searchengineworld.com...]
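
For anyone who would rather check from a script than a web tool, here is a minimal Python sketch of the same idea. It is only an illustration: the function names, the User-Agent string, and the phrase list are assumptions, not anything taken from the tool linked above. Note that urllib follows redirects, so a redirect to an error page that answers 200 will show up as 200 here.

```python
import urllib.error
import urllib.request

# Phrases that suggest an error page; this list is an illustrative assumption.
ERROR_PHRASES = ("file not found", "page not found", "404")


def fetch_status(url, timeout=10):
    """Return (status_code, body_text) for a GET request to url."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "status-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the real status is on the exception.
        return err.code, err.read().decode("utf-8", errors="replace")


def looks_like_soft_404(status, body):
    """True when the body reads like an error page but the status says 200."""
    text = body.lower()
    return status == 200 and any(phrase in text for phrase in ERROR_PHRASES)
```

Calling `looks_like_soft_404(*fetch_status(url))` on the page in question would flag the situation described here: error text in the body, 200 in the header.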

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 10186 posted 1:56 am on Mar 9, 2003 (gmt 0)

A common reason for this is an error in the ErrorDocument directive on an Apache server.

If the webmaster uses the directive in the form:
ErrorDocument 404 http://www.example.com/my_error_page.html
the server will return a 302-Moved Temporarily status instead of a 404.
(See the warning about this in the Apache ErrorDocument documentation.)

The correct format is:
ErrorDocument 404 /my_error_page.html

This is the first thing to check if one of your 404 pages appears in the index.

Jim
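
To see the difference between the two ErrorDocument forms from the outside, it helps to look at the first response only, without following redirects. A rough Python sketch, assuming nothing about the site beyond plain HTTP or HTTPS; the helper names `raw_status` and `diagnose` are made up for illustration:

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def raw_status(url, timeout=10):
    """Send one GET and return (status, Location header); redirects are not followed."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def diagnose(status, location):
    """Turn the first response into a short human-readable verdict."""
    if status == 404:
        return "true 404"
    if status in REDIRECT_CODES:
        # A full-URL ErrorDocument shows up here as a 302 to the error page.
        return f"redirect to {location}"
    if status == 200:
        return "200 OK (a missing page answering 200 can be indexed)"
    return f"other status {status}"
```

For example, `diagnose(*raw_status("http://www.example.com/no-such-page"))` (a hypothetical URL) should report a redirect if the full-URL form of the directive is in use, and "true 404" with the local-path form.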

WileE

10+ Year Member



 
Msg#: 10186 posted 1:58 am on Mar 9, 2003 (gmt 0)

You're right. It's returning a code of 200... which I suppose means that they're handling ALL pages at that site with some pre-processor or something, and if the page doesn't exist, they paint a 404ISH page. Odd.

I suppose it's better than an actual competitor site, but really I'd rather just be one higher, since that site doesn't exist!

BTW, answers within minutes. I love this place!

Jack_Straw

10+ Year Member



 
Msg#: 10186 posted 2:24 am on Mar 9, 2003 (gmt 0)

Yes, it sounds like they have a general 404 processor, but neglected to set the http status properly. Most search engines will check for this condition and ban the site automatically within months. SE's generally don't like poorly implemented general 404 processors because they can so easily generate an infinite number of garbage pages and confuse the poor spiders.

The site will be banned even more quickly if they have no explicit robots.txt file. If that is the case, a request for robots.txt goes to their general 404 processor, which should say 404 (not found) but instead says 200 (ok) and serves a VERY invalid robots.txt.
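
A rough Python sketch of what a catch-all handler looks like when it does set the status properly. Everything here (the `PAGES` table, the handler name, the page text) is hypothetical and not how the site in question is built; the point is only that the friendly error page goes out with a 404, including for /robots.txt when no such file exists.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, for illustration only.
PAGES = {"/": b"<h1>Home</h1>"}
NOT_FOUND_BODY = b"<h1>Error - File Not Found</h1>"


def respond(path):
    """Map a request path to (status, body); unknown paths get a real 404."""
    body = PAGES.get(path)
    if body is None:
        # Same friendly page a visitor would see, but with the correct status.
        return 404, NOT_FOUND_BODY
    return 200, body


class CatchAllHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Bind the handler to localhost; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), CatchAllHandler)
```

The broken setup described above corresponds to `respond` returning 200 in the not-found branch, which is exactly what lets the error page get indexed.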

canuck

10+ Year Member



 
Msg#: 10186 posted 3:05 am on Mar 9, 2003 (gmt 0)

I have one of my error pages listed in Google, despite it showing a true "404" on the SearchEngineWorld tool shown in the posts above.

What else may I be missing to allow these 404 pages into the index?

- canuck

canuck

10+ Year Member



 
Msg#: 10186 posted 3:09 am on Mar 9, 2003 (gmt 0)

Upon further examination my 404 webpage doesn't have the usual "404 Not Found" in the page's title... I assume this is the problem?

- canuck

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 10186 posted 3:29 am on Mar 9, 2003 (gmt 0)

canuck,

No, they just look at the server response code. Yours, being correct, makes your situation pretty unique.

You may want to check for a "silent" redirect to your 404 page, for example, a mod_rewrite redirect without an [R] flag. If you're not on Apache, this is not applicable.

Jim

canuck

10+ Year Member



 
Msg#: 10186 posted 7:39 am on Mar 9, 2003 (gmt 0)

jdMorgan,

Thanks, we do operate on Apache so I will definitely look into this.

- canuck

percentages

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10186 posted 7:58 am on Mar 9, 2003 (gmt 0)

>I guess they would want to do so, but hey, nobody is perfect.

LOL....The amount of time it takes Google to realize a page no longer exists is totally ridiculous. I deleted several pages 6 months ago that are still indexed and were producing 404s. I recently put them back with a Javascript redirect to the home page.

404s must cause immense frustration to many people. They think the site is down, when in fact it is Google that is causing the problem.

Rule of thumb now: never delete any page. If it is out of date and totally useless, just put a Javascript redirect on it, or a server-side permanent redirect if you want to go to that amount of effort and untidiness.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved