Forum Moderators: open
Well, I sure don't know. I haven't ever followed a situation like this before.
I'd suggest sending a report to Google through the search-quality or spam-reporting links on their Web site - I'm sure they'd want to know about freshbot raising this Lazarus from the dead!
I'd be interested in learning what happens if you continue tracking this problem until it's resolved and care to post an update.
And lest my manners be found wanting due to my previous inattention, Welcome to WebmasterWorld [webmasterworld.com]! :)
Jim
2odd...
...or anybody else who can tell me that this result COULD be natural?
If not, I'll keep you posted to any changes or response. =)
I have a site that ranks #3, and it is always #3. One day when freshbot came, my site was down, so what got indexed was a 404 error. But it still ranked #3.
The reason is simple: the main search results are based on the index captured in the deep crawl. Freshbot (although it affects ranking a little) mainly exists to show fresh content in the search results.
The next day, when freshbot came again (or maybe the fresh content was simply replaced by the original deep-crawl content), my site was back with its title.
So don't worry about this - it is not spam. The page did exist when the deep crawl reached it. If the site owner has removed the page (presumably without knowing he ranks #1), then after the next deep crawl his site will no longer be listed.
And why care about a 404 that ranks #1?!? Visitors won't click on a 404 link!
1) It DOES matter because they are taking up the top position and searchers don't know it's a 404 until they click on it. Also gives them a chance to choose to go to a sponsored site. Of course this knocks us down lower in Yahoo too. What if there were 50 error sites ahead of yours and you ranked 51? That would bug you, no?
2) The point that I'm trying to drive home is that this 404 page already HAS been indexed in a deep crawl. That's why even the cache has no content. It's already been dumped and brought BACK by Freshbot. That's why it's such a mystery to me.
1. The option at the bottom of the SERP: 'Dissatisfied with your search results? Help us improve.'
or
2. Click on the blue smiley ('Vote against this page') in the toolbar, after opening the 404-page.
There are other possibilities. The first is that you define an error page that shows the user a real page with text like "sorry, not found on our site". The second is that the server doesn't return the error page with a 404 header but with a 200 (which is also the case for the first one).
Then G is not able to delete those pages.
Try this search [google.com] and see what I mean. A lot of these pages aren't a 404 reply for the spider, so how should it know the page it requested no longer exists?
--> BTW, have a look at the 3rd result, that's really funny!
[edited by: oLeon at 10:15 am (utc) on Mar. 19, 2003]
[google.com...]
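The "soft 404" situation described above is easy to check for yourself: request the dead URL and look at the status code the server actually sends, rather than at what the page says. A minimal Python sketch (the URL and the error-page check are placeholders, not anything Google publishes):

```python
import urllib.request
import urllib.error

def fetch_status(url: str) -> int:
    """Return the HTTP status code the server actually sends for a URL."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        # urllib raises on 4xx/5xx responses; e.code is the status we want
        return e.code

def is_soft_404(status: int, looks_like_error_page: bool) -> bool:
    """A 'soft 404': the page reads as an error page to a human,
    but the server replies 200, so a spider keeps it indexed."""
    return looks_like_error_page and status == 200
```

A genuine 404 comes back with status 404, and Google can drop the URL after recrawling it; a custom "not found" page served with status 200 looks like a perfectly live page to the spider, which is why such results linger in the index.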
If it is a genuine 404, the page may be removed. If it is a redirect, it probably would not unless you had access to the robots.txt file.
If the main URL is a frameset with good inbound links and a good title, it can be listed high with no real content. Then, if the frame source is a dead URL, you see the 404 error. Google won't list the 404 page itself (once it sees it's dead), but the frameset page returns a 200 response as normal and can be listed.
If this technique was used to point the frame source at something spammy looking (a common domain harvesting tactic) then I imagine that Google wouldn't look upon it kindly. With the destination frame just being a 404, however, I strongly suspect that the site is just malfunctioning.
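The frameset scenario can be reproduced mechanically: the frameset page itself answers 200, and only its frame source target is dead. A rough sketch of pulling the frame sources out of such a page with Python's standard HTML parser, so each one can then be status-checked separately (the markup and filenames are invented examples):

```python
from html.parser import HTMLParser

class FrameSrcParser(HTMLParser):
    """Collect the src attributes of <frame> tags in a frameset page."""
    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "frame":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)

parser = FrameSrcParser()
parser.feed('<frameset rows="20%,80%">'
            '<frame src="nav.html"><frame src="dead-page.html">'
            '</frameset>')
print(parser.sources)  # ['nav.html', 'dead-page.html']
```

Fetching the frameset URL would return 200 even if `dead-page.html` returns 404, which matches what the poster is seeing: a highly ranked listing whose visible content is an error.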
"and searchers don't know it's a 404 until they click on it."
The result shown is the Google cache. If the 404 only shows up when the searcher clicks on the link then it's because the Google cache doesn't have it as a 404. So you've got to wait a bit longer for a new crawl to pick up the 404 page.