
Google News Archive Forum

    
how long until Google ignores an orphan page
LateNight




msg:201124
 8:46 am on Mar 8, 2004 (gmt 0)

I have isolated a page that now has no inbound links, either from other sites or from the site itself. Google still seems to be finding it... should I just delete the page? I have a custom error page, so deleting the page would work. How do others put an end to a page, other than the redirect stuff?

 

seindal




msg:201125
 4:44 pm on Mar 8, 2004 (gmt 0)

If you want it to go away, just remove it. It is not illegal to return a 404 Not Found response :-)

walkman




msg:201126
 5:04 pm on Mar 8, 2004 (gmt 0)

Delete it and submit the previous URL to Google. It will eventually visit and find nothing.

t2dman




msg:201127
 10:01 pm on Mar 8, 2004 (gmt 0)

It can take a while. I have an 11 Feb Google cache page for a PR5 page I deleted on 14 Feb. I have a link to the defunct URL from the PR6 index page, and it gets permanently redirected back to the index page. I thought the link would mean Google would drop it fast, since new pages show up in a matter of days.

g1smd




msg:201128
 10:44 pm on Mar 8, 2004 (gmt 0)

For search terms where fewer than ~50 results are returned, the page could continue to show up as a Supplemental Result almost forever.

Google keeps those critters for a verrry long time.

ciml




msg:201129
 10:28 am on Mar 9, 2004 (gmt 0)

Yep, I can see orphans from November. Linking to the URL and returning 404 from it is the quickest easy way to get rid of URLs you don't want.

sem4u




msg:201130
 10:33 am on Mar 9, 2004 (gmt 0)

There seem to be quite a few 404s out there in the index at the moment :(

ciml




msg:201131
 11:32 am on Mar 9, 2004 (gmt 0)

sem4u, 404s normally remain in Google when the URLs are /robots.txt excluded (in which case Googlebot cannot see the HTTP header) or when there is no link to the page (in which case it can take a long while for the robot to request it).
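
To illustrate the robots.txt case: a crawler that obeys the exclusion never requests the URL, so it never gets to see the 404. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `www.example.com` host and the `/old-page.html` path are made up for the example, and real Googlebot behaviour may differ in detail.

```python
from urllib.robotparser import RobotFileParser


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that excludes the page we deleted.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-page.html
"""

# The deleted page is excluded, so a polite robot never asks for it
# and never learns that it now returns 404.
blocked = crawler_may_fetch(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/old-page.html"
)

# Any other page is still open to the robot.
allowed = crawler_may_fetch(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"
)
```

If the aim is to get a dead URL dropped, this suggests leaving it out of robots.txt so the robot can actually fetch it and see the error status.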

I don't think I've come across a case of Google fetching URLs that return 404, and then not removing them speedily.

beakertrail




msg:201132
 3:41 pm on Mar 9, 2004 (gmt 0)

I have had similar errors with Google continuing to index orphan pages.

There is likely to be a period of time where the crawler is served a 404 but does not remove the page from the listing in case it was an accidental deletion.

I'm not sure how long that period is, though. Where major reconstructions have meant 404s being returned frequently until the next reindex, I have customised the 404 page to provide more helpful information, similar to a sitemap.

Beaker

jtbell




msg:201133
 4:32 pm on Mar 9, 2004 (gmt 0)

>If you want it to go away, just remove it. It is not illegal to return a 404 Not Found response :-)

Better yet, remove it and set up a redirect that returns a 410 Gone response. Google might drop the dead listing quicker that way, since this makes clear that it's not a temporary error caused by forgetting to update a link.
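
As a rough illustration of the 404 versus 410 distinction, here is a small sketch using Python's standard `http.server`. The paths in `REMOVED_PATHS` and `LIVE_PAGES` are hypothetical, and this is only a toy stand-in for whatever your real web server does (on Apache the usual route would be a server directive rather than code).

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical data: pages deleted on purpose, and pages still live.
REMOVED_PATHS = {"/old-page.html"}
LIVE_PAGES = {"/": b"<h1>Home</h1>"}


def status_for(path: str) -> HTTPStatus:
    """Pick the status code for a request path."""
    if path in REMOVED_PATHS:
        return HTTPStatus.GONE        # 410: removed deliberately
    if path in LIVE_PAGES:
        return HTTPStatus.OK          # 200: page exists
    return HTTPStatus.NOT_FOUND       # 404: never heard of it


class GoneAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for(self.path)
        # Live pages get their content; everything else gets the
        # status phrase ("Gone" or "Not Found") as a tiny body.
        body = LIVE_PAGES.get(self.path, status.phrase.encode("utf-8"))
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, pass `GoneAwareHandler` to `http.server.HTTPServer` and request `/old-page.html`.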

carraig




msg:201134
 7:47 pm on Mar 9, 2004 (gmt 0)

Hello,
I would be grateful for any clarification here.

I have several almost-duplicate pages up at the moment, as I don't want visitors hitting a 404 Page Not Found. So should I remove the content from the "almost duplicate" pages and do a redirect? Are there different flavours of redirect? Is there some flavour of redirect that Google doesn't like?

Thanks for any help.

Best,
Carraig

g1smd




msg:201135
 9:44 pm on Mar 9, 2004 (gmt 0)

Interesting that for a domain long gone, but with many links still pointing to it, a link:www.domain.com/ still brings up a valid list.

Putting the domain back online resulted in a #1 listing about 3 days later. I have now donated the domain to someone else (related subject) to put their new content on.

scottiecla




msg:201136
 3:52 am on Mar 11, 2004 (gmt 0)

>So should I remove the content from the "almost duplicate" pages and do a redirect? Are there different flavours of redirect?

Do a 301 permanent redirect from the URLs that you are moving to the pages you want to keep.

You definitely need to delete or rename pages that you don't want found. I have a few that have been orphans for a year and a half that still get traffic from Google for obscure searches.
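
For anyone unsure what a 301 looks like on the wire, here is a minimal sketch in Python's standard `http.server`. The `REDIRECTS` map and its file names are invented for the example; on a real site you would more likely configure this in the web server itself.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler
from typing import Optional

# Hypothetical map: near-duplicate URLs -> the one page worth keeping.
REDIRECTS = {
    "/widgets-old.html": "/widgets.html",
    "/widgets-copy.html": "/widgets.html",
}


def redirect_target(path: str) -> Optional[str]:
    """Return the new location for a moved path, or None if it has not moved."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            # Sketch only: a real site would serve its normal pages here.
            self.send_error(HTTPStatus.NOT_FOUND)
            return
        # 301 Moved Permanently plus a Location header naming the keeper.
        self.send_response(HTTPStatus.MOVED_PERMANENTLY)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()
```

The important part is the pairing of the 301 status with a `Location` header; a 302 would signal a temporary move instead.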

newsphinx




msg:201137
 7:50 am on Mar 11, 2004 (gmt 0)

Someone from Google says that Googlebot can find unlinked pages.

If you have content you don't want the bot to find be sure to put the robots.txt file up to keep it out. The bot, says Craig, can find content that's unlinked. That's right, the Google bot can find single pages dangling unlinked in space. He didn't explain how this happens.

Details can be found here.
[****.com ]

scottiecla




msg:201138
 2:21 pm on Mar 11, 2004 (gmt 0)

I imagine the Google Toolbar might have something to do with that...

SyntheticUpper




msg:201139
 2:59 pm on Mar 11, 2004 (gmt 0)

A nice potential spam technique (I give this a lot of thought these days, ever since Google started awarding spammers with top positions.)

Put up several dupe pages, each with different word stems (esp. in title) to ensure that at least one will rank highly. Wait for the new daft Google to pick them up, then put a robots exclude on those pages.

They'll be in the index for weeks, because G is very slow to orphan pages.

You can't be done for it either - after all - there's a no-index in place.

Do you think I approve of it? - well I don't.

Have I done it? - no.

So why am I suggesting it?

Simple - Google is turning us all into spammers. Small niche sites have suffered, but the big ones have risen, like turds in dark water, to the top.

Might as well subvert it.

rfgdxm1




msg:201140
 4:37 pm on Mar 14, 2004 (gmt 0)

>I have isolated a page that now has no inbound links from other sites and the site itself.

Are you absolutely, positively sure of that? Somebody could have posted that URL on some minor message board, and Google keeps finding this page that way. And, it should be noted that even if there are really no inbound links now, this could change tomorrow. Someone might find that URL tomorrow searching Google, and post it on a message board on the Net somewhere. So long as you leave this page up, it may never disappear from Google.

petehall




msg:201141
 11:57 am on Mar 15, 2004 (gmt 0)

If you use a custom 404 error page, how do you return a 404 error code or even a 410 error code as someone suggested?

I have just added the following code to our custom 404 error pages - hopefully this will remove all our old pages?

<meta name="robots" content="noindex">
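
On the question of returning a real 404 code together with a custom page: the status line and the HTML body are set separately, so a friendly page can still carry a 404 (or 410) status. Here is a minimal sketch with Python's standard `http.server`; the page text and the names `CUSTOM_404_HTML` and `build_not_found_response` are made up for the example, and your own server setup will have its own way of doing this. A noindex tag on its own does not change the status code, so it is worth checking what your custom page actually returns.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical custom error page, including the noindex tag from the post.
CUSTOM_404_HTML = (
    "<html><head><title>Page not found</title>"
    '<meta name="robots" content="noindex"></head>'
    "<body><h1>Sorry, that page has gone.</h1>"
    '<p><a href="/">Back to the home page</a></p></body></html>'
)


def build_not_found_response():
    """Return (status, headers, body) for the custom error page."""
    body = CUSTOM_404_HTML.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    # The friendly HTML goes in the body; the status stays 404.
    return HTTPStatus.NOT_FOUND, headers, body


class Custom404Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Sketch only: every request is treated as a missing page.
        status, headers, body = build_not_found_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

Swapping `HTTPStatus.NOT_FOUND` for `HTTPStatus.GONE` would give the 410 variant with the same page.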


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved