Forum Moderators: Robert Charlton & goodroi

What to do with no longer existing pages? 301? 302? 404?

How to handle the deleted pages and keep Google's Friendship

         

silverbytes

12:37 pm on Nov 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Though this has been discussed a few times in some respects, I'm still looking for a straight answer about how to handle this "problem".
Your website adds new pages and deletes others (in my case I often need to insert pages for 1 or 2 months; some of these will remain but others need to be deleted). When you delete a crawled page, Google will keep looking for that page forever (I see it in my logs).

So the options are:

1) Ignoring that and letting Google find a 404 every time (but users who click a search result that leads to a 404 page will avoid clicking on other valid results from your website later)

2) Setting a temp redirection 302 to another page in your website (with similar content maybe) but I don't know the cons of this option

3) Setting a permanent redirection 301 to another page in your website (with similar content maybe) but I don't know the cons of this option

4) Others

What do you say?

g1smd

12:40 am on Nov 6, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would use a 301 for just a few weeks if another page contains useful information that would have fitted what the searcher was seeing on the old snippet in the search results for the old page.

Otherwise serve a 404. Where Google fails to remove a page, you can always upload a page with no content, and one meta tag: <meta name="robots" content="noindex"> in it.
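
A minimal sketch of the short-lived 301, assuming Apache with mod_alias and an .htaccess file. Both paths and the hostname are made up for illustration:

```apache
# Permanently redirect the deleted page to the page that best matches
# what the old search snippet promised (hypothetical URLs).
Redirect 301 /old-page.html http://www.example.com/similar-page.html
```

Taking the line out again after a few weeks lets the deleted URL fall back to a plain 404.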

cws3di

2:04 am on Nov 6, 2005 (gmt 0)

10+ Year Member



If you really, really want the spiders to never, never come back to check that url:

RewriteRule ^directory/ - [G]
RewriteRule ^page1\.html$ - [G]

Instead of serving a 404, this serves a 410 Gone, indicating to the spiders that you are purposely telling them the page is really gone, rather than just missing.
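
For context, a slightly fuller sketch of the same idea, assuming the rules sit in the .htaccess file in the site root and mod_rewrite is available. The /gone.html error page is a hypothetical addition, not part of the original suggestion:

```apache
# Turn on the rewrite engine; without it the RewriteRule lines are ignored.
RewriteEngine On

# Answer 410 Gone for everything under /directory/ ...
RewriteRule ^directory/ - [G]
# ... and for one specific page (dot escaped, pattern anchored at both ends).
RewriteRule ^page1\.html$ - [G]

# Optional: show visitors a friendly page while still sending the 410 status.
ErrorDocument 410 /gone.html
```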

jdhuk

6:07 am on Nov 6, 2005 (gmt 0)

10+ Year Member



I recently had to delete 100 pages of something that is now no longer available. So from "Error pages" in cPanel, 404 (Wrong page) I simply used a

<meta http-equiv="refresh" content="2;URL=/">

with some text "sorry this page blah blah blah"

Forgetting the "it could annoy the searcher" discussion, is this something you should never do for fear of a penalty or something?

jeez so many worries about being penalized.....

g1smd

5:50 pm on Nov 6, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't like meta refresh. Search engines do unpredictable things when they see one... including indexing the content of the target page against the URL of the blank page that only contains the redirect.
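
One way to show a friendly message without a meta refresh is a custom error document that keeps the 404 status. A sketch, assuming Apache and a hypothetical /not-found.html page:

```apache
# A local path keeps the original 404 status code on the response.
ErrorDocument 404 /not-found.html

# Avoid a full URL here: Apache would then send a redirect to that URL,
# so the client would not receive the 404 status.
# ErrorDocument 404 http://www.example.com/not-found.html
```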

walkman

9:31 pm on Nov 6, 2005 (gmt 0)



cws3di,
will RewriteRule ^directory/ - [G]
delete all the pages within that directory from Google? e.g., will /directory/page.html and /directory/pagea.html etc. disappear just with that line of code?

jdhuk

9:39 pm on Nov 6, 2005 (gmt 0)

10+ Year Member



Cheers g1smd, I have disabled that now.

Back to walkman's question....

cws3di

10:12 pm on Nov 6, 2005 (gmt 0)

10+ Year Member




Yes, that will serve a 410 for everything down that directory path.

I also put in a line on that post for just a single page.
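
If mod_rewrite is not available, mod_alias can probably do the same job. A sketch under that assumption, using the same example paths:

```apache
# 410 Gone for every URL that starts with /directory/
Redirect gone /directory/
# 410 Gone for a single page
Redirect gone /page1.html
```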

Remember that the spider needs to visit the URL in order to be served the 410 Gone. I say this to warn you that it might not work for Supplemental listings in Google, since those never seem to get spidered again.

The 410 Gone works immediately for yahoo and msn, since they seem to deep crawl more often these days.

g1smd

10:26 pm on Nov 6, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If the page is disallowed in robots.txt then Google will also not be able to get in to see the page status, so do not list those URLs there either.

kevinpate

10:46 pm on Nov 6, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> The 410 Gone works immediately for yahoo and msn,

Well, that's a tad too strong, at least in my experience. For whatever reason, both msnbot and slurp took well over a month of gobbling down the same group of 410 codes before finally deciding I meant it, many times coming back 3x in a day for the same 410'd file. For me, Gbot actually caught on a tad quicker, but it still took a little over 3 weeks, so none of these bots were on my send-them-flowers list.

All 3 of the major players come back for a 404 for months and months with no end in sight, so if it's gone, tell them it's gone with a 410. They definitely seem to translate 404 Not Found to mean
"not found this time, but hey, feel free to come on back and check again because who knows, maybe it will be here then and this is just fluke number 600, 601, 602 ..."

cws3di

10:59 pm on Nov 6, 2005 (gmt 0)

10+ Year Member




You are right kevinpate - "immediately" was too strong. I should be more careful about how I state something.

I should have said that, in my experience, Yahoo and MSN pick up on the 410 sooner than G because my sites get deep crawled more often by Yahoo and MSN than by G, and I have noticed that the pages drop from their indexes and serps within a relatively short period of time.

I have also seen in my logs that they come back checking the url afterwards, whether 404 or 410. But I see it as a plus that the 410 actually gets the page taken out of the index at Y and M.

walkman

11:20 pm on Nov 6, 2005 (gmt 0)



>> Remember that the spider needs to visit the URL in order to be served the 410 Gone. I say this to warn you that it might not work for Supplemental listings in Google, since those never seem to get spidered again.

I had "removed" that /directory via Google's remove service (I think it just hides it), but I re-linked it today from my index page and allowed access to it in robots.txt. We'll see. If they go for it, a nice plate of 410 is waiting :)