Forum Moderators: Robert Charlton & goodroi
So the options are:
1) Ignore it and let Google find a 404 every time (but users who click a search result that leads to a 404 page will avoid clicking other valid results from your website later)
2) Set a temporary 302 redirect to another page on your website (with similar content, maybe), but I don't know the cons of this option
3) Set a permanent 301 redirect to another page on your website (with similar content, maybe), but I don't know the cons of this option
4) Others
What do you say?
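For reference, options 2 and 3 might look like this in an Apache .htaccess file. This is only a sketch, assuming Apache with mod_alias enabled; the old page names and the www.example.com targets are hypothetical placeholders, not paths from this thread.

```apache
# Option 2: temporary redirect (302). Signals that the old URL may come back.
Redirect 302 /old-page.html https://www.example.com/similar-page.html

# Option 3: permanent redirect (301). Signals that the old URL has moved for good.
Redirect 301 /other-old-page.html https://www.example.com/another-similar-page.html
```

Use one or the other for a given URL, not both.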
Otherwise serve a 404. Where Google fails to remove a page, you can always upload a page with no content, and one meta tag: <meta name="robots" content="noindex"> in it.
If you really, really want the spiders to never, never come back to check that url:
RewriteEngine On
RewriteRule ^directory/ - [G]
RewriteRule ^page1\.html$ - [G]
Instead of serving a 404, this serves a 410 Gone, indicating to the spiders that you are purposely telling them the page is really gone, rather than just missing.
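The same 410 can also be sent without mod_rewrite. A minimal sketch, assuming mod_alias is available and reusing the directory and page names from the rules above; /gone.html is a hypothetical custom error page, not something from this thread.

```apache
# Return 410 Gone for a single page (no target URL is given with "gone")
Redirect gone /page1.html

# Return 410 Gone for everything under /directory/
RedirectMatch gone ^/directory/

# Optional: show visitors a friendly page along with the 410 status
ErrorDocument 410 /gone.html
```

Whichever form you use, the response status is what the spiders act on; the error page is only for human visitors.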
<meta http-equiv="refresh" content="2;URL=/">
with some text "sorry this page blah blah blah"
Forgetting the "it could annoy the searcher" discussion, is this something you should never do for fear of a penalty or something?
jeez so many worries about being penalized.....
I also put in a line on that post for just a single page.
Remember that the spider needs to visit the URL in order to be served the 410 Gone. I say this to warn you that it might not work for Supplemental listings in Google, since those never seem to get spidered again.
The 410 Gone works immediately for Yahoo and MSN, since they seem to deep crawl more often these days.
Well, that's a tad too strong, at least in my experience. For whatever reason, both msnbot and slurp took well over a month of gobbling down the same group of 410 codes before finally deciding I meant it, many times coming back 3x in a day for the same 410'd file. For me, Gbot actually caught on a tad quicker, but it still took a little over 3 weeks, so none of these bots were on my send-them-flowers list.
All 3 of the major players come back for a 404 for months and months with no end in sight, so if it's gone, tell them it's gone with a 410. They definitely seem to translate 404 Not Found to mean
"not found this time, but hey, feel free to come on back and check again because who knows, maybe it will be here then and this is just fluke number 600, 601, 602 ..."
I should have said "in my experience": Yahoo and MSN pick up on the 410 sooner than G because my sites get deep crawled more often by Yahoo and MSN than by G, and I have noticed that the pages drop from their indexes and SERPs within a relatively short period of time.
I have also seen in my logs that they come back checking the url afterwards, whether 404 or 410. But I see it as a plus that the 410 actually gets the page taken out of the index at Y and M.
I had "removed" that /directory via Google's remove service (I think it just hides it), but I re-linked it today from my index page and allowed access to it in robots.txt. We'll see. If they go for it, a nice plate of 410 is waiting :)