DO NOT use the Removal Tool, because it will not remove your pages from Google's database. The only effect is that your pages won't show up in the SERPs for 180 days (or maybe just 90 days - G still seems to be unclear about the exact time span). After that period, all the unwanted stuff will happily reappear.
Two choices here (minimal sketches for both follow below):
a) Return a 410 (Gone) status code and keep your fingers crossed that Googlebot might find the time to look at those ancient URLs again.
b) Use a "Disallow: /outdated_stuff/" in your robots.txt and again keep your fingers crossed ...
Either way, stay away from the Removal Tool!
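For anyone who wants to try either option, here are rough sketches. For a) I'm assuming an Apache server with mod_alias loaded; /outdated_stuff/ is just the placeholder path from above:

# .htaccess - answer all requests under the old directory with 410 Gone
Redirect gone /outdated_stuff

For b), the matching robots.txt entry would be:

# robots.txt - ask all compliant crawlers to stay out of the old directory
User-agent: *
Disallow: /outdated_stuff/

Note the difference: the Redirect line actually returns a 410 to bots and visitors alike, while the robots.txt line only asks crawlers to stay away - it doesn't remove anything by itself.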
IMHO G has a real problem here. There doesn't seem to be any practical way to effectively tell them to delete outdated / unwanted stuff from their database. Once they've stored something, they'll keep it - until THEY decide to delete it.
Why not use a custom 404 error page to capture all that traffic and do something with it?
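If you did want to go that route on an Apache server, a single .htaccess line is enough (the page name here is a made-up example):

# .htaccess - serve a custom page for every 404, e.g. one that logs the requested URL
ErrorDocument 404 /custom_404.php

Using a script rather than a static page lets you record which dead URLs are still being requested, or point visitors back at the live content.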
I don't think the pages receive much traffic - they're all in the Supplementary Results.
And I'd rather have them out of the index entirely, because I'm not sure whether Google sees them as duplicate content in some way that's affecting the rest of the site.
I've decided to recreate just the directory, use Petrocelli's option b), and hope for the best. We'll see.
I have often wondered whether, after using the Removal Tool, the items you removed still have an effect on your ranking.
You are correct: once Google grabs hold of something, it never seems to let go. I have a couple of directories on our site with inflated page counts in Google. The directories contain 200 files, yet Google reports that they contain 5,000. I have decided to move the directories to a new name and 301-redirect the old name to the new one (sketch below). This seems to be helping get the inflated page counts back in line.
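On Apache, that move is one mod_alias line - a sketch, with made-up directory names and domain:

# .htaccess - permanently redirect the old directory to its new name
Redirect permanent /old_dir http://www.example.com/new_dir

With a 301, any bot that still requests the old URLs is told the content has moved for good, so the stale entries should eventually be replaced rather than counted twice.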
Curious, does anyone have any good ideas on how to get the page count back in line when using the "site:" command?
|And I'd rather have them out of the index entirely |
Return a 403 when Google requests those pages.
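If the pages still physically exist and you only want to block Googlebot, one way on Apache (assuming mod_rewrite is available, and reusing the placeholder directory from earlier) is:

# .htaccess - answer Googlebot's requests for the old directory with 403 Forbidden
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^outdated_stuff/ - [F]

The [F] flag makes Apache return a 403; swapping it for [G] would return the 410 discussed above instead.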
|Return a 403 when Google requests those pages |
Yes? Do you think that will help more than a 404? Could you explain why? I think part of the problem is that Google is simply not requesting the pages at all, because the cache date is from last year.
I am curious about why Google no longer requests pages... When using the "site:" command, it says that there are approximately 80,000 pages found for our site. Our site has no more than 20,000 pages. Where are the other 60,000 pages coming from? I tried to ask Google about this but did not get a clear response - only that index and page counts can change at any given time.
I noticed you are answering questions on other threads... Can you shed a little light on why reported page counts are so far off from actual page counts on a given site? Also, what is the best way to get pages out of the Google database? It seems that once Google grabs hold, it never lets go.
All thoughts would be appreciated.