We have several old URLs (approximately 200) that are currently issuing 301 redirects, but we are considering using the Google Webmaster Tools "Remove URL" tool. We can easily switch these pages from issuing 301s to 404s, as instructed in Webmaster Tools; however, we would like to hear about others' experiences with this tool.
We do not want any mistakes or whole directories removed, just several individual pages that are no longer in use. The 301s are taking a little too long, and we thought this might be a cleaner way of doing things now and in the future. However, our worst fear is that we submit an individual page only to have the tool remove that page plus the whole directory above it.
Has anyone used this tool for individual URLs/pages, and with what success? Or would this be an inadvisable way of removing individual URLs?
If it is not recommended, should we stick with the 301's or do you think switching them to 404's would be more advisable?
Thank you for any responses and your time,
Are these urls still showing in the search results? That should not be the case if your 301 redirects are working properly and the server is sending a 301 status in the http header.
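One way to confirm what the server really sends is to request an old URL without following the redirect. The following is only a minimal sketch: `fetch_status` and `describe` are names made up for illustration, and any path passed to them would be a placeholder, since the thread does not list the real URLs.

```python
import http.client


def fetch_status(host, path, use_https=False):
    """Request `path` on `host` and return (status, Location header).

    http.client does not follow redirects, so a 301 is reported as 301
    rather than as the status of the page it points to.
    """
    conn_cls = http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    conn = conn_cls(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location=None):
    """Summarize what a crawler would see for one response."""
    if status == 301:
        if location:
            return f"permanent redirect to {location}"
        return "301 without a Location header"
    if status == 302:
        return "temporary redirect (not a 301)"
    if status == 404:
        return "not found"
    if status == 410:
        return "gone"
    if status == 200:
        return "still serving content"
    return f"other status {status}"
```

Calling `describe(*fetch_status("www.example.com", "/some/old/page.html"))` (a hypothetical path) would then report whether the old URL answers with a true 301, a 302, or something else.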
Another question is whether the urls involved are actually redirected to good content - content that is actually a viable replacement for the original content. If so, leave the 301 redirect. If not, change to 404.
In either case, the url should fall from the Google search results without you doing anything. Google may have been a bit slow on automated removal of 301 and 404 urls recently, since they were involved in an apparently intensive change to the algo.
Let me try to explain the mess we have made for ourselves and why we implemented 301s; hopefully this will answer your questions as well.
First off, we have an old site (about 9 years old) that contains both a content side and a store side. On the store side, we decided to make some adjustments to the “flow”, and it is something we now regret.
What we had for approximately 8 years and was doing just fine:
What we changed to:
The 2 new ones you see were now above all the old directories, which were nicely ranked and quite well placed. We were very stupid and did not think through the implications of what seemed like such a small change. Really, it was quite huge: we had just completely redefined the flow of the store. *ouch* So the bot rearranged all the old categories under these new pages, which had no rank, and everything below them suffered. We waited and waited... and waited, and it seemed that things were just not going to recover; in fact, they were getting worse. The more the bot realized was now under these new pages, the more things sank.
We thought that the new way would make more sense to site visitors. As it turns out, it really didn’t make any improvement for them. We asked many of them, and the general opinion was that the original way was just fine.
So we decided we no longer wanted the new top level departments and brands, and wanted to revert to our old way, which ranked perfectly fine.
So we took:
and 301 redirected them all to:
This seemed the most logical place to redirect them to. Maybe we should have just done a 404 on all of them....
To your questions:
1: Are they showing in the search results? Well, they never did all that well because they were new, so the very few we did find are either falling or dropping out. I don’t think the bot has done a deep enough crawl yet to figure out that all these pages are now 301s. They are still in the “site:example.com” results, and so far only a small handful of the approximately 250 have in fact disappeared.
2: Are they showing 301 status in the http header? Yup, sure are. ;)
3: We are redirecting them all to www.example.com/store.html. There are approximately 250 of these department and brand pages that we are letting go of.
One thing to know: the instant we reverted to the old way, we started seeing some sales flowing back in again. We were feeling quite encouraged. The new departments and brands started falling in the “site:example.com” results, and the old ones moved back up in the ranks. Then, about a week or so after that, those darn top level departments and brands started creeping back up in the “site:example.com” results and pushing the old ones back down! At the same time, sales started tapering off again.
This is why I am writing and asking whether we should be doing 404s, or even requesting the deletion of these pages in Webmaster Tools. We sure made a mess, and we are well aware of it. We know the bot must be utterly confused by it, so we want to figure out the best way to release the older (good) pages from the wrath of these new top level departments and brands pages that we have 301’ed. They seem to be holding the old pages down, and until the bot figures out what is attached to what, I don’t think we can start moving forward again. As long as the bot seems to think they exist, we seem to have a problem.
Lesson learned for us, never move nicely ranked and nicely positioned pages BELOW new pages. All the good stuff seems to have lost its “oomph”!
I hope this all makes sense... and thank you again.
P.S. This has been going on for about the last 3 months and the content side is OK.
Bet you didn't expect a novel back. ;)
[edited by: tedster at 10:21 pm (utc) on July 22, 2009]
[edit reason] switch to example.com - it can never be owned [/edit]
When you make a previous url go 404 or 410, googlebot will continue to request it (with decreasing frequency) just to be sure you haven't changed your mind. If that is problematic for you, you can always disallow the pattern in your robots.txt file.
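As a rough illustration of the robots.txt option, the sketch below checks a disallow rule with Python's standard `urllib.robotparser`. The `/store/departments/` pattern and the test URLs are hypothetical, since the thread does not give the real paths, and `is_allowed` is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: the real paths of the removed pages are not given in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /store/departments/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A removed page under the disallowed pattern is blocked...
blocked = is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/store/departments/widgets.html")
# ...while the main store page stays crawlable.
open_page = is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/store.html")
```

Note that a crawler obeying this rule never requests the URL, so it also never sees the 404 or 410 response for it.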
Lesson learned for us, never move nicely ranked and nicely positioned pages
I'm with you! And I can tell you that, unfortunately, many other businesses have also learned this lesson the hard way.
It sounds like what you did was disrupt the internal link flow - is that correct?
Yes, that is 100% correct. The link flow for the store was completely disrupted from this change. In our heads before we did it, it didn't seem so disruptive. Hindsight says, "good lord, what were we thinking..."
Ok, so you suggest 404 or 410. Even if the bot requests the page, as long as it is a 404 or 410 I am assuming it will stop being part of the disruption? As soon as it knows it is a 404 or 410 maybe it will just be set aside so the rest of the store can sort out? Am I understanding that correctly? I don't want them to linger around as top level 404 pages and just hold the rest of the store under their wrath.
Our logic was that if they were 301s, the bot would hit each one, follow the redirect and dump the old URL. A 404 concerned us because we expected a longer waiting period, but we know we could be totally wrong. After all, look at the mess we just made!
I hope maybe at least one business out there reads this before making such a change so they don't have to also learn this lesson the hard way like us and many others. We feel so stupid.
P.S. Have businesses been known to recover from such link flow disruption? When they fixed the problem, were they able to get their standings back? The bot liked our store before we made the mess, so I am hoping it will like it again once it's all fixed.
[edited by: KrisE at 10:56 pm (utc) on July 22, 2009]
Ok, so we just decide on 404 vs 410. 410 sounds like it might be more logical.
Thank you. :)
Spidering these days is often not done by the old-fashioned "crawl the links" approach. Instead, the crawl team has an algo that assigns a "crawl budget" as well as a list of previously identified urls to request.
It is also possible to knock a url into the supplemental index if you remove too much of its internal backlink support. Then it would see much less frequent spidering, because its indexing had changed.
[edited by: tedster at 11:18 pm (utc) on July 22, 2009]
Switching the 301s to logical 301s, or changing them completely to 404s or 410s, is just a way to get rid of the pages at some point.
A huge chunk of these pages are still cached from before we switched back, so my guess is that the bot just hasn't spidered enough yet.
Right now it has been nearly a month, and 90% of the pages that were 301s and are now 404s are sitting high in the site:example.com/ results.
In the past we have seen 301s or 404s drop to the bottom of those results pretty quickly, but not this time. Instead, each day we just see more of them moving up in that "site" result, which suggests the bot hasn't even found that they have changed.
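The split described here (a 301 only where a sensible replacement page exists, 410 for everything else that was removed) could be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual setup: every path in `REDIRECTS` and `GONE` is hypothetical, and `respond` is a made-up helper name.

```python
# Hypothetical old URLs that have a viable replacement page.
REDIRECTS = {
    "/store/brands/acme.html": "/store/acme-widgets.html",
}

# Hypothetical old URLs with no sensible replacement.
GONE = {
    "/store/departments/garden.html",
}


def respond(path):
    """Return (status, headers) for an obsolete path, or None for any other path."""
    if path in REDIRECTS:
        # Permanent redirect to the page that replaces the old one.
        return 301, {"Location": REDIRECTS[path]}
    if path in GONE:
        # Deliberately removed, with nothing to redirect to.
        return 410, {}
    # Not an obsolete URL: leave it to the site's normal handling.
    return None
```

Keeping the two lists separate makes it easy to review which old URLs really have a replacement and which were only redirected for lack of a better idea.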
The sooner we can help the bot to realize these are gone so it can calculate the correct results, the better.
BTW, a general method for choosing the correct response for an obsolete URL is:
If a 301 is used, then plan on leaving it in place forever. It only "works" to preserve the PageRank/Link-popularity of the old URL --and to re-capture the old URL's link and bookmark traffic-- for as long as it's still in place.
Web site URL 'systems' should be designed, and not left to develop haphazardly. Spend ten times as long designing the URL-structure of your site and the underlying directory structure (not necessarily even similar to the URL-structure) as you do implementing redirect/rewrite code, and that should be about right. Search engines 'hate it' when URLs are changed or removed [w3.org]; they see the Web as a library with only a very-slowly-changing inventory, not as a corner newspaper/magazine stand with "contents updated daily."
I agree with you 100%, and for several years our URL system had stayed relatively the same, with only a few minor changes. This was the largest and stupidest change we had made, which is why we are trying to return to our old URL system.
I hope once it sees we have returned to the old way, things will smooth out again.
I do wonder though... If there is no way to naturally get to any of these URLs through our site anymore, (the 404's and 301's), how will the bot know to fix them? Are we relying on data the bot has stored and it eventually taps those old pages only to find they are gone or moved?
I only ask because we were told a few days ago (by a local neighbor who also has an online business) that we should have left up a path for those old links to be followed and seen as 404s or 301s. According to him, removing that path is making the process that much longer.
I don't want to act on that suggestion unless you guys at WebmasterWorld in fact agree with such a statement.
Thanks again for the help. It has been greatly appreciated.
We just are so eager to see them drop out of the "site:example.com" results but it's obviously coming down to patience now.
We wanted to make sure we had everything in place properly according to the advice, and I think we are set up now, so we just pray to the Google gods, do a few rain dances and wait it out.
I thought I was done asking questions under this post, but a colleague asked me to pose this final one. ;)
For the 301s that are still intact (because it was obvious which page to point them to), which PR will the bot eventually take?
As explained above, we introduced top level departments to our store which threw everything below into hell. So, we have removed them and used 301 on many of them that made sense.
The worry: the pages that we are removing had 0 PR. The old pages that they are now pointing to have 2-5 PR. I hope this doesn't mean that because we 301'ed 0 PR pages to 2-5 PR pages, the 2-5 PR pages will switch to 0 PR!
That was a tongue twister to even type...
We didn't change the whole site. We only added about 250 pages to our store that were above the previous pages. Basically, it was a new click flow we thought would be more logical. It just ended up causing us havoc by pushing all the ranking pages lower in the click flow. So it's not like we changed the URL structure of the whole site.
We have never had any problems with .html's. The whole rest of the site is indexed fine. The store was fine too until we added those 250 pages above the old pages.
So hopefully I didn't give the impression it was the whole site. We have about 40,000 pages. 250 were the new ones that caused problems, and it's those 250 we are trying to remove to get back to our old click flow. Anything below those 250 is now in a pitiful limbo. The rest of the pages, which have nothing to do with the store, are just fine.
It's sad because the site was very well thought out. Somehow we all had one massive unanimous brain fart and made a stupid mistake. If only we had made this mistake on an area of the site that isn't related to making money!
I wanted to thank you both for your great advice. We implemented the suggestions and sat tight and, sure enough, everything is falling back into place. The pages we pushed lower by changing our click path are returning to position, while the 301s are falling out of the index and the "site:" results. We are very pleased. Hopefully the trend will continue, but I wanted to update you guys, as you did help keep us focused rather than making an even bigger mess.
I am glad I came here to ask my questions!