
Forum Moderators: Robert Charlton & goodroi


Removing a large set of pages from Google's index

     
11:38 am on Mar 11, 2014 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 8, 2003
posts: 516
votes: 4


I moved a fairly large blog from Movable Type to WordPress.

Movable Type paginated with a query string:
mywebsite.com/index.php?page=23

WordPress paginates with a path instead:
mywebsite.com/page/23/

As a result, Google crawled thousands of URLs that combine both forms:
mywebsite.com/page/23/?page=1
mywebsite.com/page/23/?page=23
To get rid of these pages, I created a rule in .htaccess that returns a 404 for every URL with a "page" query parameter.

I can see a lot of crawl errors for these URLs in Webmaster Tools. Now I want them removed from Google's index.
So what should I do: mark them as fixed so that Google crawls them again and eventually drops them, or
just ignore the errors and let them drop out on their own?
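
A rule of the kind described above might look like the following sketch. It is not the poster's actual rule, which is not shown in the thread. It assumes Apache with mod_rewrite enabled and an .htaccess file in the site root, placed above the WordPress rewrite block. Matching only numeric values is an assumption, chosen so that admin URLs such as wp-admin/admin.php?page=some-plugin are left alone.

```apache
RewriteEngine On

# Leave the WordPress admin area untouched (assumes the default /wp-admin/ path).
RewriteCond %{REQUEST_URI} !^/wp-admin/
# Match a numeric page= parameter anywhere in the query string.
RewriteCond %{QUERY_STRING} (^|&)page=[0-9]+(&|$)
# "-" means no substitution; R=404 makes Apache answer 404 Not Found.
RewriteRule ^ - [R=404,L]
```

Replacing R=404 with the G flag would return 410 Gone instead of 404.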
12:51 pm on Mar 11, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11029
votes: 94


I would use mod_rewrite to 301 redirect all requests with a page= parameter in the query string to the canonical WordPress URL.
1:01 pm on Mar 11, 2014 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 8, 2003
posts: 516
votes: 4


You mean redirecting mywebsite.com/?page=12 to mywebsite.com/page/12/?

Well, I did that for a year with .htaccess, but a site: search on Google still returned URLs with the question mark and query string.

So I resorted to a 404.
2:31 pm on Mar 11, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:July 19, 2013
posts:1097
votes: 0


I think phranque means redirect:

mywebsite.com/page/23/?page=23

to

mywebsite.com/page/23/

And you can't remove a redirect after a year, or 5, or 10, and expect any major search engine to stop spidering the old URLs. URLs are often reverted or reused after a 301 is in place, so the engines keep rechecking periodically to make sure the redirect is still there.
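
The redirect described above (mywebsite.com/page/23/?page=23 to mywebsite.com/page/23/) could be sketched roughly as follows. This is only an illustration, not a rule taken from the thread. It assumes Apache with mod_rewrite enabled and an .htaccess file in the site root, with these lines placed above the WordPress rewrite block so they run first.

```apache
RewriteEngine On

# Only act when the query string carries a numeric page= parameter.
RewriteCond %{QUERY_STRING} (^|&)page=[0-9]+(&|$)
# /page/23/?page=23 -> /page/23/ ; the trailing "?" drops the query string.
RewriteRule ^page/([0-9]+)/?$ /page/$1/? [R=301,L]
```

Note that the trailing "?" discards the whole query string, including any other parameters. On Apache 2.4 the QSD flag does the same job.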
2:32 pm on Mar 11, 2014 (gmt 0)

Full Member

10+ Year Member Top Contributors Of The Month

joined:June 19, 2005
posts: 335
votes: 5


Removing pages from Google takes a very long time. I've had thousands of pages set to noindex, and two months later they are still in the index.
1:41 pm on Mar 12, 2014 (gmt 0)

New User

5+ Year Member

joined:Mar 13, 2011
posts:6
votes: 0


It is difficult to remove a page from Google's index. Googlebot remembers every URL and never forgets.
 
