Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Removing a large set of pages from Google's index
morpheus83

10+ Year Member



 
Msg#: 4652967 posted 11:38 am on Mar 11, 2014 (gmt 0)

I changed the publishing platform of a fairly large blog from Movable Type to WordPress.

Movable Type paginated using query strings:
mywebsite.com/index.php?page=23

WordPress, however, does the same with path segments:
mywebsite.com/page/23/

As a result, Google crawled thousands of URLs that combine the two schemes:
mywebsite.com/page/23/?page=1
mywebsite.com/page/23/?page=23
To get rid of these pages, I created a rule in .htaccess that returns a 404 for every URL with "page" in the query string.

I can see a lot of crawl errors for these URLs in my Webmaster Tools. Now I want to get these pages removed from Google's index.
So what should I do: mark them as fixed so Google crawls them again and eventually drops them, or
just ignore the errors and let them drop out on their own?

 

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 4652967 posted 12:51 pm on Mar 11, 2014 (gmt 0)

I would use mod_rewrite to 301 redirect all requests with a page= parameter in the query string to the canonical WordPress URL.

morpheus83

10+ Year Member



 
Msg#: 4652967 posted 1:01 pm on Mar 11, 2014 (gmt 0)

You mean redirecting mywebsite.com/?page=12 to mywebsite.com/page/12/ ?

Well, I did that for a year with .htaccess, but a site: search on Google still returned results containing the query string.

So I resorted to 404.

JD_Toims

WebmasterWorld Senior Member Top Contributors Of The Month



 
Msg#: 4652967 posted 2:31 pm on Mar 11, 2014 (gmt 0)

I think phranque means redirect:

mywebsite.com/page/23/?page=23

to

mywebsite.com/page/23/

And you can't remove a redirect after a year, or 5, or 10, and expect any major search engine to stop spidering the old URLs. URLs are often reverted or reused after a 301 is put in place, so the engines keep checking periodically to make sure the redirect is still there.
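
For illustration, the mapping being discussed (old query-string URL to canonical WordPress URL) could be sketched roughly as below. This is a minimal sketch in Python of the redirect logic only, not the mod_rewrite rule itself. The function name `canonical_url`, the `http://` scheme on the example URLs, and the choice to keep unrelated query parameters are all assumptions made for the example and do not come from the thread.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def canonical_url(url):
    """Hypothetical helper: return the URL a 301 should point to,
    or None when the URL has no page= parameter and needs no redirect."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    page_values = [value for key, value in query if key == "page"]
    if not page_values:
        return None  # nothing to fix

    # Assumption: unrelated parameters are preserved; only page= is dropped.
    remaining = urlencode([(k, v) for k, v in query if k != "page"])

    path = parts.path or "/"
    if re.fullmatch(r"/page/\d+/?", path):
        # Already a WordPress pagination path: the path wins and the
        # stray page= parameter is simply discarded.
        new_path = path if path.endswith("/") else path + "/"
    elif path in ("/", "/index.php") and page_values[0].isdigit():
        # Old Movable Type style: the page number comes from the query string.
        new_path = f"/page/{page_values[0]}/"
    else:
        # Assumption: on any other path, strip page= and keep the path as-is.
        new_path = path

    return urlunsplit((parts.scheme, parts.netloc, new_path, remaining, ""))


if __name__ == "__main__":
    for example in (
        "http://mywebsite.com/page/23/?page=23",
        "http://mywebsite.com/index.php?page=23",
        "http://mywebsite.com/page/23/",
    ):
        print(example, "->", canonical_url(example))
```

Under these assumptions, both the mixed URLs (mywebsite.com/page/23/?page=1) and the old Movable Type URLs (mywebsite.com/index.php?page=23) resolve to a single clean /page/N/ address, which is the target the 301 would need to point at.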

dethfire

5+ Year Member



 
Msg#: 4652967 posted 2:32 pm on Mar 11, 2014 (gmt 0)

Removing pages from Google takes a very long time. I've had thousands of pages go noindex and two months later, they are still in the index.

kawen



 
Msg#: 4652967 posted 1:41 pm on Mar 12, 2014 (gmt 0)

It is difficult to remove pages from Google's index. Googlebot remembers every URL and never forgets.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved