Marketing_Guy - 4:33 pm on Oct 10, 2011 (gmt 0)
Seeing big changes in indexed page counts today. Had a client site (which was fubar before I took over last month) drop from 16k pages to 2.2k. They only have about 1.8k real pages - their old CMS created 56k-odd pages. I've been 404ing/301ing them as appropriate and the indexed count has been dropping by about 1k per day until this morning, when it jumped down to 2.2k.
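For anyone wondering what "404/301 as appropriate" looks like in practice, here's a rough sketch: old URLs with a real replacement get a 301 to it, everything else gets a 404. The paths and the `resolve_legacy_url` name are made up for illustration, not taken from the client's actual setup.

```python
# Hypothetical map of old CMS URLs to their live replacements.
# These paths are illustrative only, not from the real site.
REDIRECTS = {
    "/old-cms/page?id=12": "/products/widget/",
    "/old-cms/page?id=13": "/products/gadget/",
}


def resolve_legacy_url(path):
    """Return (status, target) for a URL generated by the old CMS.

    301 with a target if a real replacement page exists,
    404 with no target if the old URL was just junk.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None
```

In a real setup the same decision would more likely live in server rewrite rules, but the logic is the same.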
A site I own was hacked a while back (I didn't notice - not really paying attention to it) - 22k pages of spam content were uploaded, taking the total indexed to over 50k. Deleted the spam folder on Friday and they're all out of the index today.
A third site is a WP blog which had loads of /tags pages indexed. I blocked them via robots.txt when Panda first hit earlier in the year (international rollout) - but they stayed in the index until this morning.
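Worth noting that robots.txt only stops crawling - it doesn't remove URLs that are already indexed, which may be part of why the tag pages hung around so long. Here's a quick way to sanity-check that a rule actually blocks what you think it blocks, using Python's standard `urllib.robotparser`. The robots.txt content and the sample paths are assumptions for illustration, not the blog's real file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules; the real file isn't shown in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tags
"""


def is_crawlable(path, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are hypothetical.
    for path in ("/tags/seo/", "/2011/10/some-post/"):
        print(path, "crawlable" if is_crawlable(path) else "blocked")
```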
Also seen a former client's site drop from 850k pages to 160k (IMO they should be around the 350k mark but they replaced me with a moron so /shrug). There was loads of thin content which looks like it has been blitzed.
Just speculating, but Bing have moved to a "we're only indexing good content - not every piece of content" approach to search. Maybe Google are following suit? Seems like a fairly significant clean-up of junk URLs IMO.