Oh man. While researching a pagination issue, I realized that roughly 80% of my content-rich Q&A website has apparently been made invisible to Google, and I need help fixing it.
Here's what happened: Each of my site's Q&As is with an expert in a different field, and I used to publish each one in a single-page view, with all Q/A content on one page. Because Google was indexing a few variants of the same URL, I implemented the rel=canonical attribute, so that my Opera Singer Q&A code included:
<link rel="canonical" href="http://mysite.com/opera-singer"/>
So far, so good. But last summer I decided to paginate the Q&As, with 7 Q/As on each page, b/c some of them had 100+ Qs and took too long to load. So the paginated URL structure became:
Page 1: http://mysite.com/opera-singer
Page 2: http://mysite.com/opera-singer/2
Page 3: http://mysite.com/opera-singer/3
etc, etc.
Here's the megaFAIL that I just discovered: the same canonical link above (pointing to http://mysite.com/opera-singer -- page 1 of the now-paginated Q&A) was replicated to each /2, /3, /4 paginated page(!) So correct me if I'm wrong, but I believe the net effect was that I was literally telling Google's spider to IGNORE the Q&A content that wasn't on page 1 of the now-paginated Q&A...so 80% of my content-rich site is essentially invisible to Google(!) The "proof" is that if I search Google for any text string on page 1 of a Q&A, it finds it, but it does NOT find any text string on pages 2+.
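To make the mistake concrete, here's a rough sketch of the difference between what my pages were emitting and what I think they should emit. It's in Python purely for illustration; the function names (page_url, canonical_tag_buggy, canonical_tag_fixed) are made up for this post and are not my actual site code.

```python
BASE = "http://mysite.com"  # site root, as in the URLs above


def page_url(slug: str, page: int) -> str:
    """URL of one paginated Q&A page; page 1 has no numeric suffix."""
    return f"{BASE}/{slug}" if page == 1 else f"{BASE}/{slug}/{page}"


def canonical_tag_buggy(slug: str, page: int) -> str:
    # The mistake: the page number is ignored, so every paginated
    # page declares page 1 as its canonical URL.
    return f'<link rel="canonical" href="{page_url(slug, 1)}"/>'


def canonical_tag_fixed(slug: str, page: int) -> str:
    # Each paginated page points at its own URL instead.
    return f'<link rel="canonical" href="{page_url(slug, page)}"/>'


if __name__ == "__main__":
    for page in (1, 2, 3):
        print(page, canonical_tag_buggy("opera-singer", page))
        print(page, canonical_tag_fixed("opera-singer", page))
```

With the buggy version, pages 2 and 3 print the same tag as page 1; with the fixed version, each page prints a tag with its own URL.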
So now for my questions:
1) Is my interpretation of what's happening and why correct? IOW, did my (mis)use of the rel=canonical attribute tell Google to ignore all content not on paginated page 1?
2) I changed the code for each paginated page to reflect the new paginated URL structure...so URL http://mysite.com/opera-singer/2 now uses <link rel="canonical" href="http://mysite.com/opera-singer/2"/>, and so on...and then I went to Webmaster Tools (WMT) and forced a recrawl of the entire site. Is there anything else I can do? I'm a little concerned because it's been 5 days, and Google still can't find any of my content that's NOT on the first page of a Q&A. Is it possible that simply changing the canonical links as described and recrawling will NOT work because Google has become permanently 'blind' to those pages, since for the last 6 months I had been telling it to ignore them? Is it 'smart' enough to realize 'Hey, I'm no longer being told to ignore this content, and wow, it's entirely new and different from what this page's canonical link had previously been pointing me to'?
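For anyone who wants to double-check a fix like this before waiting on Google, here's a minimal sketch of the sanity check I have in mind: parse a page's HTML and confirm that its canonical link points at the page's own URL. It uses only Python's standard html.parser module; CanonicalFinder and is_self_canonical are hypothetical names for this post, and the HTML strings are hand-written samples, not fetched from my site.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link .../> tags also arrive here, because the
        # default handle_startendtag calls handle_starttag.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.hrefs.append(attr_map.get("href"))


def is_self_canonical(html: str, url: str) -> bool:
    """True if the page has exactly one canonical link and it equals url."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs == [url]


if __name__ == "__main__":
    page2 = "http://mysite.com/opera-singer/2"
    fixed = f'<html><head><link rel="canonical" href="{page2}"/></head></html>'
    buggy = ('<html><head><link rel="canonical" '
             'href="http://mysite.com/opera-singer"/></head></html>')
    print(is_self_canonical(fixed, page2))
    print(is_self_canonical(buggy, page2))
```

This only confirms what the pages themselves declare; it says nothing about when or whether Google will re-index them.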