I've got a site that shows explosive growth in the "Not Selected" URLs (the green graph on the Index Status->Advanced page). Last year the count started off at about half the number of indexed URLs, then ran almost exactly level with the indexed count for almost 6 months, and then the graph just took off. It is now at twice the number of indexed URLs, with last week showing the biggest weekly jump ever.
What do you guys think can be inferred from this? Is this "just" a huge waste of the crawl budget, or is there something seriously wrong with the site that causes Google to ignore 2/3rds of the URLs it "thinks" it has? I wish they would indicate why the URLs were ignored.
Google's definition of Not Selected:
Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL
There haven't been any URL structure changes during the period reported (although there were some before that), so there should not have been an influx of 301s. Does Google just recall old URLs from time to time and simply add them to the tally? Or invent its own?
Has anyone ever used this info to troubleshoot a site? I would appreciate any insight on this. Thanks!