Forum Moderators: Robert Charlton & goodroi
However -- and this is especially true if you are having trouble coming up on appropriate searches -- it is a good idea to look through all the urls you can see in the site: query and discover if any urls in there are problematic. That is, look for any urls that do resolve but "shouldn't" because you intended a different url for that content.
When I can find a way to tweeze out some detail, I often find that Google has been very creative in collecting duplicate, triplicate and even more urls for the same content. Also, the supplemental index often comes into play. If a dynamic website has been redeveloped in the past year or two, look out. However, those supplemental urls are not usually any problem at all, as long as the server handles them correctly today.
Watch out for your 404 handling. If lots of urls should be 404 but in reality are returning a 200, you have a problem that can greatly inflate page count. Also, if you think you have a 301 redirect but really have a 302, then there are often two urls in the Google index where you were aiming for one.
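For anyone who wants to check this by script rather than by hand, here is a rough Python sketch. It is only an illustration: the function names `fetch_status` and `diagnose` are made up for this post, and it assumes a plain HEAD request gets the same status code a crawler would see, which is not true of every server.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Return (status code, Location header) for url without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so a 301 or 302 is reported as-is.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose(expected, actual):
    """Describe a mismatch between the status you intended and the one served."""
    if expected == actual:
        return "ok"
    if expected == 404 and actual == 200:
        return "missing page answers 200 and can inflate the page count"
    if expected == 301 and actual == 302:
        return "temporary redirect served where a permanent one was intended"
    return f"unexpected status: wanted {expected}, got {actual}"
```

You would call `fetch_status` on a url that should not exist, or on one you believe is redirected, and pass the returned status to `diagnose` together with the code you intended.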
1 - If the SERPs include supplementals, then total (including sups.) = A.
2 - If A < 1001, then total (excluding sups.) = A.
3 - If A < about 100000, then total (excluding sups.) = approx. A/10.
4 - For large A, the inflation factor seems to vary.
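The rules of thumb above can be written out as a small function. This is just a sketch of the heuristic as posted, not anything Google documents; the name `estimate_excluding_supplementals` is made up, and the cutoff of 100000 is only "about" right by the poster's own wording.

```python
def estimate_excluding_supplementals(a):
    """Apply the rules of thumb to A, the reported total including supplementals.

    Returns an estimated count excluding supplementals, or None when A is
    too large for the heuristic to say anything.
    """
    if a < 0:
        raise ValueError("page count cannot be negative")
    if a < 1001:
        # Rule 2: small totals are taken at face value.
        return a
    if a < 100_000:
        # Rule 3: mid-sized totals appear inflated roughly tenfold.
        return round(a / 10)
    # Rule 4: the inflation factor varies, so no estimate is offered.
    return None


print(estimate_excluding_supplementals(800))
print(estimate_excluding_supplementals(25_000))
print(estimate_excluding_supplementals(500_000))
```

Note the jump at the boundary: 1000 stays 1000, while 1001 drops to about 100. That is a property of the heuristic as stated, so treat results near the cutoffs with extra suspicion.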
I agree with everything that's been said. It's always best to make sure your website isn't producing stray / duplicate pages. That being said, you can still try to figure out what the actual page count is by using negative and positive phrases.
We're not allowed to post actual examples, but here is something to consider.
-whatever site:www.site.tld
whatever site:www.site.tld
If you have a word (whatever) that you know appears on about 20% of your pages, then this will work (given the stated size of your site). If you add the count of these two queries together, you should be able to figure out the approximate page count.
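The arithmetic is simple enough, but here it is as a Python sketch. The counts in the example are invented for illustration, and the function names are hypothetical. You read the two numbers off the result pages yourself; nothing here queries Google.

```python
def estimate_page_count(pages_with_term, pages_without_term):
    """Add the counts from the positive and the negative site: query."""
    return pages_with_term + pages_without_term


def term_share(pages_with_term, pages_without_term):
    """Fraction of the estimated total that contains the term."""
    total = estimate_page_count(pages_with_term, pages_without_term)
    return pages_with_term / total if total else 0.0


# Invented example counts: 180 results with the word, 720 without it.
print(estimate_page_count(180, 720))
print(term_share(180, 720))
```

If the share comes out far from what you expected for that word (about 20% in the example above), one of the two counts is probably unreliable and it is worth trying a different word.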