Forum Moderators: Robert Charlton & goodroi


Google showing 10K pages

G glitch or my problem

         

chunk_split

9:20 pm on Apr 1, 2006 (gmt 0)

10+ Year Member



Checking the Google index for one of my sites with a site:domain query returns 10K pages, although the site only has around 1K. Is this a common glitch?

tedster

9:57 pm on Apr 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is very common for Google to over-report your total number of pages.

However -- and this is especially true if you are having trouble coming up on appropriate searches -- it is a good idea to look through all the urls you can see in the site: query and discover if any urls in there are problematic. That is, look for any urls that do resolve but "shouldn't" because you intended a different url for that content.

daveVk

4:38 am on Apr 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



See this thread [webmasterworld.com]: Google reports sites of over 1000 pages at about 10 times the real value. See the thread for qualifications.

tedster

4:50 am on Apr 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I see some sites at 1000+ pages with nearly correct counts, but others at 3x, 5x, and 10x, even higher.

When I can find a way to tweeze out some detail, I often find that Google has been very creative in getting duplicate, triplicate and higher urls for the same content. Also, the supplemental index often comes into play. If a dynamic website has been redeveloped in the past year or two, look out. However, those supplemental urls are not usually any problem at all if their present time handling on the server is accurate.

Watch out for your 404 handling. If lots of urls should be 404 but in reality are returning a 200, you have a problem that can greatly inflate page count. Also, if you think you have a 301 redirect but really have a 302, then there are often two urls in the Google index where you were aiming for one.
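A minimal sketch of how the status codes mentioned above could be checked, assuming Python's standard library. The names fetch_status and diagnose are hypothetical, and some servers answer HEAD requests differently from GET, so treat the result as a first pass only. The request does not follow redirects, so a 302 is not hidden behind its target.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return (status code, Location header or None) without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose(expected, actual):
    """Describe a mismatch between the intended and the actual status code."""
    if expected == actual:
        return "ok"
    if expected == 404 and actual == 200:
        return "soft 404: missing page answers 200 and can inflate the page count"
    if expected == 301 and actual == 302:
        return "temporary redirect: both urls may stay in the index"
    return f"unexpected status: wanted {expected}, got {actual}"
```

For a url that should no longer exist, call fetch_status on it and pass the returned code to diagnose(404, code); for an intended permanent redirect, use diagnose(301, code).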

daveVk

10:27 am on Apr 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google has gone to the trouble of going to supplementals, then the page count appears to be correct. If not, an inflation factor seems to come into effect, perhaps as a guess at how many it might have found if it went to supps. The following rules from the prior thread are my best guess; "A" is the reported 'about' figure.

1 - SERPS includes supplementals, then total (including sups.) = A.
2 - A < 1001, then total (excluding sups.) = A.
3 - A < about 100000, then total (excluding sups.) = approx( A/10 ).
4 - For large A, inflation factor seems to vary.
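The four rules above can be sketched as a small function. This is only an illustration of the rule of thumb as posted, not a statement of how Google computes the figure; the name estimate_total is hypothetical, and the 100000 cutoff is the "about" threshold from rule 3 taken literally.

```python
def estimate_total(about, includes_supplementals):
    """Apply the four rule-of-thumb cases to the reported 'about' figure A.

    Returns (estimate, counts_supplementals). The estimate is None when
    the rules give no fixed inflation factor (rule 4).
    """
    if includes_supplementals:
        return about, True                 # rule 1: A is taken at face value
    if about < 1001:
        return about, False                # rule 2: small counts are not inflated
    if about < 100_000:
        return round(about / 10), False    # rule 3: roughly tenfold inflation
    return None, False                     # rule 4: factor varies for large A
```

For a reported figure of 10,000 with no supplemental results showing, estimate_total(10000, False) falls under rule 3.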

BillyS

2:06 pm on Apr 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



chunk_split -

I agree with everything that's been said. It's always best to make sure your website isn't producing stray / duplicate pages. That being said, you can still try to figure out what the actual page count is by using negative and positive phrases.

We're not allowed to post actual examples, but here is something to consider.

-whatever site:www.site.tld
whatever site:www.site.tld

If you have a word (whatever) that you know appears on about 20% of your pages, then this will work (given the stated size of your site). If you add the count of these two queries together, you should be able to figure out the approximate page count.
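The reasoning is that every page either contains the word or does not, so the two result counts together should cover the whole site once. A minimal sketch of that arithmetic follows; the function name and the sample counts are hypothetical, chosen to match a 1,000-page site with the word on 20% of its pages.

```python
def estimate_page_count(with_word, without_word):
    """Sum the result counts of the positive and the negative query.

    Each indexed page matches exactly one of the two queries, so the sum
    approximates the number of indexed pages.
    """
    if with_word < 0 or without_word < 0:
        raise ValueError("result counts cannot be negative")
    return with_word + without_word


# Hypothetical counts: 200 pages with the word, 800 without it.
total = estimate_page_count(200, 800)
```

Both counts are still Google's own estimates, so the sum is an approximation rather than an exact page count.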

g1smd

4:10 pm on Apr 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I find it easier to do site:domain.com/folder searches, or alternatively you can try domain.com inurl:foldername and domain.com -inurl:foldername searches.

Do bear in mind that these do NOT work with any punctuation in the folder names being included in the search query, like ~username for example.
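As a sketch, the three query forms could be generated per folder like this. The helper name folder_queries is hypothetical, and rejecting every ASCII punctuation character is an assumption that may be stricter than the search syntax actually requires.

```python
import string


def folder_queries(domain, folder):
    """Build the site: and inurl: queries for one folder of a domain."""
    # Assumption: any ASCII punctuation in the folder name (e.g. "~username")
    # makes the queries unreliable, so such names are rejected outright.
    if any(ch in string.punctuation for ch in folder):
        raise ValueError(f"folder name contains punctuation: {folder!r}")
    return [
        f"site:{domain}/{folder}",
        f"{domain} inurl:{folder}",
        f"{domain} -inurl:{folder}",
    ]
```

Calling folder_queries("domain.com", "foldername") returns the three search strings to paste into Google one at a time.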

chunk_split

7:16 pm on Apr 2, 2006 (gmt 0)

10+ Year Member



BillyS,

I read through the thread linked from post #3, and tried the positive/negative word test.

site:domain.tld = 9,240
widget site:domain.tld = 623
-widget site:domain.tld = 394

Thanks.
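Plugging chunk_split's three figures into the positive/negative arithmetic gives a combined count close to the stated site size of around 1k, and a reported figure roughly nine times larger. The variable names below are only for illustration.

```python
reported = 9240       # site:domain.tld
with_word = 623       # widget site:domain.tld
without_word = 394    # -widget site:domain.tld

combined = with_word + without_word   # pages with the word plus pages without it
inflation = reported / combined       # how far the plain site: figure overshoots

print(f"combined estimate: {combined}")           # 1017
print(f"inflation factor: about {inflation:.1f}x")
```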