
Forum Moderators: Robert Charlton & goodroi


Wrong number of indexed pages with site: operator

12:52 pm on Apr 4, 2012 (gmt 0)

New User

5+ Year Member

joined:Aug 28, 2010
posts: 8
votes: 0

I have a small website with 43 pages... it used to be over 100 pages prior to the October 13 Panda update. I deleted pages that I thought were duplicate or thin and submitted a removal request through Google Webmaster Tools to get them out of Google's index. Google processed my request within a few days, but the indexed page count remained the same.

Now when I do a search using the site: operator, I still see 116 pages indexed. But when I reach the end of the SERP, it says the following:

In order to show you the most relevant results, we have omitted some entries very similar to the 43 already displayed.
If you like, you can repeat the search with the omitted results included.

When I rerun the query with "omitted results included", all I get at the end of the SERPs is links to JavaScript files on my site.

Do you think there is some problem with my website? Why is Google showing me the wrong indexed page count? Do you think this will affect my rankings?

8:43 pm on Apr 4, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
votes: 140

The count you see is an estimate and, depending on the individual site, may be much greater or much less than the number of pages you believe exist.

Much depends on how "relevant" Google believes the URLs are to the particular search query. Site: searches are awkward in this respect, since there is no specific keyword relevance at all.

That's why even searches that are essentially identical can return different results:

site:www.webmasterworld.com [google.co.uk] (I see 497,000 results)

site:www.webmasterworld.com inurl:www [google.co.uk] (I see 591,000 results)

So that's 100k URLs from nowhere! The reason the estimated count goes up is that Google treats the second query as "deeper", so it retrieves results from a broader part of Google's databases - including those that contain "low quality" results, which might be very old, errors, or even deleted pages that hang around.

For small sites, you tend to see the opposite effect - the numbers are low enough for Google just to retrieve everything, so you get all those "low quality" results included in the count straight away.

Overall, though, you are better relying on what you know you've done, rather than worrying too much about the count. Although Google does have a very long memory! ;)
8:49 pm on Apr 4, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
votes: 0

Click on the "show omitted results" link, and then click through to the last page of those results.

It's much easier if the SERPs are showing 100 results per page.
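As a side note, the per-page count can also be set straight from the address bar with the num parameter (undocumented, but widely used at the time), e.g.:

```
https://www.google.com/search?q=site:example.com&num=100
```

example.com here is just a placeholder for your own domain.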

You'll see days where the figures are all over the place and other days where they are more in alignment. You'll see that changes happen in batches.

Google hides pages from the SERPs, but the count remains high for a few days afterwards.
8:40 pm on Apr 6, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
votes: 12

Something to try:
Do your site: search.
Then go to the address bar and edit the URL that Google produced to run the search.
After one of the existing "&parameter=value" strings, add &filter=0. This may show more results.
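If you'd rather not hand-edit the URL each time, here's a minimal sketch in Python that tacks filter=0 onto a results URL copied from the address bar. The URL and domain in the example are placeholders; the filter parameter itself is the one discussed above.

```python
# Sketch: append filter=0 to a Google results URL copied from the
# address bar, so the duplicate filters are disabled on the next load.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def add_filter_off(results_url):
    """Return the same URL with filter=0 added (or overwritten)."""
    parts = urlsplit(results_url)
    params = dict(parse_qsl(parts.query))
    params["filter"] = "0"  # 0 = both duplicate filters disabled
    return urlunsplit(parts._replace(query=urlencode(params)))

# Example (example.com is a placeholder for your own domain):
print(add_filter_off("https://www.google.com/search?q=site:example.com"))
```

Paste the returned URL back into the address bar to rerun the search unfiltered.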

The duplicate directory filter is quite a telltale!

Here's one reference that documents the myriad parameters Google may use in a search

Another with more on filter=

Google search uses two types of automatic filters:

Duplicate Snippet Filter - If multiple documents contain identical titles as well as the same information in their snippets in response to a query, only the most relevant document of that set is displayed in the results.
Duplicate Directory Filter - If there are many results in a single web directory, then only the two most relevant results for that directory are displayed. An output flag indicates that more results are available from that directory.

By default, both of these filters are enabled. You can disable or enable the filters by using the filter parameter settings as shown in the table.

Filter     Duplicate        Duplicate
value      Snippet Filter   Directory Filter
--------   --------------   ----------------
filter=1   Enabled (ON)     Enabled (ON)
filter=0   Disabled (OFF)   Disabled (OFF)
filter=s   Disabled (OFF)   Enabled (ON)
filter=p   Enabled (ON)     Disabled (OFF)

How well these documented values map to what you can do in the address bar? I'm afraid I have no idea.
10:40 pm on Apr 6, 2012 (gmt 0)

New User

5+ Year Member

joined:Aug 28, 2010
posts: 8
votes: 0

Thanks for the help, guys. I think the whole thing is not worth worrying about after all. I submitted a URL removal request through GWT for the JavaScript and .swf files that were triggering the message "...we have omitted some entries very similar to...".

Now the message is gone, but the indexed page count still shows the wrong number, 116. I don't think there is much more I can do after this.
I guess there is some problem with their system or something. I was afraid of being labeled as having duplicate content on my website.
8:33 pm on Apr 9, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 13, 2004
votes: 12

Is there any chance you were using Mod_PageSpeed on the server side to speed up your site?

If you have any kind of link to a .js file, Google may index the file. Mod_PageSpeed plays a trick, "hiding" JavaScript from the browser until after the onload event. Google may be picking up links to these files and, not understanding them, adding them to its index.
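If stray .js files are turning up in the index, one option is to block them via robots.txt. This is a sketch only: the * wildcard and the $ end-of-URL anchor are Googlebot extensions rather than part of the original robots.txt standard, and a blocked URL can still appear in results as a URL-only listing.

```
User-agent: Googlebot
Disallow: /*.js$
```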

I once added a link to a robots.txt file as a training aid. Google immediately indexed the robots.txt! And with Panda, especially the Oct 13th tweak, this could look like "poor quality content"!

Oh no, Mr. Bill!