"site:" operator results in two different figures

     

atlantis76

2:48 pm on Jun 18, 2009 (gmt 0)

10+ Year Member



Hi All

When I search for "site:mysite.com" with Google set to return 10 results per page, I get "146" results.

Doing the same search when Google is set to 100 results per page yields only 96 results.

In the corresponding sitemap (which I generated with xml-sitemaps.com) there are 94 pages.

Your thoughts and comments will be appreciated.

And by the way, how do you generate your xml sitemaps?

Thanks!
Assaf

Receptional Andy

7:14 pm on Jun 18, 2009 (gmt 0)



Google operates both a number of data centres (batches of computers serving up the results) and indexes (different databases of search results) - any of these can show different numbers at any given time, and you can be sent to different data centres depending on what you search for.

Additionally, at the point of search, Google determines how many "relevant" results there are for your query - and even a slight change in the method used to search can change Google's judgement of what is relevant and what isn't. You might also notice that the results are ordered very differently when you view 100 results instead of 10 - it isn't just the first 10 pages put one after the other.

tedster

7:14 pm on Jun 18, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The site: operator's result counts can be even more confusing than that. For example, if a site has five directories you can query site:example.com/directory1/ and so on. Then add up the numbers for all five and you may see a much higher total than what you get for site:example.com.

In the list of urls for the directory queries you can see indexed pages that were not returned for the site:example.com query - and they are indexed. The best way I know of to verify a page as being indexed is to type that url directly into the search box. And even then there can be variation at different data centers.

For the webmaster, this can be a game of just getting a close estimate for the total number, and then focusing on individual pages that are important as needed.
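To make the directory comparison concrete, here is a minimal sketch that builds the per-directory site: queries and compares hand-noted counts against the whole-site figure. The function names and the numbers are hypothetical; the counts are whatever you read off the results pages yourself, nothing is fetched.

```python
def directory_queries(domain, directories):
    """Build one site: query per directory, e.g. site:example.com/directory1/."""
    return [f"site:{domain}/{d.strip('/')}/" for d in directories]


def compare_counts(site_total, directory_counts):
    """Return (sum of directory counts, difference from the whole-site count)."""
    summed = sum(directory_counts.values())
    return summed, summed - site_total


# Queries to paste into the search box, one per directory.
for query in directory_queries("example.com", ["directory1", "directory2"]):
    print(query)

# Hypothetical counts noted by hand, not real data.
summed, difference = compare_counts(100, {"directory1": 80, "directory2": 60})
print(summed, difference)
```

A positive difference means the directory queries together reported more urls than the plain site:example.com query did.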

tedster

7:25 pm on Jun 18, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



the results are ordered very differently when you view 100 results instead of 10 - it isn't just the first 10 pages put one after the other

What I see happening is Google's clustering filter. Using 100 results, there are ten times as many chances to get two urls from the same domain - so a #2 result can be followed by an indented #3 result - which was actually #99 when you used only 10 results per page.

[edited by: tedster at 9:45 pm (utc) on June 18, 2009]

Receptional Andy

9:36 pm on Jun 18, 2009 (gmt 0)



Ted, you're absolutely right for the results I'm seeing currently - filter=0 compared to num=100 do indeed give me the same results order at the moment. I have a nagging feeling I've found other differences at some point in the past though. But my memory isn't what it was ;)

tedster

9:54 pm on Jun 18, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



And by the way, how do you generate your xml sitemaps?

I strongly encourage you to pick your sitemap generator from this list at Google Code:

[code.google.com...]

atlantis76

6:24 pm on Jun 20, 2009 (gmt 0)

10+ Year Member



Thanks all, much appreciated.

tedster, you wrote:

The best way I know of to verify a page as being indexed is to type that url directly into the search box. And even then there can be variation at different data centers.

I'm not sure I fully understood: searching for www.mysite.com or http://www.mysite.com yields not only the pages indexed, but also many other pages that contain these phrases.

Can you plz elaborate?

Thanks in advance!
Assaf

[edited by: tedster at 6:33 pm (utc) on June 20, 2009]
[edit reason] de-link the example [/edit]

tedster

6:38 pm on Jun 20, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Sure. If you wonder whether any url is indexed, then type or paste it directly into the search box. If that url comes back as one of the results, then it is in the index. It doesn't matter that citations of that url on other pages are also returned. If the exact url you typed in is returned, then it IS indexed and that's what you wanted to know.

Note that I'm not just talking about the domain name "example.com". It works for any url - even deep internal urls.

atlantis76

6:55 pm on Jun 20, 2009 (gmt 0)

10+ Year Member



Thanks tedster

My mistake here: I thought you meant that this search could somehow be used to see how many pages of mysite.com are indexed - in order to get a better approximation than "site:", which gives a different figure in different searches; hence this thread :)

Assaf.

g1smd

2:24 am on Jun 21, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Comparing the numbers for the 'total' when looking at page 1 when it is set to 10 results per page, and looking at page 1 when it is set to 100 results per page, is futile.

You'll notice that at 10 results per page, the total changes when you click through to page 2 and changes again when you reach the 'last' page.

I always measure by using 100 results per page, and click through to the last page. I measure twice: with and without &filter=0 appended.
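As a rough sketch of the two measurements, the snippet below builds both search urls for a site: query. The num and filter parameters are the ones mentioned in this thread; the endpoint url, the q parameter name and the helper name are my assumptions, so adjust them to whatever your browser's address bar actually shows.

```python
from urllib.parse import urlencode

# Assumed search endpoint; not taken from the thread.
SEARCH_ENDPOINT = "https://www.google.com/search"


def site_query_urls(domain, per_page=100):
    """Return (filtered, unfiltered) search urls for a site: query."""
    base = {"q": f"site:{domain}", "num": per_page}
    filtered = f"{SEARCH_ENDPOINT}?{urlencode(base)}"
    # Same query with filter=0 appended, i.e. the second measurement.
    unfiltered = f"{SEARCH_ENDPOINT}?{urlencode({**base, 'filter': 0})}"
    return filtered, unfiltered


filtered, unfiltered = site_query_urls("example.com")
print(filtered)
print(unfiltered)
```

Open each url, click through to the last page and note the final count for both.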

Whitey

2:41 am on Jun 21, 2009 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



It may help to go to Webmaster Tools and look at the sitemap statistics. There you can see a total for your sitemap URLs and a total "indexed" to compare, rather than relying on the site: operator.
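If you want your own count of sitemap URLs to set beside those figures, a small sketch like this will do, assuming the sitemap follows the standard sitemaps.org format. The function name and the two sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Tiny hypothetical sitemap; in practice read the text from your sitemap.xml.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

urls = sitemap_urls(sample)
print(len(urls))
```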

[edited by: Robert_Charlton at 7:00 am (utc) on June 21, 2009]
[edit reason] fixed formatting [/edit]

Adam_C

1:25 pm on Jun 23, 2009 (gmt 0)

10+ Year Member



for those that measure on a regular basis - have you noticed any anomalies over the last 4-5 days?

I've seen some unusual variations, including several instances of truncated results, when clicking through to the final page of results and seeing something like

"Results 501 - 522 of about 1,380" with num=100 and filter=0 set

[edited by: Adam_C at 1:36 pm (utc) on June 23, 2009]

g1smd

11:14 pm on Jun 23, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I am getting a lot of "1 - 70 of about 16" (and similar) on sites that have changed their URL structure and now redirect the old URLs to new. That "16" is the number of URLs in the main index. :)
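For anyone logging these count lines over time, here is a minimal sketch that parses them and flags the two oddities described in this thread. The regex, function names and labels are my own, and whether a line came from the last page is something you have to tell it.

```python
import re

# Matches e.g. "Results 501 - 522 of about 1,380" or "1 - 70 of about 16".
COUNT_LINE = re.compile(r"(\d[\d,]*)\s*-\s*(\d[\d,]*)\s+of\s+(?:about\s+)?(\d[\d,]*)")


def parse_count_line(text):
    """Return (first, last, estimated_total) as ints, or None if no match."""
    match = COUNT_LINE.search(text)
    if match is None:
        return None
    first, last, total = (int(g.replace(",", "")) for g in match.groups())
    return first, last, total


def classify(text, is_last_page):
    """Label a count line; the labels are illustrative, not official terms."""
    parsed = parse_count_line(text)
    if parsed is None:
        return "unparsed"
    _, last, total = parsed
    if last > total:
        return "listed exceeds estimate"
    if is_last_page and last < total:
        return "list ends short of estimate"
    return "consistent"


print(classify("Results 501 - 522 of about 1,380", is_last_page=True))
print(classify("1 - 70 of about 16", is_last_page=False))
```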

tedster

5:26 am on Jun 24, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Same here. Looks like something has changed with the site: operator reporting and the data is now more inscrutable than ever.

atlantis76

9:59 am on Jun 24, 2009 (gmt 0)

10+ Year Member



Thanks.

Where can I read more about "&filter=0" ?

tedster

6:39 pm on Jun 24, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



A Site Search [webmasterworld.com] will turn up quite a few threads. For example:

[webmasterworld.com...]

evotsi

10:34 pm on Jun 23, 2009 (gmt 0)

5+ Year Member



< moved from another location >

I have a site that previously had 200,000 pages listed in Google's index via site:example.com. These pages were in Google's index for around 5 years. Within a couple of weeks they dropped down to 2000 pages and have remained steady. The pages still in the index however are ranking very well.

Were these pages penalized in some way? The pages were basically manuals/products lists for a manufacturer whose products we carry. As such they weren't really ever updated, however were very relevant for someone searching for a particular product.

[edited by: tedster at 7:15 pm (utc) on June 24, 2009]

tedster

5:23 am on Jun 24, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Are you basing your post on the results you see using the site: operator? I'm seeing this kind of site: operator drop on some very high profile websites and I suspect it signals a recent change in how the numbers are being generated.

How about your server logs - actual visits?

evotsi

5:17 pm on Jun 24, 2009 (gmt 0)

5+ Year Member



That's very encouraging, you have been helpful as always. I looked at our web logs and there weren't any major changes there. Also I did a random sampling on 100 pages using site:www.example.com/somepage.html and 99% pulled up, even though they weren't listed in the results with site:example.com or site:www.example.com. So it appears they are still in Google's index, even though it doesn't show with site:.
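A spot check like this can be sketched as below: pick a random sample from your own URL list, turn each one into a site: query to run by hand, then work out the share that came back. The helper names and the example URLs are hypothetical, and the found/not-found flags are whatever you record yourself.

```python
import random


def sample_site_queries(urls, k=100, seed=None):
    """Pick up to k random urls and turn each into a site: query."""
    rng = random.Random(seed)
    chosen = rng.sample(urls, min(k, len(urls)))
    # Drop the scheme so the query reads site:www.example.com/somepage.html
    return [f"site:{url.split('://', 1)[-1]}" for url in chosen]


def indexed_share(found_flags):
    """Fraction of sampled pages that came back for their site: query."""
    return sum(found_flags) / len(found_flags)


# Hypothetical url list; in practice load it from your sitemap or logs.
urls = [f"http://www.example.com/page{i}.html" for i in range(500)]
queries = sample_site_queries(urls, k=100, seed=1)
print(queries[0])

# Flags recorded by hand after running each query (made-up values here).
print(indexed_share([True] * 99 + [False]))
```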

Adam_C

8:53 am on Jun 26, 2009 (gmt 0)

10+ Year Member



from what I've seen today things seem to have gone back to how they were.

Results seem more trustworthy

Not seeing any more of the previous "Results 501 - 522 of about 1,380" kind of issues

Will keep an eye on this next week

Anyone else got any updates to their situation?

mibrahim

2:20 pm on Jun 26, 2009 (gmt 0)

5+ Year Member



< moved from another location >

Last night, I noticed that the site: operator on images.google.com stopped reporting the number of images found on a site. I tried several sites, as well as google.co.uk; none of them worked, and they are still not working now.

Does anyone know if that's part of an update or a bug?

[edited by: Robert_Charlton at 6:29 pm (utc) on June 26, 2009]

tedster

7:58 pm on Jun 26, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I don't know for sure, but that looks pretty intentional to me, mibrahim. Maybe it's a temporary thing while back end changes are being coded. Those who use image search for marketing purposes certainly will hope so.

ianevans

6:28 pm on Jun 26, 2009 (gmt 0)

10+ Year Member



< moved from another location >

Anyone else notice that doing a search on google images no longer displays the number of results returned?

Sort of makes it hard to see how many images on a site are indexed.

[edited by: Robert_Charlton at 8:06 pm (utc) on June 26, 2009]

ianevans

12:21 pm on Jun 27, 2009 (gmt 0)

10+ Year Member



Hmm...it appears the numbering has returned.

tedster

1:38 am on Jun 28, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I can confirm - the site: result numbers have returned to Image Search.
 
