Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
"site:" operator results in two different figures
atlantis76




msg:3935994
 2:48 pm on Jun 18, 2009 (gmt 0)

Hi All

When I search for "site:mysite.com" with Google set to return 10 results per page, I get 146 results.

Doing the same search when Google is set to 100 results per page yields only 96 results.

In the corresponding sitemap (which I generated with xml-sitemaps.com) there are 94 pages.

Your thoughts and comments will be appreciated.

And by the way, how do you generate your xml sitemaps?

Thanks!
Assaf

 

Receptional Andy




msg:3936242
 7:14 pm on Jun 18, 2009 (gmt 0)

Google operates both a number of data centres (batches of computers serving up the results) and indexes (different databases of search results) - any of these can show different numbers at any given time, and you can be sent to different data centres depending on what you search for.

Additionally, at the point of search, Google determines how many "relevant" results there are for your query - and even a slight change in the method used to search can change Google's judgement of what is relevant and what isn't. You might also notice that the results are ordered very differently when you view 100 results instead of 10 - it isn't just the first 10 pages put one after the other.

tedster




msg:3936243
 7:14 pm on Jun 18, 2009 (gmt 0)

The site: operator's result numbers can be even more confusing than that. For example, if a site has five directories you can query site:example.com/directory1/ and so on. Then add up the numbers for all five and you may see a much higher total than what you get for site:example.com.

In the list of urls for the directory queries you can see indexed pages that were not returned for the site:example.com query - and they are indexed. The best way I know of to verify a page as being indexed is to type that url directly into the search box. And even then there can be variation at different data centers.

For the webmaster, this can be a game of just getting a close estimate for the total number, and then focusing on individual pages that are important as needed.

tedster




msg:3936245
 7:25 pm on Jun 18, 2009 (gmt 0)

the results are ordered very differently when you view 100 results instead of 10 - it isn't just the first 10 pages put one after the other

What I see happening is Google's clustering filter. Using 100 results, there are ten times as many chances to get two urls from the same domain - so a #2 result can be followed by an indented #3 result - which was actually #99 when you used only 10 results per page.

[edited by: tedster at 9:45 pm (utc) on June 18, 2009]

Receptional Andy




msg:3936316
 9:36 pm on Jun 18, 2009 (gmt 0)

Ted, you're absolutely right for the results I'm seeing currently - filter=0 compared to num=100 do indeed give me the same results order at the moment. I have a nagging feeling I've found other differences at some point in the past though. But my memory isn't what it was ;)

tedster




msg:3936325
 9:54 pm on Jun 18, 2009 (gmt 0)

And by the way, how do you generate your xml sitemaps?

I strongly encourage you to pick your sitemap generator from this list at Google Code:

[code.google.com...]

atlantis76




msg:3937419
 6:24 pm on Jun 20, 2009 (gmt 0)

Thanks all, much appreciated.

tedster, you wrote:

The best way I know of to verify a page as being indexed is to type that url directly into the search box. And even then there can be variation at different data centers.

I'm not sure I fully understood: searching for www.mysite.com or http://www.mysite.com yields not only the pages indexed, but also many other pages that contain these phrases.

Can you plz elaborate?

Thanks in advance!
Assaf

[edited by: tedster at 6:33 pm (utc) on June 20, 2009]
[edit reason] de-link the example [/edit]

tedster




msg:3937426
 6:38 pm on Jun 20, 2009 (gmt 0)

Sure. If you wonder whether any url is indexed, then type or paste it directly into the search box. If that url comes back as one of the results, then it is in the index. It doesn't matter that citations of that url on other pages are also returned. If the exact url you typed in is returned, then it IS indexed and that's what you wanted to know.

Note that I'm not just talking about the domain name "example.com". It works for any url - even deep internal urls.

atlantis76




msg:3937434
 6:55 pm on Jun 20, 2009 (gmt 0)

Thanks tedster

My mistake here: I thought you meant that this search could somehow be used to see how many pages of mysite.com are indexed - in order to get a better approximation than "site:", which gives a different figure in different searches. Hence this thread :)

Assaf.

g1smd




msg:3937570
 2:24 am on Jun 21, 2009 (gmt 0)

Comparing the numbers for the 'total' when looking at page 1 when it is set to 10 results per page, and looking at page 1 when it is set to 100 results per page, is futile.

You'll notice that at 10 results per page, the total changes when you click through to page 2 and changes again when you reach the 'last' page.

I always measure by using 100 results per page, and click through to the last page. I measure twice: with and without &filter=0 appended.
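
For anyone who wants to repeat that two-pass measurement, here is a minimal sketch in Python that builds the two query URLs. The num=100 and filter=0 parameters are the ones mentioned in this thread; the search endpoint, the "q" parameter and the function name site_query_urls are assumptions made for illustration. It only builds the URLs, it does not fetch anything.

```python
from urllib.parse import urlencode

# Assumption: the standard Google search endpoint with its "q" parameter.
SEARCH_URL = "https://www.google.com/search"


def site_query_urls(domain, per_page=100):
    """Return (filtered, unfiltered) search URLs for a site: query.

    "filtered" is the default view; "unfiltered" appends filter=0,
    as described in the post above.
    """
    base = {"q": f"site:{domain}", "num": per_page}
    filtered = f"{SEARCH_URL}?{urlencode(base)}"
    unfiltered = f"{SEARCH_URL}?{urlencode({**base, 'filter': 0})}"
    return filtered, unfiltered


filtered, unfiltered = site_query_urls("example.com")
print(filtered)
print(unfiltered)
```

Open each URL in a browser, click through to the last page, and note the total shown there for each of the two variants.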

Whitey




msg:3937579
 2:41 am on Jun 21, 2009 (gmt 0)

It may help to go to Webmaster Tools and look at the sitemap statistics. There you can see a total for your sitemap URLs and a total "indexed" to compare, rather than using the site: operator.

[edited by: Robert_Charlton at 7:00 am (utc) on June 21, 2009]
[edit reason] fixed formatting [/edit]
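
To have the sitemap side of that comparison at hand locally, a rough sketch like the one below counts the URLs listed in an XML sitemap. It assumes the sitemap follows the sitemaps.org 0.9 schema; the function name sitemap_urls and the sample document are made up for illustration, and the "indexed" total still has to come from Webmaster Tools itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemaps that follow the sitemaps.org 0.9 protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the list of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Hypothetical two-page sitemap, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

urls = sitemap_urls(SAMPLE)
print(len(urls), "URLs listed in the sitemap")
```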

Adam_C




msg:3938774
 1:25 pm on Jun 23, 2009 (gmt 0)

for those that measure on a regular basis - have you noticed any anomalies over the last 4-5 days?

I've seen some unusual variations, including several instances of truncated results, when clicking through to the final page of results and seeing something like

"Results 501 - 522 of about 1,380" with num=100 and filter=0 set

[edited by: Adam_C at 1:36 pm (utc) on June 23, 2009]

g1smd




msg:3939135
 11:14 pm on Jun 23, 2009 (gmt 0)

I am getting a lot of "1 - 70 of about 16" (and similar) on sites that have changed their URL structure and now redirect the old URLs to new. That "16" is the number of URLs in the main index. :)
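
If you log these last-page lines regularly, a small parser makes the two oddities reported here easy to spot. This is only a sketch: it assumes the line was copied from the last page of results, in the text format quoted in the posts above, and the names parse_counts and describe are hypothetical.

```python
import re

# Matches lines such as "Results 501 - 522 of about 1,380"
# or "1 - 70 of about 16" (the "Results" and "about" words are optional).
COUNT_RE = re.compile(
    r"(?:Results\s+)?(\d[\d,]*)\s*-\s*(\d[\d,]*)\s+of\s+(?:about\s+)?(\d[\d,]*)"
)


def parse_counts(line):
    """Return (first, last, estimated_total) as ints, or None if no match."""
    match = COUNT_RE.search(line)
    if match is None:
        return None
    first, last, total = (int(g.replace(",", "")) for g in match.groups())
    return first, last, total


def describe(line):
    """Classify a last-page count line against its own estimate."""
    parsed = parse_counts(line)
    if parsed is None:
        return "unparsed"
    _, last, total = parsed
    if last > total:
        return "more results shown than estimated"
    if last < total:
        return "list ends before the estimate"
    return "consistent"


for line in ("Results 501 - 522 of about 1,380", "1 - 70 of about 16"):
    print(line, "->", describe(line))
```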

tedster




msg:3939242
 5:26 am on Jun 24, 2009 (gmt 0)

Same here. Looks like something has changed with the site: operator reporting and the data is now more inscrutable than ever.

atlantis76




msg:3939355
 9:59 am on Jun 24, 2009 (gmt 0)

Thanks.

Where can I read more about "&filter=0" ?

tedster




msg:3939674
 6:39 pm on Jun 24, 2009 (gmt 0)

A Site Search [webmasterworld.com] will turn up quite a few threads. For example:

[webmasterworld.com...]

evotsi




msg:3939118
 10:34 pm on Jun 23, 2009 (gmt 0)

< moved from another location >

I have a site that previously had 200,000 pages listed in Google's index via site:example.com. These pages were in Google's index for around 5 years. Within a couple of weeks they dropped down to 2000 pages and have remained steady. The pages still in the index however are ranking very well.

Were these pages penalized in some way? The pages were basically manuals/product lists for a manufacturer whose products we carry. As such they were rarely if ever updated, but they were very relevant for someone searching for a particular product.

[edited by: tedster at 7:15 pm (utc) on June 24, 2009]

tedster




msg:3939240
 5:23 am on Jun 24, 2009 (gmt 0)

Are you basing your post on the results you see using the site: operator? I'm seeing this kind of site: operator drop on some very high profile websites and I suspect it signals a recent change in how the numbers are being generated.

How about your server logs - actual visits?

evotsi




msg:3939608
 5:17 pm on Jun 24, 2009 (gmt 0)

That's very encouraging, you have been helpful as always. I looked at our web logs and there weren't any major changes there. Also I did a random sampling on 100 pages using site:www.example.com/somepage.html and 99% pulled up, even though they weren't listed in the results with site:example.com or site:www.example.com. So it appears they are still in Google's index, even though it doesn't show with site:.
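
The spot check described above can be prepared with a few lines of Python. This sketch assumes you already have a list of your own page URLs (from a sitemap or the server logs); the function names sample_site_queries and indexed_share are made up for illustration. It only generates the per-page site: queries, which are then run by hand.

```python
import random
from urllib.parse import urlsplit


def sample_site_queries(urls, k=100, seed=None):
    """Pick up to k URLs at random and turn each into a site: query."""
    rng = random.Random(seed)
    chosen = rng.sample(urls, min(k, len(urls)))
    queries = []
    for url in chosen:
        parts = urlsplit(url)
        # site: takes host plus path, without the scheme.
        queries.append(f"site:{parts.netloc}{parts.path}")
    return queries


def indexed_share(found, checked):
    """Fraction of checked pages that came back in the results."""
    return found / checked if checked else 0.0


# Hypothetical URL list, for illustration only.
pages = [f"http://www.example.com/page{i}.html" for i in range(500)]
for query in sample_site_queries(pages, k=5, seed=1):
    print(query)
```

Tally how many of the sampled queries return the page and pass the two numbers to indexed_share to get the proportion still indexed.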

Adam_C




msg:3940755
 8:53 am on Jun 26, 2009 (gmt 0)

from what I've seen today things seem to have gone back to how they were.

Results seem more trustworthy

Not seeing any more of the previous "Results 501 - 522 of about 1,380" kind of issues

Will keep an eye on this next week

Anyone else got any updates to their situation?

mibrahim




msg:3940992
 2:20 pm on Jun 26, 2009 (gmt 0)

< moved from another location >

Last night, I noticed that the site: operator on images.google.com stopped reporting the number of images found on the site. I tried several sites, as well as google.co.uk; none of them were working, and they are still not working now.

Does anyone know if that's part of an update or a bug?

[edited by: Robert_Charlton at 6:29 pm (utc) on June 26, 2009]

tedster




msg:3941219
 7:58 pm on Jun 26, 2009 (gmt 0)

I don't know for sure, but that looks pretty intentional to me, mibrahim. Maybe it's a temporary thing while back end changes are being coded. Those who use image search for marketing purposes certainly will hope so.

ianevans




msg:3941168
 6:28 pm on Jun 26, 2009 (gmt 0)

< moved from another location >

Anyone else notice that doing a search on google images no longer displays the number of results returned?

Sort of makes it hard to see how many images on a site are indexed.

[edited by: Robert_Charlton at 8:06 pm (utc) on June 26, 2009]

ianevans




msg:3941576
 12:21 pm on Jun 27, 2009 (gmt 0)

Hmm...it appears the numbering has returned.

tedster




msg:3941810
 1:38 am on Jun 28, 2009 (gmt 0)

I can confirm - the site: result numbers have returned to Image Search.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved