Using site:example.com there were "about 683 pages." But clicking through to the last page, there were only "381 pages" reported. Many of us have been there, I know. Even after clicking on the "omitted results" link, we get nowhere near the total we hoped for, or indeed the total that seemed to be promised on page 1.
So having collected those 381 urls, I was a bit frustrated -- 683 was a lot closer to the reality of the website that I knew. So I decided to use the site: operator directory by directory -- and that way I actually got Google to report almost every one of the original 683 urls!
To put a fine point on this, when I used the query site:example.com, there were only 181 urls returned from directory-a. But when I used site:example.com/directory-a/ 341 urls were returned - an additional 160 urls all in directory-a.
By doing this for every directory in the domain, I managed to find almost all the "missing" urls. Some of them even had decent PR and backlinks. It was good to know that they were really in the index. Two of them even show up in AOL Search, so I assume they must be in the regular index.
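For anyone who wants to repeat the directory-by-directory check, here is a minimal Python sketch of the bookkeeping side only. It does not query Google; it assumes you paste in the urls you collected by hand from each site: query. The function names and sample urls are hypothetical, and it only looks at first-level directories of fully qualified urls (scheme included).

```python
from urllib.parse import urlsplit


def first_level_directories(urls):
    """Return sorted 'host/first-directory/' prefixes found in the known urls."""
    dirs = set()
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        # A url counts as being inside a directory if something follows the
        # first path segment, or if the path itself ends with a slash.
        in_directory = len(segments) >= 2 or (
            len(segments) == 1 and parts.path.endswith("/")
        )
        if in_directory:
            dirs.add(f"{parts.netloc}/{segments[0]}/")
    return sorted(dirs)


def site_queries(domain, directories):
    """Build the domain-wide query followed by one query per directory."""
    return [f"site:{domain}"] + [f"site:{d}" for d in directories]


def merge_reported(domain_wide, per_directory):
    """Combine hand-collected result sets.

    domain_wide: set of urls reported for the plain site:domain query.
    per_directory: dict mapping each directory query to its set of urls.
    Returns (all urls seen, urls each directory query added beyond domain_wide).
    """
    extra = {q: urls - domain_wide for q, urls in per_directory.items()}
    combined = set(domain_wide).union(*per_directory.values())
    return combined, extra


if __name__ == "__main__":
    known = [
        "http://example.com/directory-a/page1.html",
        "http://example.com/directory-b/sub/page2.html",
        "http://example.com/about.html",
    ]
    for query in site_queries("example.com", first_level_directories(known)):
        print(query)
```

The `extra` mapping is the per-directory equivalent of the "additional 160 urls" figure above: whatever the narrower query reported that the domain-wide query did not.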
Then I click to see the last page, but the last page is page 55:
Results 541 - 547 of about 65,000 from mydomain.com
What happened? Why can't I even see page 99? I also don't see any omitted results link.
This has been the case for a long time.
I thought it might have been, but I so rarely work on a site that's under 1,000 urls that I couldn't be sure. I figured if I wasn't sure, it was worth a thread.
Do you think this query shows the directory's pages that are in the main index? site:example.com/directory1/*
When I was doing this particular study, the /* hack was giving me the exact same results as the regular search. So I went to AOL to grab the main index urls (and that is tedious!)
All this stuff relates to the false statements Google employees made about supplemental handling a few months ago. "Now you see 'em, now you don't" is the way it works, but even finding them via that site search doesn't mean they can be found for queries.
It's hard to come up with postable examples, but here's one that works currently:
[google.com...]
No www, eh?
All this stuff relates to the false statements Google employees made about supplemental handling a few months ago.
You know, I've read those statements many times - and I suspect now that they are technically true, but misleading. I do think that Google changed something so that now the Supplemental Index is available and "searched on" for every query rather than just for the obscure queries, as was previously the case. That much, I think, is literally true.
And yet, you can search on unique phrases from within the content of a Supplemental URL and see that Google does not return ANY results at all. The Supplemental URL is not returned -- nothing is returned. Now why not, when the unique phrase is sitting right there, observable in the cached page?
My theory is that there's a shortfall in how those Supplemental URLs are tagged and stored for search retrieval in the first place. When the collected data is sharded, and those shards are tagged and stored for retrieval, a Supplemental URL gets an incomplete tagging compared to a URL in the regular index. That's how I see it right now.
So it may be literally true that every search now hits the tags for the Supplemental Index. But since there are fewer tags created in the first place -- call it an "incomplete indexing" if you will, just as was always the case -- a lot of the Supplemental content is still not really, truly accessible via search.
So did Google achieve a technical improvement? Yes, and I'm sure it took a major bit of programming to make that happen. Does this mean a Supplemental URL is now on a totally equal footing with a URL in the main index? No.
And no, the statements were not technically true, unless you presume they were NOT referring to Google search, which is a silly presumption.
They may now look at the pages and choose not to display them, but "look at and decide not to" was not the emphasis. The emphasis was on pages appearing in the results.
What Google has done is make supplementals appear for fewer results than before.
At least before if your page was the only one that said something, it would be returned. Now, they will often not return a page under any circumstances.
Put simply, they have applied greater technology to display fewer results.
And put more simply, a page being supplemental now is much, much worse off than previously.
What I have been seeing a lot of is where a site:domain.com keyword search returns a few results that are NOT from domain.com, though the problem seems to be going away again as of recent days.
they all have unique, useful content.
How about PageRank, link juice (even from your own site), unique title elements and unique meta descriptions?
Google is really shuffling a lot of good content into supplemental in recent times, and it's up to us to send strong signals for the content we hope will rank.