Forum Moderators: Robert Charlton & goodroi
I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 but my traffic is still pretty constant. I normally get around 4,500 - 5,000 to that site per day and today I've already got 4,000. So, either Google doesn't account for even a small percentage of my traffic (which I doubt) or the way Google stores information about my site has changed, i.e. the 20,300 pages are still there, but Google will only tell me about 509 of them. As far as I can tell, I think the other pages have been supplemented.
That resonated with something that I was talking about with the crawl/index team. internetheaven, was that post about the site in your profile, or a different site? Your post aligns exactly with one thing I've seen in a couple ways. It would align even more if you were talking about a different site than the one in your profile. :) If you were talking about a different site, would you mind sending the site name to bostonpubcon2006 [at] gmail.com with the subject line of "crawlpages" and the name of your site, plus the handle "internetheaven"? I'd like to check the theory.
Just to give folks an update, we've been going through the feedback and noticed one thing. We've been refreshing some (but not all) of the supplemental results. One part of the supplemental indexing system didn't return any results for [site:domain.com] (that is, a site: search with no additional terms). So that would match with fewer results being reported for site: queries but traffic not changing much. The pages are available for queries matching the supplemental results, but just adding a term or stopword to site: wouldn't automatically access those supplemental results.
I'm checking with the crawl/index folks if this might factor into what people are seeing, and I should hear back later today or tomorrow. In the meantime, interested folks might want to check if their search traffic has gone up/down by a major amount, and see if there are fewer/more supplemental results for a site: search for their domain. Since folks outside Google couldn't force the supplemental results to return site: results, it needed a crawl/index person to notice that fact based on the feedback that we've gotten.
Anyone that wants to send more info along those lines to bostonpubcon2006 [at] gmail.com with the subject line "crawlpages" is welcome to. So you might send something like "I originally wrote about domain.com. I looked at my logs and haven't seen a major decrease in traffic; my traffic is about the same. I used to have about X% supplemental results, and now I hardly see any supplemental results with a site:domain.com query."
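For anyone who wants to check their logs before writing in, here is a rough Python sketch of that comparison. It assumes the server writes the Apache/Nginx "combined" log format, and the function names (google_referrals_per_day, percent_change) are made up for illustration; it is one possible way to count Google referrals per day, not an official method.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format. Adjust the pattern otherwise.
LOG_PATTERN = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\] '  # timestamp; only the day is kept
    r'"[^"]*" \d{3} \S+ '                       # request line, status, bytes sent
    r'"(?P<referrer>[^"]*)"'                    # referrer field
)

# Referrer counts as Google if its host starts with google. or www.google.
GOOGLE_REFERRER = re.compile(r'https?://(www\.)?google\.')


def google_referrals_per_day(lines):
    """Count hits per day whose referrer is a Google host."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and GOOGLE_REFERRER.match(match.group('referrer')):
            counts[match.group('day')] += 1
    return counts


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0
```

Feeding it the lines of an access log (for example, open("access.log") as the `lines` argument, where the file name is a placeholder) gives a per-day count to compare before and after the drop. percent_change can then be used on both the site: page counts and the daily referral counts, to see whether the two moved together.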
I've still got someone reading the bostonpubcon email alias, and I've worked with the Sitemaps team to exclude that as a factor. The crawl/index folks are reading portions of the feedback too; if there's more that I notice, I'll stop by to let you know.
[edited by: Brett_Tabke at 8:07 pm (utc) on May 8, 2006]
Two or three times a day, I go to Yahoo, MSN, and Ask.com so I can see my previously highest earning, best Google referral URL in the top 10. Because it isn't even in Google anymore since the robots.txt "issue" of April 11-12.
But maybe that's not relevant.
PR isn't important with regard to the issue; there's no difference in the problem between our PR7 and PR6 sites
I completely disagree. I believe PR is absolutely central to the missing pages issue. You have no idea what PR your sites currently have. You may see PR7 for one and PR6 for the other, but in reality the actual PR for these sites might now be completely different. You have no way of knowing.
It's worth bearing in mind a simple fact (if anything can truly qualify as a "fact" in regards to Google):
Missing/Buggy Backlinks => Lower PR => Shallower Indexing => Loads of Missing Pages
- Authority systems seem to be irrelevant. We have scrapers by the tens of thousands crawling our sites weekly and spamming our content out, and now these PR 1 or 0 sites with little more than 1 IBL are crushing our 2 sites in the SERPs for almost all positions and our company name. This despite having PR7, thousands of IBLs, clean SEO, and no changes to the properties in years.
LOL, anyone remember Google's mantra, "organizing the world's information" (yeah, right)? A couple of years later, check out what they did to the web. The web today is the largest pile of MFA junk the world has ever seen. Even Google themselves are choking on this crap.
"Please use our site maps....please put nofollow tags...please send re-inclusion requests...help us out a little here, we are choking on our own spam. Our algo is so severely busted even our best technicians can no longer make any sense of it". That's what I (and I'm sure many other veterans) can clearly see.
The wheels are coming off Google search boys and girls, time to move on, and don't forget to let your clients know about it.
Start actively promoting other search engines for your own good and better future.
[edited by: walkman at 4:34 pm (utc) on May 10, 2006]
These statements make me think that G has ratcheted up its spam definitions and a good percentage of the newly excluded pages may be ones that are now getting labelled in the index as spam when previously they were considered clean.
But I can show you some sites in top ten positions that have duplicate content on different domains and work with doorways, redirects and keyword stuffing. Those sites have not been affected by this "filter"; instead they gained from the white hat "good" pages falling out of the index.
Yesterday I found a site with 12,300 pages in the index. ALL of them were filled with keywords and a JavaScript redirect.
How does that fit into your idea of Google having ratcheted up its spam definitions?
Maybe they turned the switch into the wrong direction?
or they did a "not xnor" and can't find it anymore.
There are too many errors that do not fit into a spam theory: old pages in the index, no new pages in the index, too many good pages were hit, bad pages obviously not...
Pages drop so analytics can run so they can better track adwords and site's trends? That = more $$ for Google.
Anyone else get invite codes today?