Forum Moderators: Robert Charlton & goodroi
Seems to me that Matt's recent message confirms my theory. We're either all a bunch of moaning idiots with low quality sites with a few inappropriate, spammy links scattered here and there...or...
The more I think about it the more convinced I am that the missing pages problem is being caused by a Backlink/PR issue (see Msg #15).
Tying together all of the evidence from my own experience, and that of others gleaned from the forums, erroneous or out-of-date backlinks would explain all of the missing pages. The erroneous, or simply out-of-date, backlink information (which we cannot see) leads to insufficient PR (which we cannot see) and hence deep pages are not indexed.
We all know that a "link:www.mysite.com" does not show you the complete picture. But, since Big Daddy, it now shows just a tiny proportion of backlinks. Way fewer than it used to show before Big Daddy. Why? Because either the backlink index hasn't been updated (and now dates back to mid 2005), or else because it has been updated, but the update process is buggy. Only a small handful of Google employees know which of these two possibilities is the case.
We know that the missing pages problem cannot be due to any kind of duplicate content filter, as some people are suggesting. If this were the case, then affected sites would see a proportion of their pages disappear. Some would lose 10%, some would lose 40%, and some would lose 95%. But that's not what we see. We see sites losing the vast majority of their pages or else losing no pages at all. The reason affected sites lose such high percentages of their pages is the hierarchical nature of a site. The number of pages increases with depth, and the artificially low PRs (based on inaccurate and/or out-of-date backlink data) prevent the deeper content from being indexed.
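To put some rough numbers on the depth argument, here is a toy sketch in Python. It is NOT Google's algorithm. Everything in it is an assumption made up for illustration: the site is a perfect tree, each page splits a damped share of its score equally among its children, and a page is indexed only if its score clears a cut-off. The names `indexed_fraction`, `home_score`, `branching`, `damping` and `threshold` are all hypothetical.

```python
def indexed_fraction(home_score, branching=5, max_depth=4,
                     damping=0.85, threshold=1.0):
    """Fraction of a tree-shaped site whose pages clear the cut-off.

    Toy model only (assumed, not Google's method): a page at depth k
    gets home_score * (damping / branching) ** k, and there are
    branching ** k pages at that depth.
    """
    total_pages = 0
    indexed_pages = 0
    for depth in range(max_depth + 1):
        pages_at_depth = branching ** depth
        score = home_score * (damping / branching) ** depth
        total_pages += pages_at_depth
        if score >= threshold:
            indexed_pages += pages_at_depth
    return indexed_pages / total_pages


if __name__ == "__main__":
    # Compare a healthy home page score with slightly and badly understated ones.
    for home_score in (1200, 1000, 100):
        share = indexed_fraction(home_score)
        print(f"home score {home_score}: {share:.1%} of pages indexed")
```

Under these made-up numbers, trimming the home page score from 1200 to 1000 knocks out the whole deepest level, which is about 80% of the site, because that level holds most of the pages. That is the all-or-nothing pattern described above, though a toy model like this proves nothing about what Google actually does.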
The fact that Big Daddy was kick-started from an index dating back to the middle of last year not only explains why the backlink data might be stale, it also explains why ancient pages keep popping up on various data centres.
As further evidence: try a "link:www.mysite.com" and compare it to a search for "www.mysite.com". In my case, the "link:" search shows just 6 results, only one of which is external to my site. The one external backlink probably pre-dates when Big Daddy's index was seeded. The "www.mysite.com" search, on the other hand, finds hundreds of results representing hundreds of internal and external backlinks. Why aren't these showing up in the "link:" search? Is it because "link:" searches are well known for not showing you the complete picture? Or, has that well-known fact simply been obscuring the true cause of all of the problems? Namely, that the backlinks are simply missing from Google's backlink index.
[edited by: tedster at 8:25 pm (utc) on May 17, 2006]
Reseller Wrote:
I guess Matt is mostly saying that an affiliate site should have "value added" content, and not just duplicate the content of affiliate program vendors or be a collection of affiliate links. I.e., he is still talking about "Thin Affiliates".
Another Answer I got:
"Don't overreact. The affiliate links Matt used as examples were in the footer and looked like run of site links. He made a point of using terms like "not related." I would think that if your affiliate links are well incorporated and not just scattershot or run of site you *should* be okay."
Based on what you are debating now, was I really overreacting with my post yesterday?
Would it be true to assume that most of you fall within a category described by Matt and you no longer believe google is broken?
Which category is that?
That mom and pop sites don't have enough links?
Or that big sites have too many links?
Or sites that link to other relevant sites are spamming?
Or sites that link to non-relevant sites are spamming?
What a load. If you read MC's answers, EVERY site should be penalized (and it seems that's what's happening. Mom and Pop sites and "authority" sites all the same).
Gimme a break.
They broke the darn thing 3 years ago and are still using electrical tape and ambiguous answers to keep it working.
I don't believe Google is broken, far from it. I believe that this is a carefully orchestrated procedure designed to do away with webmaster interference in their serps.
All the Best
Col :-)
I don't believe Google is broken, far from it. I believe that this is a carefully orchestrated procedure designed to do away with webmaster interference in their serps.
As soon as I see the sites that spend thousands of dollars on links from "RadiantRadioBroadcasting" s lose their rankings, I'll believe this.
Until then....naw.. they ain't got a clue
When simple pages like our sitemap don't show, when poor results in popular categories take precedence, when good backlinks get dropped en masse, when good sites go down the gurgler, when pages are selectively removed for no good reason, and all the rest I read, we know this brave step by Google is not producing the intended outcome in quality.
And that means poorer results.
There's too much disruption and collateral damage to say that this transition is "stable".
But there is one important thing - Matt and his team are seeking to open up a lot more about what makes things tick at the 'plex. However, my concern is in the interpretation of those notes, which makes management of the task so difficult - hence these discussions.
I gotta say it - big isn't always beautiful. Organisations get a lot less flexible in their mid term and maturity phases.
....and this is a fast moving world, and if search isn't working properly, the market could disperse amongst SEs as users' choice will widen in time, due to the narrowing of produced results.
1) Exchanging links with other sites = bad signal
2) Linking to other related sites = bad signal
3) Linking to your own sites = bad signal
4) Linking to affiliate programs = bad signal
5) Getting too many incoming links (aka being popular) = bad signal
6) Having your content scraped by other sites = bad signal
7) Using templates = bad signal
8) Not using site maps = bad signal
9) Using site maps = supplemental hell = bad signal = slow death
10) Using adwords = ignore all points above = GOOD $ignal = Google’s own targets met.
Did I miss something? Feel free to add your own points.
Boy this search engine has turned into a sour joke. A bully with a capital B holding 75% of the web for ransom. Their new motto: “Google are the only ones allowed to prosper over the web. Penalize and choke the hell off the rest, unless they pay up via Adwords of course. They are no longer spammers if they do, they are advertiser$.”
Things you can do to help tip the scales in your own favour:
1) Link to other search engines at the very top of every page you have online
2) Let your clients, friends & family know about the new Google. Recommend OTHER search engines.
3) Warn your clients, friends & family about the SPYBAR.
4) Look into alternative advertising programs. STOP supporting this monster.
5) Look into alternative contextual networks and implement ASAP (ditch Adsense – you’ll only sleep better at nights).
6) Did I mention STOP supporting this bully? You are shooting yourself in the foot if you do.
I have a 4 year old 300+ page mom & pop e-commerce site that until three days ago did well for us with Google search. About two years ago I decided to go with the philosophy that if I wanted to do well in the rankings, it may help to do a bit of extra work on my webpages. We hand-produce our product, which takes time, but I chose to spend as much time designing each webpage as we do designing each product. I would hand type each page, with extra info regarding the history, scientific info, etc. on the product.
Up until 3 days ago each page was immediately picked up and did well in Google. Then, overnight, 1/2 of my website went supplemental or was de-indexed. I think at present, I have only about 50 pages left. I am assuming from Matt Cutts' blog that my 'problem' is not enough 'quality' inbound links. No spam, no cloaking - not even any Javascript, not one reciprocal link on the whole site.
Well Matt, being a small mom and pop e-commerce site, which of my 1,000 to 2,000 competitors should I get a 'quality' link from?
I do not think Google owes anyone a living, nor do I expect to ever be on page one of any Google search either, but to de-index a site that is just 'too small to worry about' seems heavy-handed.
Then they got too big. Reviews started taking a long time to happen. So what did they do? Started charging a ton of cash just for the CHANCE to be reviewed. So less quality sites who couldn't plop down the $600 or so (I forgot the number) stayed away from Yahoo, and slowly their database became less and less relevant with search results. You get a lot of the same listings if you search for certain terms frequently, and the ones that were added weren't really all that good.
In the background, google is forming this algorithm that tries to give you the most relevant search results possible, and constantly updating it. They have this catchy name, and crystal clear home page. It was a no nonsense site, just search and get results, exactly what people wanted. Not only that, results were constantly updated! I mean it was possible to search for stuff that happened just a day or two ago and get results. The results were becoming so good, even Yahoo used them as their 'search partner' (if I remember correctly).
So Google started exploding because of what I mentioned above, and it seemed to only get better. It took me a long time, but even I stopped using Yahoo a few years back because the results from G were so much better.
With all that said, is Google slowly reaching the Yahoo point? Getting just too big for itself to handle? This recent bigdaddy thing is a complete joke, and unless they fix it soon it has the potential to put a huge hurt on their site, especially if out of the blue another SE comes out with better results (MSN possibly?). I don't even need to repeat why it's a big joke, it's been summed up quite nicely through this thread.
But to give a good example, I have a review site where I review products. I put a lot of work into my reviews, it's not just a one paragraph summary. I have 21 reviews right now, and 11,790 words (561 per review), so there is a lot of work and detail that goes into it. When I do a search for a specific product that I was once listed for, but now removed, 6 out of the top 10 listings have absolutely NOTHING to do with the product. They're those spam pages that lead you to another page which you have to search on. That is absolute GARBAGE in my mind. How can a quality site like mine not even be listed, and those 6 junk sites be in the top 10?
With results like that, it won't be long till Google takes the Yahoo path and gets to start looking up at another search engine because they were too slow at correcting clear mistakes with their engine.
Sorry for the rant, I was really bothered before I posted, and now I'm far worse after actually doing a search on that product.
[edited by: vanillaice at 5:23 am (utc) on May 18, 2006]
-- A few days ago I found out that one of my sites, new (8 months old, never seen such a thing like 'sandbox', unique content, only just 3 inbound links that I am aware of, quality links though, 1 from an own well-established site), which had made it to ~25000 pages indexed until last week, dropped down to 710 pages. Yikes! and welcome to your club, thanks.
Today, in the morning, it is back with 11200 pages in the index.
-- Yesterday, the SERP count for a single technical niche keyword I check almost daily dropped from 2.2 million (steady over the past months) down to 1.3 million. Today it is 1.8 million.
Just seen so on 'my' default www.google.com, didn't check different datacenters and can't tell which DCs I saw.
Anyway, there's big fluctuation around, so you may want to re-check your page counts and rankings again today. Hopefully you will find yourself on the up-move, too.
HAND and kind regards,
R.
Regarding is G broke or not? I still think it has some crawling problems, and the serps I've looked at are still showing a lot of dead links & very very spammy pages
the daft thing is the DC that dropped me yesterday was 66.249.93.104 which is now also showing all my rankings back today the same as the current DC 64.233.183.104
why it happened god only knows, but least I'm back now all I need is a complete crawl and hopefully the other terms that have been dropped will return
"why it happened god only knows, but least I'm back now all I need is a complete crawl and hopefully the other terms that have been dropped will return"
You may wish to give what Matt said yesterday a thought!
- "If your site has very few links where you’d be on the fringe of the crawl, then it’s relatively normal that changes in the crawl may change how much of your site we crawl."
- "If no one ever links to your site, that makes Googlebot less likely to crawl your pages."
>very few links
makes you wonder what's his/their definition of few links as I do have some just not a lot as I'm very fussy about who I link to
"my vanished rankings had nothing to do with LP as a friend thats in a similar market with a much higher LP also suffered although one thing I did notice yesterday was the lack of the usual affiliate sites that dominate the top places on the serps, those have also returned - so maybe G just turned the filter knob back it aback - I don't know"
I have noticed Google is giving affiliates some love since yesterday. Thanks GOOG :-)
">very few links
makes you wonder what's his/their definition of few links as I do have some just not a lot as I'm very fussy about who I link to "
Let me put it like this:
One PR9 BL will for sure "encourage" the bots to visit your site :-)