Forum Moderators: Robert Charlton & goodroi
Seems to me that Matt's recent message confirms my theory. We're either all a bunch of moaning idiots with low quality sites with a few inappropriate, spammy links scattered here and there...or...
The more I think about it the more convinced I am that the missing pages problem is being caused by a Backlink/PR issue (see Msg #15).
Tying together all of the evidence from my own experience, and that of others gleaned from the forums, erroneous or out-of-date backlinks would explain all of the missing pages. The erroneous, or simply out-of-date, backlink information (which we cannot see) leads to insufficient PR (which we cannot see) and hence deep pages are not indexed.
We all know that a "link:www.mysite.com" does not show you the complete picture. But, since Big Daddy, it now shows just a tiny proportion of backlinks. Way less than it used to show before Big Daddy. Why? Because either the backlink index hasn't been updated (and now dates back to mid 2005), or else because it has been updated, but the update process is buggy. Only a small handful of Google employees know which of these two possibilities is the case.
We know that the missing pages problem cannot be due to any kind of duplicate content filter, as some people are suggesting. If this were the case, then affected sites would see a proportion of their pages disappear. Some would lose 10%, some would lose 40%, and some would lose 95%. But that's not what we see. We see sites losing the vast majority of their pages or else losing no pages at all. Affected sites lose such high percentages of their pages because of the hierarchical nature of a site. The number of pages increases with depth, and the artificially low PRs (based on inaccurate and/or out-of-date backlink data) prevent the deeper content from being indexed.
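To put rough numbers on the hierarchy argument, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the tree-shaped site, the `branching` factor, the `decay` per level and the `cutoff` are invented, and none of it is how Google actually computes PR or decides what to index. It only shows why a score that shrinks with depth tends to produce "lose almost everything or lose nothing".

```python
def pages_per_level(branching, depth):
    """Pages at each level of a simple tree-shaped site (home page = level 0)."""
    return [branching ** level for level in range(depth + 1)]


def fraction_lost(branching, depth, home_score, decay, cutoff):
    """Fraction of pages whose toy score falls below the indexing cutoff.

    Toy assumption: a page at a given level scores home_score * decay**level,
    and any page scoring below `cutoff` is left out of the index.
    """
    counts = pages_per_level(branching, depth)
    total = sum(counts)
    lost = sum(
        count
        for level, count in enumerate(counts)
        if home_score * decay ** level < cutoff
    )
    return lost / total


if __name__ == "__main__":
    # Same site, three different (hypothetical) home page scores.
    for home_score in (2.0, 1.0, 0.7):
        share = fraction_lost(branching=10, depth=3, home_score=home_score,
                              decay=0.5, cutoff=0.2)
        print(f"home score {home_score}: {share:.0%} of pages lost")
```

With these made-up numbers a home score of 2.0 loses nothing, while a home score of 1.0 loses the whole deepest level, which is 1000 of the 1111 pages (about 90%). Because most pages sit at the bottom of the tree, there is very little middle ground.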
The fact that Big Daddy was kick-started from an index dating back to the middle of last year, not only explains why the backlink data might be stale, but it also explains why ancient pages keep popping up on various data centres.
As further evidence: try a "link:www.mysite.com" and compare it to a search for "www.mysite.com". In my case, the "link:" search shows just 6 results, only one of which is external to my site. The one external backlink probably pre-dates when Big Daddy's index was seeded. The "www.mysite.com" search, on the other hand, finds hundreds of results representing hundreds of internal and external backlinks. Why aren't these showing up in the "link:" search? Is it because "link:" searches are well known for not showing you the complete picture? Or, has that well-known fact simply been obscuring the true cause of all of the problems? Namely, that the backlinks are simply missing from Google's backlink index.
[edited by: tedster at 8:25 pm (utc) on May 17, 2006]
Post Florida we all noted that directories and sites with many outbound links did better than those with none. I added links to authority sites on the same topic as our site on the home page and throughout the site. We have stuck at #1 since. But on some of the Big Daddy DCs we have disappeared.
I wonder if the algo is looking for unusual out link patterns associated with large volume search terms. Perhaps if the terms that your pages used to rank for have enough volume to produce a graph in "Trends" then they are looked at more closely.
If I were trying to improve search quality I would focus on the terms that folks actually search for. No one is going to complain about crap serps for terms that they don't use.
Sid
The theory about crap in-links could, I’m afraid to say, fit our story quite accurately. Being all new, with a great and serious idea though, we were referred to swaps and directories and saw a good climb in indexing and visitors. Quality links were also coming in as people discovered the benefits of our site. Now, starting a few steps back again, we are committed to getting some quality in-links.
If BD is successful in adding relevance to search results, I’m all for that change. Our idea is built on relevance and we will succeed, but still, there has to be some way to get in the system, being new and wonderful. ;)
but what if, let's say, you have a book review site
you build a solid site with the bulk being your own content, then you theme the books into sections, say romantic, sci-fi and so on
then from a well-known book affiliate you use snippets (not the whole review) to preview the books, and you build pages around these, say 10 books per page, with each book linking to the affiliate to buy it, and within these pages would be your own content
would that in your view be just "another" affiliate site?
I think I must be the unluckiest site owner out there. I don't link to non relevant sites. My site isn't part of a link network. I do have links from some well established relevant sites online. I do have some links from sites which are non relevant - but I don't control them so not much I can do there.
Yet I look at online competitors and they still have cloaking pages, JavaScript redirects and are obviously part of link exchange programs as they link from widget sites to hotels etc.
So I sit here with 90% of my site content still not indexed (slight improvement recently). It's one battle to get it ranked high in the index - but should it be so hard to get it included in the first place? I thought Google wanted to index the 'whole' web.
I haven't had a reply from the gmail address and the way things are going - if I did it would probably be caught by my email spam filter and I wouldn't get to read it.
If BD is successful in adding relevance to search results, I’m all for that change. Our idea is built on relevance and we will succeed, but still, there has to be some way to get in the system, being new and wonderful. ;)
According to Google's new way of thinking, you need to have a large number of "natural" inbound links or they won't even index you. Natural links do not include bought links, reciprocal links, links from bad neighbourhoods, or links that they deem to be off-topic for your web site.
I was explaining this to my two year old daughter when she stopped me short and pointed out the fatal flaw:
You need natural links or we won't list you. How do you get natural links? People find your site, like it a lot, and therefore add a link to your site from theirs. How do these people find you in the first place? Well, on a search engine of course...oops!
but my concern is the affiliate side of this, as "if" I'm reading the blog correctly all affiliate sites are going to get hit in some way or another depending on how much content you use from the affiliate - so if you have this book type review site you've had it, as you won't be able to use any content
It was not a brilliant move to put all sites in one basket. I thought Google's engineers would be slyer than that. Instead of going after the real spammers, they wiped out everyone but the big ones.
Besides, I saw a site with 2400 backlinks, all from one forum. That forum is run by the site owners themselves. They are #1 for major keywords; that's the reality.
And let's not forget the other techniques. Keyword stuffing works fine, cloaking works fine, JavaScript redirects work fine. It's time to change our way of thinking if Google does not remain true to itself.
Last but not least, has Google forgotten to think about the niches where it does not pay to participate in AdWords or to register at a price comparison site? There is a big gap in their thinking. If it goes on like this we won't need Google anymore, because the index will contain only a few well-known pages that you can reach without Google.
Just my opinion,
regards
LG
But the daft thing is that the big ones that have dropped have got loads of unique content on their pages and have not simply ripped content from the affiliates they are promoting.
It almost seems as though G have added a penalty for sites that have any form of affiliate code in their outbound links, i.e. affid=? partnerid=? etc.
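For anyone who wants to see how many of their own outbound links carry this kind of code, here is a minimal Python sketch. The parameter names `affid` and `partnerid` come from the post above; real affiliate programs use many other names, and whether Google looks at them at all is pure speculation. The helper name and the example.com URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Names taken from the post above; real affiliate programs vary widely.
AFFILIATE_PARAMS = ("affid", "partnerid")


def affiliate_params_in(url, names=AFFILIATE_PARAMS):
    """Return the query-string keys in `url` that look like affiliate codes."""
    # keep_blank_values=True so that "affid=" with no value is still reported.
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    wanted = {name.lower() for name in names}
    return sorted(key for key in query if key.lower() in wanted)


if __name__ == "__main__":
    links = [
        "http://www.example.com/book?id=7&affid=123",
        "http://www.example.com/book?id=7",
    ]
    for link in links:
        print(link, "->", affiliate_params_in(link))
```

Running it over the links on the pages that dropped, and on the ones that stayed, would at least show whether the pattern holds on your own site.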
Ridiculous... and hypocritical. Why is Google trying so hard to control what webmasters consider valuable? If this is supposed to cut down on spam, why are so many spam sites thriving while small sites get wiped out? So much for letting a new site naturally develop links.
I don't rely on my site to generate income, so I'm not going to bother trying to please Google. Guess I'll stay in supplemental hell forever for my lack of unnatural efforts. Oh well - at least my site is still high ranking on Yahoo and MSN. Perhaps I will put links on my site to them. (GASP! Such irrelevance!)
Google used to be the engine you turned to because it had the most indexed pages and you were bound to find something, even if you had to fish for it. Now it's reversing that and removing indexed pages, and from Matt Cutts's blog, apparently they think this is a terrific idea. Reminds me of the New Coke fiasco in the mid-80s (Coke is the most popular soda - let's change how it tastes!) We all know how that went over.
Perhaps Google believes that no one will notice. This is probably true for generalized terms. But when a user searches for specific content and gets few or no relevant results in Google, they are forced to go elsewhere. It's as simple as that. And that is how the big guys go down - forgetting that no matter how long you've been on top, you have to keep up the quality, or you won't stay there. At least Bill Gates should be happy.
I can only hope this is spin and not really the way they have structured BigDaddy. Surely these high-paid people at least have common sense.
The spam sites that are in my niche haven't suffered at all, while I've had my results cut way down. I've seen the same thing with a couple of friends - they've been impacted even worse than I have.
It could be that people are stealing my content, and I'm the one getting penalized, and I'll have to look into that this weekend, but that doesn't help me out much - I can work with Google and get them delisted, but I doubt Google will give me any kind of special re-index - my sites only handle 50,000 or so people a month - a drop in the bucket.
In addition to my websites, I have a small side service that I use AdWords to advertise - I'll move that AW money over to another advertising service, and I'll switch out my AdSense ads on my website to other services as well. If Google calls or emails like they have in the past when I've made such changes, I'll let them know, but they are separate entities from the Google SE gods.
Google may not notice, but I will feel like I have done something :-)
Simple... are you just a store with books, or are you a store with books, community forums, book blogs, handwritten user reviews of the books, reviews of the best libraries, etc.? If not, you had better win by having the best natural links from the most powerful sites to offset your site's dup content and affiliate links.
You won't be hit by anything, you will just have less chance of ranking unless you get better links (naturally, of course). You don't get penalized, you just get looked at as less important.
because some spammers know how to acquire "high quality" IBLs.
This is very misleading. While the statement is accurate for some of course, the majority of spammers are thriving because they have acquired networks of sites they can manipulate the links from. These are not quality links, they are just not reciprocal. Over the past year Google's ability to spot networks has weakened considerably.
It almost seems as though G have added a penalty for sites that have any form of affiliate code in their outbound links
I hope you're right but I'm not holding my breath
Regarding my book posting
My site has a good mix of my own content mixed in with affiliate information, but as in my book example the content provided by the affiliate still only amounted to 40% of the content. The rest was my own, and these affiliate pages were 3rd & 4th level pages
I've done some further research across a wide selection of keywords and I'm sure this latest twist of the knife is aimed at affiliate sites, as I'm struggling to find any
I run or am associated with numerous sites... some affiliate oriented, some not. What seems to hurt the ones that lost pages the most is the fact that backlinks were hit and therefore deep indexing was stopped or slowed. It is just that most affiliate oriented sites tend to be connected one way or another to bad links.
Big Daddy = Quality Control Check
On the 9th May I did a 301 redirect from page.htm to another of my sites as follows:
RewriteRule ^page-a\.htm$ [domain.com...] [L,R=permanent]
What I overlooked was that over 600 other pages link to page.htm. I now find that many have gone missing from the Google index. I can't be sure when they dropped out, but I will assume it was over the last week. I have now removed the redirect to see if they come back.
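One way to catch this before adding a redirect is to list which of your own pages link to the page you are about to move. The following is only a rough Python sketch using the standard library HTML parser. The page names and HTML snippets are made up, it matches on the file name only, and it says nothing about how Google treats pages that link to a redirected URL.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pages_linking_to(pages, target):
    """Return the names of pages whose HTML links to the file `target`.

    `pages` maps a page name to its HTML source. Fragments and query
    strings are ignored and only the last path segment is compared.
    """
    linking = []
    for name, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            path = href.split("#")[0].split("?")[0]
            if path.rsplit("/", 1)[-1] == target:
                linking.append(name)
                break
    return sorted(linking)


if __name__ == "__main__":
    # Hypothetical pages standing in for files read from disk.
    site = {
        "a.htm": '<a href="page.htm">widgets</a>',
        "b.htm": '<a href="/dir/other.htm">other</a>',
        "c.htm": '<a href="http://www.example.com/page.htm#top">widgets</a>',
    }
    print(pages_linking_to(site, "page.htm"))
```

In practice you would fill `site` by reading your .htm files from disk. If the list comes back with hundreds of pages, it is probably worth updating those links first and only then adding the 301.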