I won't give a specific example, but I can tell you that one site had pages dropped from the index every day for two weeks until it was down to just the front page. The PR is intact (and it is a high PR site).
This is clever by Google, because it makes it much harder for webmasters to complain. Firstly their site *IS* technically still in the index, although only a single page. Secondly, the PageRank is still there, so no more claims of PageRank zero penalty. Thirdly, the pages were removed over time, so no sudden drop offs to look like a hand edit. Google just got sneaky.
Be ready to see more sites disappearing in this way; this is the new Google penalty I think.
And although I agree that they don't need to be sneaky, it isn't so good for them to be seen deleting sites from the index; it is an admission that their algorithms aren't working well enough.
At least for the "offending" sites their front pages are left in and the PR remains intact. That is so much better than being wiped from the index completely.
My site has disappeared from Google results for some of my most popular search phrases over the last couple of months. There are a few phrases which used to display my site at the top, but now one of my main competitors is first and I'm not on any of the results pages.
The reason is probably that I used to use PHP to hide alt text on images from Google. I did this because I thought alt text looked messy in Google results page snippets when images were placed in between text on the page. That was some time ago, but I now realise it's probably best to show Googlebot everything that users see, so I removed the code a while ago.
What is really annoying is that I was not doing it to increase my ranking or anything like that. However, one of my main competitors' sites, which has now taken my position on the popular phrases, has a doorway page which I have been reporting regularly for about a year now. Google has not taken any notice of my spam reports about them at all, and the doorway pages are still on Google.
This really takes the mick. My traffic has fallen a great deal.
Some pages of my site and the front page which is in the directory do show for some of the rare search phrases instead of the popular ones.
I just hope that Google soon realise I have updated the site and that it's absolutely fine and put me back in the results like it used to be :(
[edited by: Richie at 10:19 pm (utc) on Feb. 2, 2004]
And yes, it does imply manual removal, which is why I believe that Google don't want you to see them doing it. It isn't good for their reputation as algorithm masters to be seen doing hand deletions.
I don't have a problem with this at all - at least they have the decency to keep the front page in so that people looking for the site can find it. And if the PR is genuinely there (I don't know yet), then the massive amounts of hard work to get that PR may not be wasted.
>And yes, it does imply manual removal, which is why I believe that Google don't want you to see them doing it.
Sorry, but this is 180 degrees wrong. It implies automatic removal. Your theory implies that every day a googletech goes in and deletes one more page from the site -- when a REAL person would obviously delete the whole domain the first day, and save the next day for nuking other spam domains. No, you'll forget this whole idea if you ever take the tinfoil hat off.
What could REALLY cause this problem, bearing in mind that Google hasn't hired all the keyboard-punchers in India to hand-delete the 8 billion names of WidgetNow.com?
What you might be seeing is the death spiral of a starving pagerank black hole. The google algorithm might automatically pass over the 3 billion listed pages, deleting the least-desirable pages (from some modified pagerank or hilltop or bad-neighborhood point of view). And as each page disappears, the total energy of the black hole shrinks, and it's more likely to be whacked again the next day.
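To make the "shrinking black hole" idea concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions: the site is a small closed graph with no outbound links, scores use the classic published PageRank formula with damping 0.85, one lowest-scoring page is dropped per pass, and the front page is never dropped (matching what was reported at the top of this thread). The names `pagerank`, `remove_page` and `simulate_spiral` are made up for illustration, and nothing here is known to reflect what Google actually runs.

```python
DAMPING = 0.85  # damping factor from the published PageRank formula


def pagerank(links, external_inflow, iterations=100):
    """Unnormalised PageRank: PR(p) = (1 - d) + d * sum(PR(q) / outdegree(q)).

    external_inflow is an assumed fixed bonus from off-site links,
    added to a page's score undamped on every iteration.
    """
    scores = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_scores = {}
        for page in links:
            inbound = sum(
                scores[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_scores[page] = (
                (1 - DAMPING) + DAMPING * inbound + external_inflow.get(page, 0.0)
            )
        scores = new_scores
    return scores


def remove_page(links, page):
    """Return a copy of the site graph without the page or any links to it."""
    return {
        src: [t for t in targets if t != page]
        for src, targets in links.items()
        if src != page
    }


def simulate_spiral(links, external_inflow, keep="home"):
    """Drop the weakest page per pass; record (page count, total score)."""
    history = []
    while True:
        scores = pagerank(links, external_inflow)
        history.append((len(links), round(sum(scores.values()), 3)))
        candidates = [p for p in links if p != keep]
        if not candidates:
            return history
        weakest = min(candidates, key=lambda p: scores[p])
        links = remove_page(links, weakest)


# Hypothetical four-page site where every page links to every other page.
site = {
    "home": ["a", "b", "c"],
    "a": ["home", "b", "c"],
    "b": ["home", "a", "c"],
    "c": ["home", "a", "b"],
}
for page_count, total in simulate_spiral(site, {"home": 3.0}):
    print(page_count, total)
```

In this toy model the site's total score falls with every page removed, which is the "shrinking energy" part of the theory. Whether a smaller total makes the next removal more likely is exactly the part that remains a guess.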
But if you want to believe Sergey Brin spends his lunch hour whacking your website page by page, I can't stop you.
[edited by: Marcia at 4:37 am (utc) on Feb. 3, 2004]
[edit reason] Widgetized to make generic. [/edit]
Further trying to second-guess the Google programmers (as opposed to the manual-page-trimmers in the black helicopters): the task of spotting bad neighborhoods is conceptually simple, but computationally complex. We've seen something that looks like "Hilltop" ROLLING IN over the most common searches. Perhaps the "Neighborhood Watch" has been implemented in a similar fashion -- rather than taking a snapshot of the world looking for slums, it uses a rolling team of inspector-bots (dropping out of black helicopters, if you must) to seek out artificial linking patterns in high-visibility areas (which are, of course, also near the most-common-search phrases).
Again, if this is true, using subdomains (rather than deeplinks) for doorway pages may be risky. Since Google treats subdomains like separate domains, subdomain doorways may look like separate doorway domains, triggering the neighborhood watch on something that could have been just a simple well-linked domain. New (and therefore inbound-link-deprived) legitimate sites that followed the latest SERP-perp fads might be affected.
OK, I'm done. You can have the tinfoil hat back.
<yawn>
What were we discussing?
Hmm. I deleted just three outbound links after Florida, and asked those sites to remove mine as well. I feel more each day that those were good links to lose. Not real bad ones, but not real good ones either.
Plus when I was a caveboy there were lots of quicksand pits around, so it's easy for me to understand this whole "stay away from bad neighborhoods" thing. Never know when you might get sucked down and lost forever.
;-)
I have a website that has been banned from its main popular two-word search, but if you add a third word such as "free", "cheap" or "buy" to those two words, my page is still at the top.
Sorry for my English, I usually speak Catalan.
I manage some web sites, and what I have noticed is that the sites aren't ranking for the words the pages were optimized for. Most sites are PR5, some PR6. The sites are built with the kw in the title, description, H1 and in the text, at maybe 10% density, plus site maps with the kw as the anchor text. All the sites have a lot of content. The sites ranked well until Austin. The keywords that they ranked for are now buried. What is weird is that they rank for other, somewhat related kw combos. If a site has a title of 4 or 5 words, the page will come up for a word 1,2,5 combo or a 2,4,5 combo, but not for the 1,2,3 combo, which is the phrase it is targeting. Would you recommend getting new domains, or trying to deoptimize the existing sites? I did a test with a page that gets crawled by the freshbot, adding more words to the title and breaking up the kw in the H1 and description, but that didn't change anything. It's obvious Google has put a hex on the keywords but not the site. I also noticed that many sites that rank high in Google don't have descriptions, so I was thinking about removing the descriptions and H1s. I need to figure out how to get back those kw rankings.
Kevin
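For anyone who wants to check a figure like the "maybe 10% density" mentioned above on their own pages, here is a rough Python sketch. It assumes one common definition of keyword density (phrase occurrences times the number of words in the phrase, divided by total words); other definitions exist, and nobody outside Google knows whether or how such a number is used. The function name `keyword_density` and the sample text are made up for illustration.

```python
import re


def keyword_density(text, phrase):
    """Share of the words in text that belong to exact occurrences of phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears word for word.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


sample = "Cheap blue widgets and more blue widgets here today ok"
print(f"{keyword_density(sample, 'blue widgets'):.0%}")
```

Running it over the visible text of a page (title, H1 and body) gives a quick sanity check before and after any "deoptimizing".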
And *it is deliberate*. This is not an algorithm byproduct.
Kevin1023, again that sounds like the Florida/Austin effect and is not what we are talking about here. All you are seeing is the normal Google algorithm at work.
WHAT I HAVE SEEN
1) Pre-Austin I was a PR8 site, with 1400 links ....
2) At the start of this - I dropped down to like 500 links. I stayed PR8. However - my search results moved as if they were PR6 or close to it.
3) About a week later - I was showing a PR6.
4) More links removed - now down to 1 link.
5) More search results dropped. As if I am a PR4 site or similar.
6) Sub pages dropping from Google Cache like flies. As if they are a PR2 - not the PR4 it now shows. I am betting my PR will drop to a PR4 on my home page very soon.
7) If you do a search for sites that contains a link to my URL - www.myurl.com - it shows 2900 pages.
8) No spider activity since the start of Austin.
9) Noticing spider activity a week ago - mostly PR4 or PR5 sites.
10) Noticing spider activity Sunday night - mostly PR5 or PR6 sites.
-----
All this said - it seems to me that
THEORIES
1) Google is trying to re-do PageRank - and to do that - some top sites are being brought down to 0, so no PageRank is passed on.
2) That when the next link count / spider / crawl occurs - the old link exchanges with enough other PR applied to them - should show back up again. (A guess on my part)
3) Based on that, your high PR sites will be recalculated to a new PR - probably less than before.
4) Your dropped cache pages will then be re-spidered and indexed.
Leaving you - hopefully - more or less like before.
This is PURE THEORY on my part.
Any thoughts and ideas would be appreciated.
I don't think Google would do something like this deliberately, although it probably would not surprise me!
I just can't see the logic behind wiping out a fair portion of their index; the end result would be a less comprehensive collection of web pages than their competitors have. Seems a bit like censorship to me, and although Google would have the right to do that, it certainly wouldn't help their image. My attitude would be "Hey buddy, you do your job and index the pages, I'll decide what I wanna click on or visit!"
How are you guys determining the pages lost, just by doing a site: search? Does the number of pages returned reflect what *should be* indexed, or does it indicate the total previously indexed minus the ones removed?
Google is only doing exactly what I would do to solve the problem. There are a few issues that they have successfully dealt with:
1. They need a solution to undesirable sites in the index.
2. They don't want to be accused of attacking webmasters.
3. The algorithms probably aren't sufficient to deal with the issues, so they have to make certain decisions through editors.
4. They don't want to be seen to be using manual deletions.
This new penalty solves all the issues in an elegant manner. The offending site is still there, but with just a single page in the index. Webmasters can't really argue with that.