I'm inclined to agree... or at least to say that you have an interesting point. But I don't think they would bother being sneaky about it. They can do what they like and don't need to worry about being sneaky.
Not necessarily sneaky, but of course who knows. Could just be part of the extension of the FL filters via Austin... things are still happening.
This started well before Austin, but I didn't want to post anything until I was certain.
And although I agree that they don't need to be sneaky, it isn't so good for them to be seen deleting sites from the index; it is an admission that their algorithms aren't working well enough.
At least for the "offending" sites their front pages are left in and the PR remains intact. That is so much better than being wiped from the index completely.
I have seen that happening as well with sites using extensive link farming and/or getting all their PR from a link farming network (triangle). Initially the PR remains and pages drop slowly but steadily, and after 5-6 months the PR goes to 0 as well.
I think rather than being sneaky they give ample time to people to mend their ways.
P.S.: Diversified linking that reaches out to thematic content from sparse locations helps in survival, though.
it's a smart idea
I agree, and it isn't the first time that Google would have made it hard to see a penalty in action.
When the 'can have PR but not pass it on' thing started, the high PR page still showed in link: searches if the destination had PR4 or more from elsewhere.
I am not certain Google would engage in this type of cat and mouse game. Regardless of how a penalty is carried out, it will be identified over time.
Unless I misunderstand (which happens from time to time)...there seems to be an assumption in here that keeps me from buying this one: Wouldn't this imply manual penalization?
I doubt that G would do that for something like this...
>Wouldn't this imply manual penalization?
I'd say more natural penalisation over time, unless the algo is shown a good reason to think otherwise.
I agree with this and have come across it.
Are these vanishing pages completely removed from the index? Or can they be found with URI searches, odd combos of on-page kws, etc.?
I tried to post about this a couple of days ago but I don't think it ever appeared.
My site has disappeared from google results for some of my most popular search phrases over the last couple of months. There are a few phrases which used to display my site at the top, but now one of my main competitors is first and I'm not on any of the results pages.
The reason is probably that I used to use PHP to hide Alt Text on images from Google. This is because I thought alt text looked messy in google results page snippets when images were placed in between text on the page. I did this some time ago but now realise it's probably best to display everything to Google Bot as well as users, so I removed the coding a while ago.
What is really annoying is that I was not doing it to increase my ranking or anything like that. However one of my main competitor's sites which has now taken my position on the popular phrases has a Doorway page which I have been reporting regularly for about a year now, and Google has not taken any notice of my spam reports about them at all, and the doorway pages are still on Google.
This really takes the mick. My traffic has fallen a great deal.
Some pages of my site and the front page which is in the directory do show for some of the rare search phrases instead of the popular ones.
I just hope that Google soon realise I have updated the site and that it's absolutely fine and put me back in the results like it used to be :(
[edited by: Richie at 10:19 pm (utc) on Feb. 2, 2004]
The pages are completely removed from the index - just the front page remains.
And yes, it does imply manual removal, which is why I believe that Google don't want you to see them doing it. It isn't good for their reputation as algorithm masters to be seen doing hand deletions.
I don't have a problem with this at all - at least they have the decency to keep the front page in so that people looking for the site can find it. And if the PR is genuinely there (I don't know yet), then the massive amounts of hard work to get that PR may not be wasted.
I don't think that is the same problem at all - yours sounds just like the Florida/Austin effect and not a penalty.
The pages simply aren't in google if you have had the penalty; just the front page. You can check the number of pages by doing a search for 'site:example.com -blahblahblah'.
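For anyone who wants to run that check on several domains, here is a minimal sketch that builds the query string described above. The helper names and the `base` search address are assumptions for illustration; the nonsense word only needs to be something that appears on none of your pages.

```python
from urllib.parse import urlencode


def site_count_query(domain, nonsense="blahblahblah"):
    """Build the 'site:domain -nonsense' query used to list indexed pages."""
    return f"site:{domain} -{nonsense}"


def search_url(query, base="https://www.google.com/search"):
    # `base` is an assumed search address; `q` is the usual query parameter.
    return f"{base}?{urlencode({'q': query})}"


query = site_count_query("example.com")
print(query)
print(search_url(query))
```

Paste the printed query into the search box and note the reported page count; repeating it every few days shows whether the count is shrinking.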
Oh. Thank you Bobby_Davro. I haven't read any of the post about Austin or Florida.. but will now.
I've been reading this with great interest. I noticed a major change in my rankings about a week ago. Thinking it was maybe just a switch to another algo, I have not taken any action yet, but I'm beginning to wonder now... As a newbie to this site, could someone please point me somewhere or explain what the Florida/Austin effect is?
try this one:
read, and read and read.....
>The pages are completely removed from the index - just the front page remains.
>And yes, it does imply manual removal, which is why I belive that Google don't want you to see them doing it.
Sorry, but this is 180 degrees wrong. It implies automatic removal. Your theory implies that every day a googletech goes in and deletes one more page from the site -- when a REAL person would obviously delete the whole domain the first day, and save the next day for nuking other spam domains. No, you'll forget this whole idea if you ever take the tinfoil hat off.
What could REALLY cause this problem, bearing in mind that Google hasn't hired all the keyboard-punchers in India to hand-delete the 8 billion names of WidgetNow.com?
What you might be seeing is the death spiral of a starving pagerank black hole. The google algorithm might automatically pass over the 3 billion listed pages, deleting the least-desirable pages (from some modified pagerank or hilltop or bad-neighborhood point of view). And as each page disappears, the total energy of the black hole shrinks, and it's more likely to be whacked again the next day.
But if you want to believe Sergey Brin spends his lunch hour whacking your website page by page, I can't stop you.
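The "death spiral" idea can be played out on a toy link graph. This is only a sketch under loud assumptions: it uses textbook power-iteration PageRank rather than anything Google is known to run, it looks at one small site instead of the whole index, and it protects a page called `home` simply because the front page is what people in this thread report surviving. The function names and the sample graph are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [outbound pages]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            # Ignore links to pages that have already been removed.
            targets = [t for t in outs if t in links]
            if not targets:
                # Dangling page: spread its rank evenly over all pages.
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:
                for t in targets:
                    new[t] += damping * rank[p] / len(targets)
        rank = new
    return rank


def death_spiral(links, protected="home"):
    """Each 'day', drop the lowest-ranked unprotected page and recompute."""
    links = {p: list(outs) for p, outs in links.items()}
    removed = []
    while len(links) > 1:
        rank = pagerank(links)
        victim = min((p for p in links if p != protected), key=lambda p: rank[p])
        del links[victim]
        removed.append(victim)
    return removed, list(links)


# Hypothetical four-page site.
site = {"home": ["a", "b", "c"], "a": ["home", "b"], "b": ["home"], "c": ["home"]}
removed, left = death_spiral(site)
print("removed in order:", removed)
print("still indexed:", left)
```

Every removal takes away links that were feeding the remaining pages, so the next day's ranks are computed on a smaller graph, which is the shrinking effect the post describes.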
[edited by: Marcia at 4:37 am (utc) on Feb. 3, 2004]
[edit reason] Widgetized to make generic. [/edit]
More speculation about this. We know googlebot is doing more frequent runs on "frequently updated" sites. Suppose that when a page is purged for being deemed "unworthy of continued listing because it's in a slum" -- then all the pages that POINT to it have their "bad neighborhood" value increased, AND ARE MARKED "CHANGED." This would cause the whole black-hole, um, website to be marked "high rate of changes, come back tomorrow". And the next day the city housing inspector comes back, checking the house next door.
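The "housing inspector" speculation above can be sketched as a cascade. Everything here is hypothetical: the score, the `penalty` and `threshold` values, and the function names are invented for illustration, and nothing is known about how (or whether) Google does this.

```python
def purge_and_flag(links, bad_score, purged, penalty=1.0):
    """Remove `purged`, raise the bad-neighbourhood score of every page
    that links to it, and return those pages as 'changed'."""
    changed = set()
    for page, outs in links.items():
        if page != purged and purged in outs:
            bad_score[page] = bad_score.get(page, 0.0) + penalty
            changed.add(page)
    links.pop(purged, None)
    return changed


def inspect(links, bad_score, first_victim, threshold=1.0):
    """Revisit pages marked as changed; purge those at or over the threshold."""
    purged_order = []
    queue = [first_victim]
    while queue:
        victim = queue.pop(0)
        if victim not in links:
            continue  # already purged on an earlier visit
        flagged = purge_and_flag(links, bad_score, victim)
        purged_order.append(victim)
        # Only flagged pages that crossed the threshold get a return visit.
        queue.extend(sorted(p for p in flagged if bad_score[p] >= threshold))
    return purged_order


# Hypothetical chain: a links to b, b links to c, d is unrelated.
links = {"a": ["b"], "b": ["c"], "c": [], "d": []}
print(inspect(links, {}, "c"))
print(links)
```

With a low threshold a single purge walks back up the chain of pages that pointed at it; raising the threshold stops the cascade unless a page links to several purged neighbours.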
Further trying to second-guess the google programmers (as opposed to the manual-page-trimmers in the black helicopters). The task of spotting bad neighborhoods is conceptually simple, but computationally complex. We've seen something that looks like "Hilltop" ROLLING IN in over the most common searches. Perhaps the "Neighborhood Watch" has been implemented in a similar fashion -- rather than taking a snapshot of the world looking for slums, a rolling team of inspector-bots (dropping out of black helicopters, if you must) seeking out artificial linking patterns in high-visibility areas (which are, of course, also near the most-common-search phrases).
Again, if this is true, using subdomains (rather than deeplinks) for doorway pages may be risky. Since Google treats subdomains like separate domains, subdomain doorways may look like separate doorway domains, triggering the neighborhood watch on something that could have been just a simple well-linked domain. New (and therefore inbound-link-deprived) legitimate sites that followed the latest SERP-perp fads might be affected.
OK, I'm done. You can have the tinfoil hat back.
What were we discussing?
Nice hutcheson ... now there's a theory I can buy. Scary though.
Hmm. I deleted just three outbound links after Florida, and asked those sites to remove mine as well. I feel more each day that those were good links to lose. Not real bad ones, but not real good ones either.
Plus when I was a caveboy there were lots of quicksand pits around, so it's easy for me to understand this whole "stay away from bad neighborhoods" thing. Never know when you might get sucked down and lost forever.
Google only uses the new spam-catcher algorithm for the most popular searches, not for every search, and it's done automatically.
I have a site that has been banned from the main popular search, but if you add a third word like "free", "cheap" or "buy" to those 2 words, my page is still at the top.
Sorry for my English, I usually speak Catalan.
I manage some web sites, and what I have noticed is that the sites aren't ranking for the words the pages were optimized for. Most sites are PR5, some PR6. The sites are built with the kw in the title, description, H1 and text, at maybe 10% density, plus site maps with the kw as the anchor text. All the sites have a lot of content. They ranked well until Austin; the keywords they ranked for are now buried.
What is weird is that they rank for other, somewhat related kw combos. If a page has a title of 4 or 5 words, it will come up for the word 1,2,5 combo or the 2,4,5 combo, but not for the 1,2,3 combo, which is the phrase it is targeting.
Would you recommend getting new domains, or trying to deoptimize the existing sites? I did a test on a page that gets crawled by the freshbot, adding more words to the title and breaking up the kw in the H1 and description, but that didn't change anything. It's obvious Google has put a hex on the keywords but not the site. I also noticed that many sites that rank high in Google don't have descriptions, so I was thinking about removing the descriptions and H1s. I need to figure out how to get those kw rankings back.
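To check systematically which word combos from a title still rank, it helps to list every combination first. A minimal sketch follows; the sample title and the function name are made up, and the searches themselves still have to be run by hand.

```python
from itertools import combinations


def title_word_combos(title, size=3):
    """Return every `size`-word combination of the title, in title order."""
    words = title.split()
    return [" ".join(combo) for combo in combinations(words, size)]


# Hypothetical five-word title.
for phrase in title_word_combos("cheap blue widget store online"):
    print(phrase)
```

A five-word title gives ten three-word combos, including the 1,2,3 phrase the page targets and the 1,2,5 and 2,4,5 variants mentioned above.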
I can assure you that nobody was suggesting that Google techs remove pages by hand. What I am suggesting is that they have a script to remove a bunch of pages from a selected site each day over a period of time. This is not a tricky thing to code.
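To show how little code such a script would need, here is a purely hypothetical sketch. Nothing is known about Google's internal tools; the function name, the batch size and the rule of keeping the front page are assumptions taken from what is described in this thread.

```python
def removal_schedule(pages, per_day, keep="/"):
    """Yield one batch of pages to drop per day, never touching `keep`."""
    queue = [p for p in pages if p != keep]
    for start in range(0, len(queue), per_day):
        yield queue[start:start + per_day]


# Hypothetical site with a front page and three inner pages.
site_pages = ["/", "/widgets", "/about", "/links"]
for day, batch in enumerate(removal_schedule(site_pages, per_day=2), start=1):
    print(f"day {day}: drop {batch}")
```

The same gradual pattern would also come out of an automatic process, so the ease of coding it does not by itself settle whether the removals are deliberate.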
And *it is deliberate*. This is not an algorithm byproduct.
Kevin1023, again that sounds like the Florida/Austin effect and is not what we are talking about here. All you are seeing is the normal Google algorithm at work.
The front page is the only one that remains, and even then, for a site named "unethicalwebmaster.com", a search on unethicalwebmaster will not show the site, however unique the phrase might be. Plus, after a time, the sites that have even a one-way link to it, or that exist in the same neighbourhood as this site (in a way isolated from the web, but well connected within its huge network), start getting affected: "google cancer".
an addition to that is ...
1) Google looks for a seed URL or a seed website (not sure which yet) to penalize, from a network that exists in isolation and crosslinks extensively.
2) Keeping the links page changing every few days helps.
3) Immediate neighbour patterns are seen and analyzed to check repetitiveness across the network (this has some role in choosing the seed URL as well).
Random Thoughts :
This has somehow an uncanny resemblance to image processing techniques ....
Take the opening and closing operations as an analogy: shrink and expand the network by one level (remove the highest PR giver, then include it again). If there is no distortion in the incoming and outgoing links, it means the network is heavily SEO'ised and extensively cross-linked. It is not computationally expensive, but it is memory intensive, since a PR has to be kept for each website.
the shrink and expand could also account for why pages fall gradually ...
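One loose reading of the shrink step, as a sketch only: remove the page that hands out the most links (a stand-in for "highest PR giver", since no PR values are available here) and see what share of the remaining pages still have an inbound link from inside the network. The function name, the stand-in and the two sample networks are all assumptions.

```python
def shrink_survival(links):
    """Remove the page with the most outbound links, then return the share
    of remaining pages that still have an inbound link from the network."""
    giver = max(links, key=lambda p: len(links[p]))
    remaining = {
        page: [t for t in outs if t != giver and t in links]
        for page, outs in links.items()
        if page != giver
    }
    if not remaining:
        return 0.0
    linked_to = {t for outs in remaining.values() for t in outs}
    return len(linked_to & set(remaining)) / len(remaining)


# Hypothetical networks: a fully cross-linked clique and a simple hub.
clique = {p: [q for q in "abcd" if q != p] for p in "abcd"}
hub = {"hub": ["a", "b", "c"], "a": [], "b": [], "c": []}
print(shrink_survival(clique))
print(shrink_survival(hub))
```

A score near 1 means the network barely notices losing its biggest linker, which is the "no distortion" signature of heavy cross-linking; a score near 0 means the pages depended on that one source.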
Let me tell you what I have seen - let me know your thoughts and ideas.
WHAT I HAVE SEEN
1) Pre-Austin I was a PR8 site, with 1400 links ....
2) At the start of this - I dropped down to like 500 links. I stayed PR8. However - my search results moved as if they were PR6 or close to it.
3) About a week later - I was showing a PR6.
4) More links removed - now down to 1 link.
5) More search results dropped. As if I am a PR4 site or similar.
6) Sub pages dropping from Google Cache like flies. As if they are a PR2 - not the PR4 it now shows. I am betting my PR will drop to a PR4 on my home page
7) If you do a search for sites that contains a link to my URL - www.myurl.com - it shows 2900 pages.
8) No spider activity since the start of Austin.
9) Noticing spider activity a week ago - mostly PR4 or PR5 sites.
10) Noticing spider activity Sunday night - mostly PR5 or PR6 sites.
All this said - it seems to me that
1) Google is trying to re-do Pagerank - and to do that - some top sites are being brought down to 0, so no pagerank is passed on.
2) That when the next Link count / spider / crawl occurs - the old link exchanges with enough other PR applied to them - should show back up again. (A guess on my part)
3) Based on that your high PR sites will be recalculated to a new PR - probably less than before.
4) Your dropped cache pages will then be re-spidered
Leaving you - hopefully - more or less like before.
This is PURE THEORY on my part.
Any thoughts and ideas would be appreciated.
Perhaps GoogleGuy could confirm this, or at least deny it if this is not the truth.
I don't think Google would do something like this deliberately, although it probably would not surprise me!
I just can't see the logic behind wiping out a fair portion of their index; the end result would be a less concise collection of web pages than their competitors'. Seems a bit like censorship to me... and although Google would have the right to do that, it certainly wouldn't help their image. My attitude would be "Hey buddy, you do your job and index the pages, I'll decide what I wanna click on or visit!"
How are you guys determining the pages lost? Just by doing a site: search? Does the number of pages returned reflect what *should be* indexed, or does it indicate the total previously indexed minus the ones removed?
In the Google search box, type in
It will show you all of your pages.
If you see this happening - some will have *no* cache information.
Some will show *supplemental* cache information. (Not sure what this is)
Some will show regular cache info.
Hope this helps.
I doubt very much that Googleguy will come anywhere near this thread; it isn't in Google's interest to get involved in discussions about this less pleasant side of running an engine.
Google is only doing exactly what I would do to solve the problem. There are a few issues that they have successfully dealt with:
1. They need a solution to undesirable sites in the index.
2. They don't want to be accused of attacking webmasters.
3. The algorithms probably aren't sufficient to deal with the issues, so they have to make certain decisions through editors.
4. They don't want to be seen to be using manual deletions.
This new penalty solves all the issues in an elegant manner. The offending site is still there, but with just a single page in the index. Webmasters can't really argue with that.