|is there enough richness of vocabulary on the page or site to substantiate that the page (or site) is relevant for the phrase? |
So are we back to taking the top 1000 results and then reparsing them using each other as what's now deemed "relevant" for a particular keyword combo and then displaying them in that new order?
Not at all, I wouldn't dream of it.
Looks like we're going backwards again...
It's not about overuse of a single phrase.
Having that single phrase appear in text, title, or anchor once, along with several related phrases that Google thinks are spammy, and not supported enough with trusted inbounds, is the problem.
I'm not sure where I read this or who posted this but if it's true, it is interesting:
"Matt Cutts on his blog looked at a couple of sites where people complained about dropping (thinking it was related to the Googlebombing fix) and he found common problem with all of them. Pages that were incomplete (not having promised information indicated by anchor text in link to page), large amounts of copied text found on lots of other sites, etc. In general, common issues of quality."
FWIW our keyword density is low (3-7% depending on the page) and our content is written by us and unique to our site... minus the occasional scraper.
We aren't going backwards any more than we are shooting in the dark. We have a lot better idea of what is going on than when this topic started 9 threads ago.
Read the patent and the earlier threads. It's a lot to read through but if you do you will find some information that while not solving this will give you an idea of what you might try.
The phrase based patents that are likely affecting this are linked at
Thanks Annej. I probably wasn't clear. I didn't mean that progress hasn't been made. I've benefited greatly from everyone's posts here and have a fair idea of what's going on.
I meant that some keep tossing out theories that have been hashed over already. The massive length of the threads means people can't read through them all, so they keep coming up with the same ideas.
For a long time I did not believe in this idea. And then I got hit and I think that the reason why I was down-graded is worth being shared.
First of all the question is posed (or I asked myself the question "IF WHAT I HAD PLACED ON-LINE WAS SPAM") and the answer from MSN, and local engines and Yahoo is no. I am number 1 on MSN, and number 5 on Yahoo for a MAJOR city map search.
Hence why did I vanish from Google?
I went on to think about the URL itself. What had there been on this site BEFORE the page went up? Were there pages removed? Even robots.txt pages removed? No. So then I realized why: I HAD CHANGED THE TITLE OF THE SITE 20 TIMES IN 5 DAYS.
Now I have to sit back and wait 8 months to be re-listed.
Are you sure it's the -950 penalty? What you did seems like a way to get a much different type of penalty.
I wouldn't trust Google technology a bit. It just started to crawl its own internal search results on our site like mad.
I have entered it now in robots.txt? :\ Maybe I shouldn't content rich varying ... with on theme external links ;)
|I went on to think about the URL itself. What had there been on this site BEFORE the page went up? Were there pages removed? Even robots.txt pages removed? No. So then I realized why: I HAD CHANGED THE TITLE OF THE SITE 20 TIMES IN 5 DAYS. |
Now, that is what I'd call over-optimization taken to the point of hilarity.
|Now, that is what I'd call over-optimization taken to the point of hilarity. |
But given the robot frequency and the data analysis frequency maybe Google got just 5 changes, which is probably already enough for them.
My site is back in its original position, yay! For a while I was resigned to the idea that whatever keyword filter change Google had made was permanent and that the site might not come back from cyberia. The traffic spike is nice to see once again.
A few authority sites that were 950+ are in for a few keywords, I notice. This has to be a good thing, but there are still a lot of sites held back.
I don't know if any of you guys have ever seen a live music company recording session, but in the studio there's a "console" with many rows and columns of buttons, or knobs, that the sound engineer can adjust up or down for different elements of the sound going into the recordings.
I've always thought of search engine algos that way, and there's a video out there of a song by Eddie Murphy called "Party all the Time" where the record producer (the guy with the "hair") reaches over and turns a knob, which is a perfect illustration of how I picture a search engine adjusting elements in its algorithms - which is why I recall the video so vividly.
There have been many Google updates where things seemed to back off afterwards - just like the knobs were turned up or down a bit.
To make massive changes is, IMHO, silly because algos are constantly being tweaked and changed - just like the knobs being moved up or down.
Omigosh, WOOOOHOOO! Does Google rock or what? You can see it for yourself right here at Google search! [google.com]
It's only a few seconds, watch for the guy turning the knob. That's why we can't turn ballistic and start making massive changes. What if they turn the knob back?
[edited by: Marcia at 2:15 pm (utc) on May 17, 2007]
|some keep tossing out theories that have been hashed over already |
I'll admit, I've been tossing out things from earlier in the 950 threads because new people have been coming into this thread.
Marcia, you described it exactly. Turning knobs is just what I've been imagining.
Time for me to take a break from this all unless something really new comes along.
Marcia's control console concept (hereinafter, I suggest, referred to as the MCCC) illustrates what a whacky business we have chosen to pursue.
We're trying to figure out how the music will sound, or why it sounds like it does.
But we don't know how many dials there are.
Or what each dial, or any dial, really does.
Or what settings the dials are at today; and where they might be tomorrow.
Sorry for the digression. The MCCC just got to me.
Well I guess there is a seasonal dial, one for getting rid of the adwords ads and so on. Some might be automatic. If the bounce rate [and whatever they use to determine user satisfaction] on page 1 and 2 rises, reshuffle.
Every update has to at least match the income of the previous setting or improve on it. So maybe they go for a medium bounce rate so that people click on the ads. Given the manpower and the finances, you can, I guess, build a nice system that goes for the maximum and checks whether you are stuck at a local maximum (a maximum can be local, so occasionally you want to try totally new values to check). Link that to news and adapt accordingly which data set and algo variation will be used.
|Marcia's control console concept (hereinafter, I suggest, referred to as the MCCC) illustrates what a whacky business we have chosen to pursue. |
The "dials" analogy is like the account in Genesis of the world being created in seven days. It's a simple story that's easy to understand, but that doesn't mean it should be taken literally.
The math for some of the "dials or knobs" is visible in some Google patents. For example, from the recent spam detection patent [webmasterworld.com]:
|This is in the section about "detecting good phrases" |
 In one embodiment, the information gain threshold is 1.5, but is preferably between 1.1 and 1.7. Raising the threshold over 1.0 serves to reduce the possibility that two otherwise unrelated phrases co-occur more than randomly predicted.
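For anyone who wants to see what that threshold does in practice, here is a rough sketch. It assumes information gain is the ratio of the actual co-occurrence rate of two phrases to the rate you would expect if they appeared independently, which is how I read the "more than randomly predicted" wording. The function names and the document counts are made up for illustration and are not taken from the patent.

```python
def information_gain(count_j, count_k, count_both, total_docs):
    """Ratio of actual to expected co-occurrence of two phrases.

    Assumption: expected co-occurrence is what independence would predict,
    i.e. (count_j / total_docs) * (count_k / total_docs) of all documents.
    """
    if count_j == 0 or count_k == 0:
        return 0.0
    # actual rate / expected rate, simplified to avoid rounding noise
    return (count_both * total_docs) / (count_j * count_k)


def is_related_phrase(count_j, count_k, count_both, total_docs, threshold=1.5):
    """True when the phrases co-occur more often than the threshold allows."""
    return information_gain(count_j, count_k, count_both, total_docs) > threshold


# Hypothetical corpus of 1000 documents, each phrase appearing in 100 of them.
# Independence predicts about 10 documents containing both phrases.
print(information_gain(100, 100, 20, 1000))                  # 2.0
print(is_related_phrase(100, 100, 20, 1000))                 # True
print(is_related_phrase(100, 100, 12, 1000))                 # False at 1.5
print(is_related_phrase(100, 100, 12, 1000, threshold=1.1))  # True at 1.1
```

The last two lines show why the 1.1 to 1.7 range matters: the same pair of phrases counts as related or unrelated depending on where that one setting sits.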
This is in the section about how spam documents get penalized
...The search system 120 retrieves some set of results, say a 1000 documents, each of which is identified by its document ID, and has an associated relevance score. For each document in the search result set, the search system 120 looks up the document ID in the SPAM_TABLE (however constructed), to determine if the document is included therein.
 If the document is included in the SPAM_TABLE, then the document's relevance score is down weighted by predetermined factor. For example, the relevance score can be divided by factor (e.g., 5).
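To make the down weighting concrete, here is a minimal sketch of the step as the quoted passage describes it: look each result up in a spam table and divide its relevance score by a fixed factor. The data structures, the function name and the sample scores are my own assumptions; only the lookup and the "divide by 5" example come from the patent text.

```python
def apply_spam_penalty(results, spam_table, factor=5.0):
    """Down weight flagged documents and re-sort by relevance score.

    results    -- list of (doc_id, relevance_score) pairs (hypothetical format)
    spam_table -- set of doc_ids flagged as spam, however that set was built
    factor     -- the predetermined divisor; 5 is the patent's own example
    """
    rescored = [
        (doc_id, score / factor if doc_id in spam_table else score)
        for doc_id, score in results
    ]
    # Highest score first, so penalized documents sink toward the end
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up result set: "a" ranks first until the penalty is applied
results = [("a", 10.0), ("b", 8.0), ("c", 3.0)]
print(apply_spam_penalty(results, spam_table={"a"}))
# [('b', 8.0), ('c', 3.0), ('a', 2.0)]
```

With a retrieved set of 1000 documents, a page whose score is cut to a fifth can easily land at the very end of the list, which would look a lot like what people in this thread are reporting.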
Ted, does that patent include a weighting factor to help the search engineers meet their AdWords revenue targets? :-)
>>Ted, does that patent include a weighting factor to help the search engineers meet their AdWords revenue targets? :-)
EFV, maybe there's a weighting factor that's tied into their payroll program, in which the engineer who thinks up the best way to algorithmically divert search traffic to Adwords instead of organic results by devaluing more relevant, valuable, organically ranking sites, gets extra bonuses or perks. Maybe something like Lakers season tickets for monthly winners.
I can NOT believe that there will be, or is, a filter that filters a site out of the rankings because you use other words for what is on your site. That's what we learned in school and what we see in books: authors use different words with the same meaning so it's not repeated all the time.
Google is just messing with you all. I have lots of sites that use the same tactics but do not get banned. One has a filter/penalty.
I have moved on and am not thinking about the ban that much. Get a new domain and start working. That is more sure than wondering if your site will ever come back.
There are search users as there are Adwords users. Both need to be kept reasonably happy in some way. If an adwords user can't get rid of their ads or gets them displayed on some bogus page they might not be happy. There is obvious spam and there is spam depending on opinion. Additionally mediocre content is the best for ads, as in being about the theme but NOT satisfying the user's need, making her/him more likely to search for alternative answers. That doesn't mean an excellent review can't be a good place too. It really depends. In my opinion a traffic interest ad is best on mediocre content, while a buying decision is best on an excellent review.
So would G not build something in like that to make adwords users happy too? One needs to get the balance right. And one could possibly construct an argument that sending a user to a meaningful and fitting ad also fulfills some sort of ethical guideline. Someone typing in "buy widget" has stated an intent or interest to buy. So why in God's name should you not send her/him to a page that includes what they are looking for? How is that so different from putting an ad on their search page, trying to get the user on tier one? If you can't get him on tier one, go to tier two: a reasonable quality website containing adsense and a review of some sort.
Manipulation is a general marketing thing we have in all supermarkets, be it loss leaders, how you are led through a mall or a supermarket, and so on. Being manipulated is normal. Part of it is of course not admitting it. Telling your users that you are manipulating them is a bad move marketing wise.
I don't expect Google to state anything else. More annoying is when you are confronted with non Google employees falling for it.
In a growing market there is obviously some room for going against the flow of conventional economics, but given the size and success of Google, I really doubt they somehow live forever in that ivory tower.
But given non manipulation of search results, consider this: Wikiclones are supposedly a no no. If I go to my Internet Explorer and type in a word, I get a definition _HARD LINK_ on Google to a publicly listed wikiclone with Google ads.
IMO you are led not on a Google website but to a second tier website with adsense on it.
So here we have proof of:
1.) Different quality guidelines depending on size of pocket
2.) Users are led intentionally to a second tier website with Google ads.
Google can do what it likes, but if independent webmasters fall for the marketing spin :/, I dunno.
I also have nothing against wikiclones as long as they are decent. After all, competing with WP means competing with a workforce that actually pays to work. So hooray to everyone fighting inflation of what you get for work. Or this new species thing they put up with taxpayers' money. As soon as a uni that works in cameras/travel and so on has the same smashing idea, I guess some people here might understand.
[edited by: mattg3 at 11:59 am (utc) on May 18, 2007]
Dials have more of a feeling of human intervention. It's more like a complex air conditioner that tries to self adjust to one (or several) target values. Automation is the Google way.
|So here we have proof of: |
1.) Different quality guidelines depending on size of pocket
2.) Users are led intentionally to a second tier website with Google ads.
Dang, so that's why I see so many Wikipedia articles in Google search results--so I'll click on AdWords when I go to Wikipedia. :-)
|Get a new domain and start working. |
Depending on the kind of site it's not always possible to just start a new site.
It takes me a few days to write a well researched article on my topic. If I present a widgeting pattern it takes several days to design the pattern, make the item, write a clear illustrated description on how to make it and then write an article on the history of the item. Every single page on my site is valuable to me. I can't just start new sites.
Mine is related to a hobby but I imagine the same is true in many other areas. For example a truly quality travel site, the kind where the writer actually visits and researches the area, gets information that is valuable to visitors, then takes the time to write a good article about it, will be in the same situation.
I'm sure there are many other examples. Add to that, other sites have linked to sites like this over the years, so losing even a few pages is quite a loss.
|Dang, so that's why I see so many Wikipedia articles in Google search results--so I'll click on AdWords when I go to Wikipedia. |
There is a hardlink even you can't discuss away. :) Nice try though. ;)
wow, i've read through this entire thread and it's given me a headache. But, some really interesting views and findings: I wonder if i could pursue just a few, if anyone is still willing?
"...Sometimes it's really hard to see what they might be considering borderline over optimization. For example earlier a section on my site on widgeting patterns was 950ed. I had all the widgeting pattern pages linked to each other in the navigation. My thinking was that if a visitor was interested in one widgeting pattern she would likely be interested in another. In hopes it would help get my 950ed pages back I took that part of the navigation out leaving just the link to the contents page for widgeting patterns. That seemed to have solved the problem as the pages did come back."
....but how about main site menu links - pretty much they will often end up in a closely coupled mesh as in the example above - except possibly in the choice of anchor text and keyword repetition: could it be that there is (as the patent suggests) information gain being quantified for the page, based on linked phrases: in annej's example a sum of zero - ie. google can't make any taxonomical or semantic sense of the resulting (semantic) mesh. it might be interesting to try other link network structures (eg. tree, steps) and see if this gives an opposite result.
"Another factor that I've been noticing on a number of -950 pages is the lack of a consistent menu block across the site. If the main navigational links are set into a clearly marked area -- some container such as a table cell, div or list, then the overall algo can pick out the navigation and not use those blocks in the pure phrase assessment of the individual document."
this would seem to make a lot of sense: google must have a way of removing the duplicate entries necessarily caused by having a consistent menu block - most sites use them, and dont spawn duplicate entries as a result (although simple spiders do compound such link clusters - ie. a search result for menu items on every page crawled) .
although on our own site we have barely started any sem/seo activity (non-cynically of course!), we have a consistent menu structure at top, plus in-page anchor text to specific terms, plus at foot we have a 'rolling walk through' sequence of links - the phrase window 'naturally' steps on one link or so at each page in sequence within particular user walk-throughs - with a few t-junctions, cross roads, etc as and when such choice might benefit the user. the phrases and words used in bottom links are generally related by (stem)-page-topic.
we don't seem to be doing too badly at present in SERP's and PR's (on many key words and phrases), although in highly competitive fields (eg. 'web design') we hardly figure in SERP's - age and authority for those given key phrases tend to beat us (though in truth we haven't really tried to compete here yet). logically my next step perhaps should be to set up a few 'quasi-identical' test sites and perform some 'elimination' trials on stuff like this.
It would seem to make sense that if certain site structures are penalised in the way alluded to in the foregoing posts, google might at least provide us with some definitive structural guidelines on the 'flaws' in their system, to this level of detail.
"Totally changed navigation for all the sections and pulled out internal footer links.
It didn't go +950, it just went PR0. Two older sites, same thing, as a matter of fact."
rc:: could you tell me what you mean by the acronym PRO (ranking?)?
"....with on theme external links ;)"
rc:: from what i read of the patents - that sounds a promising move, interested to know how it fares - can you isolate the effects sufficiently to determine its efficacy?
"1.) Different quality guidelines depending on size of pocket
2.) Users are led intentionally to a second tier website with Google ads."
absolutely rational - up to the point where anti-competition and pro-consumer lawsuits start hurting their bottom line from the other direction - they need to be continuously diligent in 'declaring' their own interests and those of their subscribers - serving the 'masses' is a necessary loss leader and the only thing that prevents their prime slot being taken over by one of the other lesser 'big' contenders.
the strategy i've chosen for IBL's is to study the linking strategy of my greatest SERP's competitors; i try and use only 'legit' business directories that they use - i also place link clusters in a 'page hierarchy of quality/authoritativeness'. so far i mainly maintain only reciprocal links, with directories, forums and customers (though ofttimes links are once removed via eg. technical specialisation entities we own/share interests with).
overall our strategy isn't directly seo oriented - but we do want our prospective customers to find 'us' and 'our flying widgets' rather than our 'known competitors widgets'. our original site was launched in around 1995 and was 'a bit crap' - we relaunched, permanently redirected old pages and submitted google xml around december 2006. we have a long way to go in getting some of our core products up in SERP's, whilst others (some of our own and notably some where we have joint ventures/campaigns with partner companies) seem to be getting off to a good start - though very early days.
this whole debate reminds me very much of the two major schools of approach to stock investment forecasting: there are the 'numericists' or 'oscillator geeks' (read SE algorithm reverse engineers) Vs the 'fundamentals' traders: Google very much wants to leave us only the second strategy as viable leaving itself in possession of 'sole secret knowledge' of what must ultimately become a non-linear, non-deterministic PR function (read information asymmetry). In practice however, we are left squabbling amongst ourselves over the marginal remnants of true PR as influenced by structured (deterministic) content/context.
...and along the lines of a 'discounted market', anytime we get a little too close to describing whats actually happening, then so eventually does the rest of the world; thus google are duty bound to change it all once again: it may get worse before it gets better again, but necessary information asymmetry is restored: a perfect 'moderated' market.
Does that make any sense?
So mattg3, how about telling us how the ADF** works. What factors are they looking for to pick up algorithmically to move out of the way?
**ADF = Adwords Diversion Filter