| This 201 message thread spans 7 pages. |
|eval.google.com - Google's Secret Evaluation Lab..|
"Rater Hub Google" Rumours?
On Apr 19, 2003, some members had spotted referrals from
followed by a question number and a couple of different email addresses. You can read earlier threads here,
According to some [slashdot.org] sites [searchbistro.com],
|It's one of the best kept secrets of Google. It's a mystery on Webmasterworld. Also in Europe (France) they don't know what to expect from that odd URL [eval.google.com....] Click it and you get ...nothing. The site reveals itself only if you have the proper login and if you use a network known by Google. Residues of Eval.google are found on the web, but the full content of the mystery site has never been published before. Here it is: the real story about Eval.Google. They use... humans! |
The site claims it is some kind of secret Google evaluation lab!
There are a lot of webmasters from your country that run superspam pages with hidden text and all the black-hat SEO techniques, and rank top 10 in Google. What is your comment about that?
Hate that. We expose them every month. Here is an <DELETED> of one of these deceiving firms. It's in Dutch, but you will get the picture.
[edited by: engine at 2:25 pm (utc) on June 7, 2005]
[edit reason] TOS [webmasterworld.com] 28 [/edit]
Perhaps this document explains the odd result I found a month back. I noticed a site that has k1 k2 k3 k4 k5 k6 k7 on nearly every page in the comment field, yet when you search for k1 k2 k3 k4 k5 k6 k7 you don't find the site, only sites that mention it, link to it, scrape from it, or affiliate to it.
I assumed it was some sort of duplication/mathematical filter because it was not a search result anyone would consciously choose. MSN & Yahoo both correctly placed it top for that phrase.
Perhaps it was evaluated to have k1 k2 k3 k4 k5 k6 k7 as hidden text (in the comments field of the page) and so was penalised for phrases based on how close to k1 k2 k3 k4 ... the phrase is. Another possible answer for that oddity.
|Perhaps it was evaluated to have k1 k2 k3 k4 k5 k6 k7 as hidden text (in the comments field of the page) and so was penalised for phrases based on how close to k1 k2 k3 k4 ... the phrase is. |
I can't say for certain, but I don't think it's a penalty, or that it is considered hidden text
If I was writing a search engine, I would ignore comment fields, just like a browser would.
|I would ignore comment fields, just like a browser would |
Apologies, I meant the description metatag.
I don't get some of you. It's an algorithm. Sure it is. But that algorithm is made by somebody. That somebody can make many slightly different ones or make algorithms that have parameters that need to be determined before you can use them.
Somehow a choice has to be made as to which algorithm is going to be the next on the production sites. Somehow this needs to be evaluated. I'd be more than happy to see humans doing (part of) that evaluation.
No big news, and actually just a confirmation they do the right thing.
As for publishing copyrighted stuff without permission: all of us webmasters should be 100% against it in all cases. Journalist or not doesn't matter. Not that I think bloggers are journalists by default to start with. If bloggers think they are exempt what differs them from scraper sites? It's just a copyright infringement. The goal does not justify the means.
|Got a load of unpublished documents with (C) in it. The one I published hadn't any (C) in it. Perhaps the agents removed them before they send them to me. |
Copyright is automatic and complete: you do not need to claim it to get it and it protects all rights. You need to reduce your rights as an author in order to allow people to use the document for the purpose it was intended.
|As for publishing copyrighted stuff without permission: all of us webmasters should be 100% against it in all cases. |
Google takes a copy of every site without permission.
Are you for or against that copyright infringement?
Whistle blower in evil company reveals secret evil plans that are copyrighted.
For or against?
The copyright question is more complicated than your response makes out.
Copyright was also a side issue here, if the reporter rewrote it so that the meaning was the same but the words different it would not be a copyright infringement because it would not be a copy. But Google guy would still be unhappy.
Did you read it? Was it of public interest?
|It says: "As you see, pages with the same content may be assigned vastly different ratings based on the absence or presence of a ppc program." |
Need I say more?
Yes, you do need to say more. You are again omitting the context of the sentence to make it fit your pet theory. Here is the entire section:
|Google does not encourage creation of duplicates, so we are asking you to mark such result Offensive. Of course, had the result been a page on the Open Directory itself, it would have to be rated on the merits to the query. As you see, pages with the same content may be assigned vastly different ratings based on the absence or presence of a ppc program. |
The issue in the cited example above is DUPLICATE CONTENT, and NOT the PPC itself.
Mikenolastname, you are clearly twisting the document to make it fit your personal theory. You cannot pull one sentence out of context and use that to retroactively fit your theory. The statement has to be seen within the full context from which it originates.
The original context of that second statement you quoted has to do with duplicate content, which speaks to the heart of what I noted in my first post, which is that Google is judging spamminess according to how useful the content is. Duplicate content is not useful.
The quote you pulled does not support your contention but reinforces the points I raised in my original post, which follows below.
|mikenolastname said: This other factor was pleasantly provided by the eval doc which says (read it closely) spamminess is determined by a number of factors, including... if all else is present is the obvious presence of PPC's! Adsense is considered a PPC, no? |
You are grossly misreading that document and twisting it to fit your personal theory. Here is why your theory is incorrect.
|This is what the document actually says: |
Secondary Search Results / PPC
We want to mark as Offensive the pages that are set up for the purposes of collecting pay-per-click revenue without providing much content of their own. You will see such cases most frequently in conjunction with “search results” feeds.
It's not the fact that the page is showing PPC, it's the fact that they are showing the PPC ads within the context of secondary search results with NO REAL CONTENT apart from those feeds.
The defining characteristic of spam, according to the document, is that it has no real content.
Clearly it is not the fact of having AdSense in itself that would trigger an evaluator to flag a site, but the fact that there is no real content beyond advertising.
This fits perfectly within Google's historical goal of showing sites with content that is useful and what that document shows represents no change in policy at Google. In no way does it demonstrate that a site showing AdSense will be demoted, penalized, tagged as spam, or whatever you want to call it.
[edited by: martinibuster at 4:40 pm (utc) on June 7, 2005]
Our problems are very consistent with MikeNo's theory and he correctly notes that the document states that adsense could easily tip a site from "good" to "offensive".
Obviously they'll make the computerized algo consistent with the documents stated objectives - even if the info is not "directly" used to rank sites. To Google's credit the (public) webmaster guidelines are very consistent with the (formerly secret) ranking guidelines.
I think GG implied earlier that these ratings are used to tweak the algo rather than adjust rankings directly. In addition to the WebmasterWorld community knowing that he's an honest fellow this is all consistent with the problems we still see - cleverly automated spammy sites often rise above legitimate ones, and even a site cited in the guide as a "fine example" of a good quality site suffered greatly in a recent update (that site has now recovered).
I WISH they used more input from the people rankings instead of automated ones, which are easier to manipulate.
|and he correctly notes that the document states that adsense could easily tip a site from "good" to "offensive". |
No, no, no. Read the document closely. That section is in reference to duplicate content pages in the context of secondary search result pages. The statements you are referring to are in reference to secondary search result pages with duplicate content. In no way is it relevant to normal pages. The whole section is about Secondary Search Results.
The document notes that a duplicate content page (and that section is about duplicate content within the context of secondary search results) will be deemed offensive by the ppc component. That section is not talking about original content pages. It is dealing exclusively with duplicate content, i.e. poor quality pages.
If you read the document it mentions EFV's site as having a strong commercial component, but the quality of the content is what matters. The presence of AdSense made no difference. It was not a "tipping point" as you say. It's all about the content.
The focus of the document, especially in the sections that Mikenoname is referencing, is the quality of the content itself.
It's like a woman who falls in love with a man because he has a wonderful personality, is witty and humorous, and it happens that he has wavy black hair.
Mikenoname's theory seizes on the wavy black hair as the reason the woman falls in love, which is just plain silly.
[edited by: martinibuster at 4:59 pm (utc) on June 7, 2005]
"secondary search result" is a specific type of result outside of normal SERPs?
I assumed Mike is looking at the top line "... without providing much content of their own" (very subjective). Rather than below examples of "duplicate" content.
It's an important distinction because people (often violently) disagree about what constitutes duplication / plagiarism / originality / spam.
I assume our site may be a victim of this subjective evaluation problem since we are a combination of original writing, general databases, and IAN affiliated hotel pages. In our case it seems adsense could have been the "tipping point" as Mike suggests.
(edited for general confusion and insanity purposes)
|even a site cited in the guide as a "fine example" of a good quality site suffered greatly in a recent update (that site has now recovered). |
Actually if you've been following along - that site suffered a 75% loss in G Traffic in late March.
The fact that his G traffic was reduced even further at the beginning of the Bourbon update is likely related to whatever caused the initial problems.
Since EFV is cited as an example of a quality site with affiliate links, it is unlikely that they would turn around and decide to mark it an "offensive" result.
The theory that the site owner expressed about his ranking troubles, if I recall correctly, was something to do with duplicate crawling of the WWW and non-WWW domains, a simple enough fix which he has since implemented.
The fact that his site recovered during "phase 2" of the Bourbon update without making insane changes (removing Adsense, for example) pretty much proves that.
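For readers unfamiliar with the WWW versus non-WWW problem mentioned above, here is a minimal sketch of the idea: every URL on the bare host and on the `www.` host is mapped to one preferred form, so a crawler sees a single copy of each page. The host `www.example.com` and the function name `canonical_url` are made up for illustration, and this is not a description of what EFV actually implemented. In practice the same mapping is usually done with a permanent (301) redirect on the web server.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url, preferred_host="www.example.com"):
    """Rewrite the www and non-www variants of a host to one preferred host.

    Assumption: ports and user info in the URL are not handled here;
    a URL on either variant simply gets the preferred host as its netloc.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Derive the bare domain from the preferred host.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        parts = parts._replace(netloc=preferred_host)
    # URLs on unrelated hosts are returned unchanged.
    return urlunsplit(parts)
```

With this, `http://example.com/page.html` and `http://www.example.com/page.html` both come out as the `www` form, which is the effect a server-side redirect would have on the crawler.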
|but I think Mike's looking at the top line "... without providing much content of their own" |
Unfortunately Mike is not looking to that part. He's focused on the ppc aspect to the exclusion of everything else around it.
joeduck, you raise a good point about the subjectiveness of what constitutes good content. Does bad grammar or misspelling constitute that? Many of these evaluators were college students. They have many cultural differences from a little old lady in North Carolina writing content about things to do on a Saturday night. Will the evaluators, based upon their cultural differences, consistently score her content with low marks?
Nevertheless, the evaluator thing was a passive thing for judging algos, and not for slamming specific websites, and absolutely not for penalizing specific web pages with sniper-like accuracy (the hole from which that theory was pulled out of the eval document is anybody's guess).
In other words, whether it's subjective doesn't have a bearing on the ranking of a site, so it's a moot point.
[edited by: martinibuster at 5:13 pm (utc) on June 7, 2005]
I agree with Martini in what the guide is saying, but the problem is that it really ends up being up to the rater and his/her interpretation of the guide.
We know what search feeds are as we see that all the time (and possibly use them ) so we understand what the guide is talking about, but can we be certain that the rater is going to have the same knowledge? Will they understand the difference between a page utilizing Adsense for additional revenue and a search feed site?
I'm playing a bit of devil's advocate here as I think the difference is generally obvious, but it's something to think about.
> that it really ends up being up to the rater and his/her interpretation of the guide
But that's the trick - it is never up to *one* rater. As you can see on the "deep throat" site, there are multiple raters for SERPs - and that they can even see what another rater thought about the site.
This is helpful in two ways. If you see adsense and immediately rate the site as "offensive", and 5 other raters have voted "useful", (I assume, from the graph) you could return to re-review the site/serp and make a revised decision. Also, I am sure that in borderline cases, seeing 4 useful votes or whatever is probably going to make someone lean towards voting the same way, unless they spot some unsavory aspect that no one else noticed.
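The point about multiple raters can be illustrated with a small sketch. Nothing in the leaked document, as quoted in this thread, says how Google combines ratings, so the majority tally below is purely an assumption for illustration; the labels "Useful" and "Offensive" are taken from the thread, and the function name `summarize_ratings` is made up.

```python
from collections import Counter

def summarize_ratings(ratings):
    """Tally rater labels for one result and report the majority label.

    Assumption: a simple majority count. On a tie, the label that was
    seen first among the tied ones is reported.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    counts = Counter(ratings)
    top_label, top_votes = counts.most_common(1)[0]
    return {
        "consensus": top_label,
        "agreement": top_votes / len(ratings),   # share of raters who agree
        "dissenting": len(ratings) - top_votes,  # raters who chose another label
    }
```

In the scenario described above, five "Useful" votes and one "Offensive" vote give a consensus of "Useful" with one dissenting rater, which is the kind of signal that could prompt that rater to take a second look.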
To be frank, I think that the Google spam guidelines are pretty fair and open minded - I wish that DMOZ's guidelines were as fair :p
Although the section is entitled Secondary Search Results/PPC, half of that title is just "PPC". The first sentence states quite generally that it applies to "...[ALL] pages set up for the purposes of collecting pay-per-click revenue without providing much content of their own...". I have NO idea where Martini got the idea I was EXCLUDING ALL ELSE BUT PPC. Look at EVERY SINGLE ONE of my prior posts which EMPHASIZE that there ARE OTHER FACTORS BESIDES JUST PPC.
In this case SSR feeds are only ONE specific, easily illustratable, example they are using. It does not say it is limited only to directory sites. In fact the second paragraph specifically INCLUDES "[ANY] copied content from a legitimate credible source." This leaves it WIDE open to pages simply not providing MUCH content of their OWN [at the discretion of the rater]. It leaves all others open to the subjective interpretation of the rater. That is also consistent with why at the end of the fourth paragraph it GENERALIZES again and says "As you see, pages with the same content..." instead of "As you see, duplicate pages with the same content..." or "As you see, SSR pages with the same content...".
My ORIGINAL assumption WAS THE SUBJECTIVITY of the rater to determine what was considered as a page set up for the "purposes of collecting PPC". If you notice in one of my prior posts I mentioned that in OUR CASE the rater apparently MISTOOK our content as either copied since it does list a number of links with short descriptions, or copied because it has been copied by scrapers. Or, to add, perhaps because in many cases the ad text PROVIDED by advertisers does reflect their website content, or perhaps he thought our internal links to a separate domain (of our own) were some affiliate or PPC, or perhaps the ill-trained employee just so happened to be brain-dead, or who's to say an employee didn't understand the context of the doc as well as some of the more experienced folks on this thread? For instance look at the wide array of options they present under the heading: "What you'll see on the result page" (notice here "result" is being used in most other sub-sections as well and not to refer specifically to SSR pages), it says "Or the page may look like the top-level page of a legitimate directory (tree structure)...". What is a "legitimate directory"? Especially to an inexperienced rater? That covers at least a FEW pages on just about anyone's site right there! Does it appear they present the rater with more than a few pages from each site? What makes you think the student/employee had a clue what directories they were referring to? The doc doesn't list what legitimate directories look like. The entire last three paragraphs of the section talk strictly about PPC and affiliate inclusion, so I would say the focus is more on PPC/Affiliate inclusion than SSRs.
Another interesting thing I noticed: The Vital/Useful ratings. Wonder how you get/lose those?
But, you know folks, I'm tired of this unproductive bickering. I have better things to do. I've shared what I think I "discovered" and connected. I don't expect everyone to discard their long-entrenched beliefs and conceptions. If you're not satisfied with my observations and proposed explanation, please feel free to go and find your own connections, your own commonalities. I haven't seen any better ones proposed thus far.
> I mentioned that in OUR CASE the rater apparently MISTOOK our content
I find all this very interesting Mike.
Could you post the appropriate referrer strings from your logs that will show the movement of the rater or raters through your site?
I'd be interested in seeing things like how long he/she/they spent on your site, how many raters came, over what time period (ie within minutes / hours / days), and any other info from your logs that you feel is pertinent.
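For anyone who wants to pull this kind of information out of their own logs, here is a rough sketch. It assumes the server writes the common "combined" access log format, and it only catches hits whose referrer contains `eval.google.com`; once a rater clicks around the site, later hits carry the site's own URLs as referrer and are missed. The sample lines, the IP addresses, and the referrer path `/placeholder` are invented for the example, since the real referrer strings are not reproduced in this thread.

```python
import re
from collections import defaultdict
from datetime import datetime

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def rater_visits(lines, referer_host="eval.google.com"):
    """Group hits referred from referer_host by client IP.

    Returns, per IP, the number of hits, the first and last timestamp,
    and the seconds between them.
    """
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or referer_host not in match.group("referer"):
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        visits[match.group("ip")].append(when)
    summary = {}
    for ip, times in visits.items():
        times.sort()
        summary[ip] = {
            "hits": len(times),
            "first": times[0],
            "last": times[-1],
            "span_seconds": (times[-1] - times[0]).total_seconds(),
        }
    return summary

# Invented sample data, for illustration only.
SAMPLE_LINES = [
    '192.0.2.10 - - [07/Jun/2005:14:25:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "http://eval.google.com/placeholder" "Mozilla/4.0"',
    '192.0.2.10 - - [07/Jun/2005:14:27:30 +0000] "GET /about.html HTTP/1.1" '
    '200 2048 "http://eval.google.com/placeholder" "Mozilla/4.0"',
    '192.0.2.99 - - [07/Jun/2005:15:00:00 +0000] "GET / HTTP/1.1" '
    '200 100 "http://www.example.com/" "Mozilla/4.0"',
]
```

Calling `rater_visits(SAMPLE_LINES)` reports one IP with two hits; counting distinct IPs is only a crude stand-in for counting raters, since several raters may share an address.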
Just had a nice slow read of the (stolen) Google documents, and I can't say that I found anything surprising. I can't say that the existence and purpose of eval.google.com is particularly surprising either.
The only other way G could evaluate one possible algo vs. another would be to put it into live search results. I think they learned how dubious that idea is, back in November of 2003.
Is the sky falling again? Wake me up when the sun still rises tomorrow...
What these docs seem to be spelling out are the apparent weaknesses (at least at the time of publication - and most likely still, in large part) of the algo(s).
I don't get the impression of direct human intervention being carried out, rather, either the delegation and investigation of submitted spam reports to paid staff, a "before and after" snapshot "between dances", or the flat "admission-by-inference" that automation alone cannot detect these techniques (yet!).
I mean, everything reported - apart from perhaps "white-on-white" text, which is very highly probable to be spam - has (from memory) grey areas attached which automation alone cannot distinguish. Hidden text, for instance: is it spam, or a means to save the speed-reader time, à la MSDN, which commonly uses hidden text for hiding code examples?
[I just thought I'd drop that in 'cos as of page 5 everyone seemed more concerned with the rights & wrongs of human tweaking - Apologies if this has been raised in the interim but I guess not many will be reading - what's this? page 9 now? - this anyway!]
I agree whippin - I accept GG's word that it's an indirect rating process, but clearly they see it as needed because the algo sometimes (often?) fails to ID sites properly. I don't fault Google for sometimes failing but I do fault them for having poor/slow/no remedies for unfairly punished sites.
As the 800 pound gorilla of search Google has an obligation to communicate effectively with the web community. I fear they are failing to do that, esp. with respect to severely downgraded or unfairly punished sites.
|As the 800 pound gorilla of search Google has an obligation to communicate effectively with the web community. I fear they are failing to do that, esp. with respect to severely downgraded or unfairly punished sites. |
I'm having a hard time agreeing with that statement. In fact, I'm having a hard time believing anyone would even make that statement (unless the motivation is to spur a personal review).
In light of GG's extremely involved participation here with the Bourbon update, in which he answered many questions, including where you can send your feedback to, it's quite unfair to point a finger and say it's not enough.
You might not have received personal attention but Google has authorized the most extensive webmaster outreach of all the search engines.
Dude! Standing by the statement. I agree that Googleguy's openness and participation is a great resource and he appears to be an excellent fellow as well.
Certainly I could be wrong about Google's non-responsiveness though you apparently have had more informative support emails than I have, and you are interpreting the current crop of search results more favorably than I and many others in WebmasterWorld.
I'd need more insider info to know if the support emails reflect a serious attempt to answer an ocean of irrelevant or complex questions or are just damage control for increasingly problematic search results.
What's your experience with G support replies? I find Google support spookily like MSN - cryptic replies that often just state the obvious.
Well said Martini.
Google does not have to make an effort to work with webmasters.
Yet - they participate in forums, have detailed webmaster pages, have an accessible "AdWords Professional" program, etc. When is the last time you saw “Overture Guy” participating in an online forum?
Do they do this because they are good people? Probably not. I'm sure they are good people, but they are also smart business people.
They recognize the long-term benefit of collaborating with online professionals, rather than putting up a brick wall. They know that we’re the middlemen between advertisers and their network, and without us, it would be more difficult to educate people about their offerings, and assist them in taking advantage of them.
I see a lot of emotion on this board, and a lot of expectations of superior customer service from Google to support webmasters on an individual basis. That is just unrealistic.
Those who have been in this game for a long time understand and appreciate Google's efforts. Sure, it’s OK to push and prod them, especially if you think they have been dishonest or have misled people, but despite all the strong claims, I just have not seen ANY evidence that Google has done anything dishonest.
Speculation, even reasonable speculation, is a far cry from “facts.”
>>>though you apparently have had more informative support emails than I have
Maybe, maybe not. But if you read past update threads it's clear that Google looks at the solicited emails that come in with specific subject headers per GG's request. It's update feedback they want to hear, the results of which apply across the range of serps.
It is widely known that GG has stickied with many members in the past who have seen anomalies in the serps in order to identify collateral damage or a bad tweak. It's out in the open, everybody knows this.
As far as sending unsolicited emails to Google about your individual website goes, I think it's a little unrealistic to expect answers, given the resources needed to do that versus any positive results for Google serps that are going to result from it.
Think about it for a second. Look at how bad the dmoz situation is in terms of editing mounds of garbage that people keep submitting (evidently in good faith belief that it belongs in dmoz). And dmoz has tens of thousands of editors and they still can't keep up.
Can you imagine how many people must email Google with their sites asking why they MAY be penalized? How many people do you estimate are simply imagining they are penalized when in fact they are suffering from a bad case of bad-ranking-itis?
It's unreasonable, in my opinion, to expect Google to have hundreds of people doing site reviews for every webmaster who thinks they aren't ranking where they should be. The reason is that there is no positive systemwide effect on the algo from doing that. If dmoz, with its tens of thousands of volunteer editors, is groaning under the weight of its workload, do you really think Google has the extra cash to pay tens of thousands to do site reviews every day for bad ranking webmasters?
>>> and you are interpreting the current crop of search results more favorably than I and many others in WebmasterWorld.
I know you're going to think I'm being flippant, but I'm not- this is the way I feel: You win some, you lose some. I win some battles against Google and I lose some. I don't believe I have ever met a webmaster that clawed their way to the top consistently and held onto all their spots for all time.
"I find Google support spookily like MSN - cryptic replies"
I don't agree with that. In fact I admire Google support, though during this update my site lost rankings, from page 1 to page 30, and went from 9000 day pageviews to 2000. I wrote to Google and got a very detailed answer. Google support made me understand what the problem was. I think I know what the problem with my site was, but at the moment I can't do anything to fix it. Other webmasters, members of this forum, got their sites back by crying and begging. I will not do that. I will wait until the next update. As Google support told me, if your site has enough links pointing to it, in our next reindexing your site will be back, and I have over 2000 one-way links from sites PR9 to PR5. I guess this post should go to the Bourbon update forum, but I guess it is relevant here too.
9000 day pageviews to 2000. I mean per day.
>> The fact that his site recovered during "phase 2" of the Bourbon update without making insane changes (removing Adsense, for example) pretty much proves that.
With all respect... not algorithmic evidence. Given the vocal nature of EFV combined with the human perceived quality of the site, this could have been a hand tweak where algorithmic penalties were reset.
(Ack.. had promised myself not to get dragged into this thread.. but this statement made me do it. Sorry!)
EFV shows the way... LOL. I wonder, does the guy have a lot of insiders in Gplex or a lot of money gambling in the NYSE at GOOG?
Your note is well reasoned and I don't think you are flippant about the issues relating to Google response - I might have responded to myself exactly as you did before Feb 2 and before reading the many support mails stating we have no penalty, yet clearly we have been relegated to "omitted results" for almost all relevant searches.
I also admit that *some* of this is "sour grapes" by me because our site was hit big time after Allegra (50k to 1k unique visits daily after years of high regard by Google and extensive addition/editing of information).
I'll stop whining as soon as legitimate competing sites come up ahead of us rather than those set up simply to display others' content.
Would you share the thing G support told you caused your problem?
Google does different things at certain points in the update cycle.
EFV's website got split and it takes time for that mess to get cleaned up.
Ask g1smd about split sites etc, I think that he probably had his ticker skip a few beats when he saw all those supplemental entries enter the system again.
I know at least two others that are kind of spaced out about that exact situation right now.
Now, I've had enough for one day and am invoking rule 4 early.