| This 39 message thread spans 2 pages |
|Some research on Google quality raters' behavior|
So, I decided to check which pages they looked at before banning (de-indexing) one of my sites. I was hoping they had managed to find something really, really bad on my site, and that if I could find it by following their tracks and then remove it, my subsequent reconsideration requests would be more successful. One such request has already been rejected.
The tool used in the study: awk [the-art-of-web.com]. Got some great samples from that site.
Anyhow, after much awking, I worked out the code that seems to be grabbing the very logs I'm looking for: a visit from the Plex (by IP) that was an actual browser and not a bot. Also, I used superclown2's idea from here [webmasterworld.com] that the raters are coming in on Macs.
So I ran this on my August logs (both ban and recon request were last month):
awk -F\" '($6 ~ /Macintosh/)' *.com | awk '($1 ~ /^(70\.90\.219\.|70\.89\.39\.|70\.32\.|64\.233\.|216\.239\.|209\.85\.|199\.87\.|173\.194\.|74\.125\.|72\.14\.|66\.249\.|66\.102\.)/)' > ~/google_visits_on_Mac_most_IPs.txt
(It should be a one-line command. Run it in your ~/access-logs directory; the resulting selection of human visits from Google will be in your home directory: ~/google_visits_on_Mac_most_IPs.txt )
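To turn that file into a readable timeline of who looked at what, something like the following might do. It is only a sketch: it assumes the logs are in the usual Apache "combined" format (client IP first, timestamp in the 4th whitespace-separated field, requested path in the 7th), and summarize_visits is just a name made up for this example.

```shell
# Sketch only, assuming Apache "combined" log format.
# summarize_visits is a hypothetical helper name, not part of any tool.
summarize_visits() {
  awk '{
    ts = $4
    sub(/^\[/, "", ts)   # drop the leading "[" from the timestamp
    print ts, $1, $7     # timestamp, client IP, requested path
  }' "$@"
}

# Usage: summarize_visits ~/google_visits_on_Mac_most_IPs.txt
```

Each output line is one hit, so a visitor who looked at a single page shows up as a single line for that IP.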
So, anyhow, I was able to see a visit roughly one day (approx. 25 hrs) before each of the events - the ban and the response to my recon request. Both were from the 216.239.x.x subnet, although earlier there were hits from other Google networks, too.
I was rather disappointed to see that before banning the site the rater visited a very drab and ordinary page on my site. It was not the smoking gun I was looking for, such as incriminating evidence of a hacker break-in. Also disappointing is the fact that they visited one page only. I can't tell how long they stayed on the page, but can you make such a drastic decision about a site of 400,000+ pages by looking at just one of those pages?
Probably even more disappointing yet is the way they treated the reconsideration request. A person came in and, indeed, only looked at a single page again. Only this time it was simply the homepage. My site is a forum, so the homepage contains pretty much only a list of the most recent threads - not much else to see there. At least the page they looked at before banning was representative of the layout (including ads layout which I hear they hate so much now). The only conclusion they could possibly have made by looking at the homepage and weighing my reconsideration request was that the site's still up. Apparently, that was enough to reject the request.
Anyway, hard data confirmed: your livelihood is in the hands of a typical overworked, disinterested American (IP geo) corporate employee. No surprise here...
Anyone want to fill in here about what raters are looking at on your site(s)?
It seems to me that something must be seriously wrong with the site if they completely removed it from the index. Are you saying that you have no idea what guidelines it violates?
After thinking about how the manual review process might work, my guess would be as follows:
Step 1. The Google algorithm detects a possible problem and refers the site for manual review, along with details about the problem.
Step 2. A low-level reviewer checks the site, determines that it probably should be penalized or banned, and passes this information to a high-level reviewer.
Step 3. The high-level reviewer makes the final decision about what to do and then implements it.
|something must be seriously wrong with the site |
Yeah, that's the thing: everything I do, I do in moderation. I was not able to find that smoking gun. I was also under the impression that something has to be seriously wrong for a site to be banned. But, looking at the pages the raters visited, they may be dull and uninteresting (especially if you're not into the subject matter) but nothing is wrong that I can see.
Anyhow, sorry I did not want to make this into yet another thread about my issues, there are enough of them already. One here [webmasterworld.com] and all the recent ones in my user profile, if anything strikes your interest...
Here I just wanted to see if other people can run this filter on their logs and report their findings, so that together we can perhaps build some idea of what they are normally looking for on a site and how they go about it.
Google does not strike me as a company that can spare two people for what seems such a mundane task. It's devastating to me but it's not even a blip on the radar for them, so I think we can safely assume that there is no second level of review.
It may also be simply impractical - I can relate to that myself - I moderate my forums and usually there's no time (or need or even desire) to second guess the click on that "Delete" button. I think I get it correct 99.5% of the time which on my scale maybe comes to one wrongly deleted thread a month. On Google's scale though it may be a different story...
No reason to think Google allows their people time to actually analyze what they see; it has to be a visceral response at that point in the process. One has to hope that the computer has done at least some analysis for them BEFORE popping that page up on their screen to be rated.
I remember spotting an AdSense manual reviewer in my logs before my first $100 payment (it wasn't difficult to track down; I just stuck a bit of PHP tracking code on my websites' legal/copyright pages, as they're seldom visited by 'normal' visitors and I assumed - correctly - that AdSense reviewers would head for the legal pages).
I didn't get banned from AdSense, so I guess I passed ;-)
(I knew I would as my sites aren't spammy and have minimal ads on them, so there was nothing anti-Google-TOS on them)
It's not great to think that Google inspections are only < 2 minutes in length (sounds like mine was longer than your webmaster inspection), but I guess the Google inspectors would have tonnes of sites to visit per day.
Suggy, you may not understand what's going on. There's plenty of people who experience greater sales on old, simple looking websites than on current designs. Try Googling "ugly sells" site:webmasterworld for threads.
Plus, I don't think sites get deindexed for style and layout. They get deindexed, I believe, for using spammy techniques to get ranked.
I appreciate the succinct info put together by the OP - it's great info. And if you're still lost, I'd suggest you find someone who knows about penalties to do a hand review of your site and give you some guidance.
Quality raters ban entire websites? That would be quite something. You are saying subjective opinions are now used not just for lowering rankings but for totally removing your website from the index, just on whether a quality rater likes the site or not? That would be extraordinary and I can't believe that's the case. Let's not forget that Google's nature is to automate as far as possible; banning on quality would fly in the face of that. It must be assumed that this site has been banned for abuse; I don't see any mileage in giving more subjective opinions on it. Perhaps start looking at off-site factors?
When I need life-saving heart bypass surgery I won't be selecting my surgeon by how nice his suits are; I will be asking how good his surgery is.
BTW, the site in his profile has not been de-indexed.
|Anyone want to fill in here about what raters are looking at on your site(s)? |
Do a search for "Google Quality Rater Guidelines" and you should find what you're looking for.
|BTW, the site in his profile has not been de-indexed. |
No, but it IS being filtered. Various online tools show a severe decline in that site's traffic. When a site is deindexed, you know you're in trouble. There is typically little to no recovery when that happens.
Wow, that's a great response I got in this thread all of a sudden. And I thought the subject matter was too specific and the thread had been abandoned :) ...
Anyhow, thanks for all your tips and critique (and I've yet to read the thread in its entirety) but the site in my profile is not banned and it is not the one I based the research on and yes, it looks antiquated because it is: I haven't worked on it in years, having been dealing with other projects.
So, please, let's not make it about the site in my profile because it has nothing to do with the subject matter.
And as I noted, you don't get penalized because your site looks like junk. Your harsh reality isn't actually reality.
And the site doesn't look like junk. It looks fine.
Just a thought...
If the Google reconsideration rater did only go to the home page when looking at the site for possible reinclusion, is it possible that they left so quickly not because of something they saw, but because of something they didn't see?
(I haven't been to your site so I don't know anything about it - but I am just wondering if maybe a LACK of SOMETHING might be contributing to penalties nowadays.)
This forum is not intended as a place for site reviews, and the site people are talking about is not the site that the opening post reports on. I've removed most of those comments.
Also note that I've removed several posts of a personally insulting nature. Keep it professional, please.
As I understand it, there are at least two different kinds of manual inspection that a site might get. One comes from a "quality rater" - this is the big army of people who are each given a set of search results and asked to rate the quality of that SERP, site by site. According to the patent, no one quality rater ever impacts a site's rankings, it takes a solid agreement from several.
The second type of manual inspection comes from an internal engineer, checking on a site that has been flagged by the algo for some reason, or perhaps from some input sent to Google by a third party. These manual inspections may well be looking for some particular element... and they also have access to all of Google's internal tools, so what you see in your server logs does not indicate the full scope of their information. They might well just be confirming that internally Google has the most recent version of the page in their internal cache.
As mentioned above, there's also AdWords and AdSense manual inspection - but that's a whole other department.
[edited by: tedster at 5:35 pm (utc) on Sep 10, 2011]
|Google does not strike me as a company that can spare two people for what seems such a mundane task. |
It's my understanding that the human quality rater system (the big army that inspects specific SERPs) uses multiple human editors for each SERP - and therefore for each site on that SERP. It takes an aggregated agreement to establish the "editorial opinion parameter" that gets integrated with the automated algorithm's output.
The less mundane task is for those situations where either an automated "suspected spam" flag gets thrown, or there's some other human input that alerts Google to the site for some reason. I don't know if these engineers have the power, as one single person, to generate a penalty. But my feeling from watching Google and their quality control pattern, is that any opinion needs approval from a supervisor in some way.
|So, please, lets not make it about the site in my profile because it has nothing to do with the subject matter. |
Understood. Does the site in your profile give us a good indication of how ad units may be placed on the site(s) in question? Is the content on the site(s) in question yours? Or, is most of it assembled from other resources? If I were to review one of these sites, how much above the fold is devoted to ad units whether it be AdSense or some other ad network? Do you have ad units in heat zones? In primary navigation areas like the top left nav? Are they Made for AdSense (MFA) websites?
Isn't that the normal review procedure for similar services/products? If you find something wrong, some error, etc., you deny/delete it. You wouldn't go on to look for something good to justify it, or check whether it is done in moderation.
Apple has the exact same review process for the App Store. If they find any errors/problems - even tiny ones - they just reject the app and you have to re-submit all over again.
|Do you have ad units in heat zones? |
Yes indeed. The heat zones straight from the AdSense playbook. Most banned sites also had inline ads that irk plenty of people, myself included. But we are still getting back to the same argument: someone visited the site, didn't like how it looked and banned it? Or, as the pattern of visits suggests, didn't like how one page looked and banned it.
Ads on most of the affected sites are by now removed either entirely or one unit left. Inline ads are gone, too. I guess we'll see in a year or so if that helped ...
Anyhow, as tedster suggested above, there are two types of review - AdSense's and the search quality team's. I have to admit that I know of no way to discern between the two, and therefore those visits to ordinary pages might be the AdSense people who, apparently, have no qualms about my sites. The ads are still happily running wherever I left them on. In other words, if the search quality team knows about a problem it sees in its own, already harvested data, they are not sharing it with me and I'm not going to see another hit on it.
So, this entire exercise might only be helpful if there was a way to separate AdSense review people from the guys with the big guns.
If there's no adsense on the page there is one less set of eyes with potential to ban your site. This is a HUGE(and not often discussed) negative for adsense since no other ad alternative has such a direct link to your search rankings.
- If adsense is not making you much money on a PAGE level it needs to be removed from said page.
- If adsense is not making you much money on a PER UNIT level that unit should be dispensed.
- If adsense is not making you much money on a PER UNIT/PAGE level some units need to be removed from said pages.
- If adsense is on your home page, good luck to you.
Sounds like a lot of work but the benefits (page speed, higher CTR/eCPM on remaining units, less data tracking by Google etc) are well worth the effort imo.
There was an ebook about the guidelines for Google's human raters, published as a .pdf on Maurizio Petrone's blog (it is no longer available) - try searching for it.
More details about the content of this ebook [searchengineland.com].
Also, you can post your web site on Google Forum to get help.
|Yes indeed. The heat zones straight from the AdSense playbook. |
Which in my opinion, and many others, may not be the right thing to do, especially if your content doesn't support that many ads in those types of positions. I mean, when I first land on your page, what do I see? Is it 75% ads and 25% navigation? And I have to scroll below the fold to find content?
I had asked...
|Does the site in your profile give us a good indication of how ad units may be placed on the site(s) in question? Is the content on the site(s) in question yours? Or, is most of it assembled from other resources? If I were to review one of these sites, how much above the fold is devoted to ad units whether it be AdSense or some other ad network? Do you have ad units in heat zones? In primary navigation areas like the top left nav? Are they Made for AdSense (MFA) websites? |
You answered one of those questions related to ad units in heat zones. What about the other questions? They are an integral part of determining whether or not you can play by the AdSense Rulebook.
|Ads on most of the affected sites are by now removed either entirely or one unit left. Inline ads are gone, too. I guess we'll see in a year or so if that helped... |
When did you remove them?
If someone from Google visited your site, twice, and only visited one page and left, that may be a strong indication that "what is above the fold" didn't work for whomever visited. I hit plenty of those sites daily. You know, you do a search, click a result and the first thing you see are ad units - everywhere. Back button! And Google records that action. Just how many "back button" actions do visitors to your site take? What's your bounce rate? Time on site?
I've spent some time reviewing past topics that you've started that are all related to your site(s) being penalized, pandalized, etc. It surely appears that you've got a Made for AdSense network that doesn't meet the 2011 Search Quality Guidelines.
1script, why not clean the template and leave one, yup, just one block of AdSense in the middle or bottom of the page?
I do very well on a site I have with just one block.
We did remove adsense from one site and nothing has changed so far...the site wasn't using any other ads :(
Thanks for the time you've spent browsing my sites.
|I've spent some time reviewing past topics that you've started that are all related to your site(s) being penalized, pandalized, etc. It surely appears that you've got a Made for AdSense network that doesn't meet the 2011 Search Quality Guidelines. |
I did not answer your previous question about MFA because the question itself frames the discussion in terms that make me look like a common criminal and a thief and that's not really a way to advance any discussion here. I think that my sites provide valuable service that thousands of people use and describing them as a "network", which too comes with a connotation of trying to game the system, is not fair - they are independent and any links between them are either caused by overlapping themes, an oversight or even mistakes on my part. Indeed, some of the links between the sites that Google shows in WMT I cannot even explain (as you've seen in my other thread)
I derive income from (most) sites that I run and therefore you can simply assume that they are MFA - Made For Anything-that-pays-enough-to-support-them because this is not a venue to discuss each site and the actual reasons why I started the site. And yes, some I started with a hope that I will be able to make some money and not much other consideration. Those are clearly my failed projects. Most of those are shut down by now.
I understand that a Google reviewer, having what seems like just a couple of seconds to decide, may also see the site that way and that's definitely an issue I need to address. Call me naive but I was under the impression that the AdSense Team sending me messages enticing me to use more ads above the fold would have somehow coordinated their advice with Google's own Search Quality Team, but the more I read and post on this forum the less that sounds like a fact and the more like wishful thinking.
In any case, I know a great many good sites (most of the software-related forums, come to think of it) that have not only ads above the fold but a whole-page takeover ad as well, and that does not lead to their ban. But I digress, this discussion was not supposed to be about the cause of the ban.
Ban or not, I thought it would be interesting to track the reviewers as they peruse one's site, and the OP is one way to do it. If you know of a better way, I'd be interested to know it, too.
Just had an interesting datapoint added to my little research, compliments of Google. So, I'll share here in case someone's still interested in the subject of Google's quality raters.
4 of my reconsideration requests got returned today, all denied ("some or all of your pages still violate our quality guidelines"). All requests were submitted at different times over the course of almost a month. All notifications are boilerplate-identical. All notifications date-stamped with the same time, i.e. they came within a minute from each other.
Of the four sites involved:
- One no longer exists(!). I decided it's not worth the trouble and pulled the hosting account five days after sending the recon request. Funny how they say "still violates the guidelines". Whatever the violation was, it's gone.
- One has been extensively combed over and prettied up, features added, ads reduced, speed improved.
- One has been left virtually untouched (I've found an outstanding DMCA against it, removed the offending content thinking it could be the reason) but ads have been reduced.
- One was untouched but all types of ads completely removed.
None of the three still working sites registers a hit from a Google network from a non-bot since September 1st which for some sites is longer than the response time. They obviously didn't see the dead site - there's nothing to see.
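One way to check for gaps like this is to count the filtered hits per calendar day and look for the days that are missing. The following is a rough sketch, not a polished tool: hits_per_day is a made-up name, it assumes Apache "combined" format with the timestamp in the 4th whitespace-separated field, and it assumes the input is in chronological order (one site's log at a time; with several logs concatenated the same day can show up more than once).

```shell
# Sketch only: count hits per day in an already-filtered log.
# hits_per_day is a hypothetical helper name.
hits_per_day() {
  awk '{
    day = substr($4, 2, 11)      # "[10/Aug/2011:13:55:36" -> "10/Aug/2011"
    if (day != prev) {
      if (prev != "") print prev, n
      prev = day
      n = 0
    }
    n++
  } END { if (prev != "") print prev, n }' "$@"
}

# Usage: hits_per_day ~/google_visits_on_Mac_most_IPs.txt
```

A day that never appears in the output had no matching hits at all.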
So, I'm drawing these conclusions:
- They don't actually look at the sites (1 minute is not much for 4 sites). My best hope is that they go by some parameter prepared for them by the algo. In which case, why the heck do they have human reviewers?
- They bunch together all requests by the same webmaster, which makes me think "trustworthiness" of the webmaster weighs heavily on the decision
- If anyone is willing to give them the benefit of the doubt that they actually look at anything, they completely rely on the cached copy and possibly even a preview snapshot. Otherwise there would have been hits on images, unless they are not interested in the looks and usability but strictly in content and HTML. From now on all my future websites will have <meta name='robots' content='noarchive'/> on all content pages. Otherwise you let them judge you by the content of a cache that was collected at an unknown time, very likely BEFORE you made any changes - when a site gets banned, the bot's activity goes way down and so, I assume, does their ability to refresh the cache.
- The number of ads by itself is not a factor - ads get a really bad rap here lately
Did I miss anything?
Most everything you changed was not a guideline violation in the first place - with the exception of the DMCA complaint, perhaps. If the reviewer has an indication of what the violation is, it wouldn't take very long to see it wasn't changed - and if they wait after the Reconsideration Request until the cached copy on the back end is more recent, they might well depend on that.
The deleted site, however - that one's definitely a hoot.
If I have to guess, I am going to go with the type of site. If you got flagged for having a non-desirable-by-Google type of site and is written on their screen, it takes them 5 seconds to see if it's still the same type of site or not.
|Most everything you changed was not a guideline violation in the first place - with the exception of the DMCA complaint, perhaps. If the reviewer has an indication of what the violation is, it wouldn't take very long to see it wasn't changed - and if they wait after the Reconsideration Request until the cached copy on the back end is more recent, they might well depend on that. |
I guess it's possible. But how would you tell the site's still "same type" if you haven't seen most (or any) of it? Whole sections could have been added. Googlebot is largely dormant, so you wouldn't know that from the cache or the index, and you'll have to look. And if you don't, what's the point in reviewing the request in the first place?
|If you got flagged for having a non-desirable-by-Google type of site and is written on their screen |
Anyhow, it was just very curious to me that they did not want to see the actual site as it currently is (no non-bot hits from Google network since before the recon. request). The whole point of reconsideration request is to make changes before submitting and if you haven't seen a changed site, you're just going to automatically reject the request.
Or am I just starting with the wrong premise that the Google quality raters are coming from Google's network? I guess it's entirely possible that they are all telecommuters and come on from their home ISP nets?
What if they take a quick snapshot right before? Who knows.
I think the reviews are done in Mountain View.
See if there are any hits on the page "this_page_should_not_exist.fake"; then you have the IP. It seems they sometimes check for a proper 404 response using this filename.
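A quick way to look for those probe requests might be something like this. Again it is just a sketch: the filename is the one mentioned above, find_probe_hits is a made-up name, and the field positions (IP in field 1, timestamp in 4, path in 7, status in 9) assume the Apache "combined" log format.

```shell
# Sketch only: list requests for the 404 probe filename.
# find_probe_hits is a hypothetical helper name.
find_probe_hits() {
  awk 'index($7, "this_page_should_not_exist.fake") {
    ts = $4
    sub(/^\[/, "", ts)    # drop the leading "[" from the timestamp
    print $1, ts, $9      # client IP, timestamp, HTTP status returned
  }' "$@"
}

# Usage (from the access-logs directory): find_probe_hits *.com
```

The status column also shows whether the server answered the probe with a real 404.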
I want to see the site!
Review My Site
I'm tired of guessing. One look at the site and I'm almost certain that I, along with others, can pinpoint the challenges. I'll devote an hour of my time to a review. You just have to do your part.