| 2:52 pm on Aug 12, 2006 (gmt 0)|
|If G's aim is to eliminate SPAM, then working closely with validated webmasters could strengthen the process of identification and monitoring. |
Indeed! In my process of trying to defend my site against plagiarism, I routinely come across faux blogs, faux directories and sub-domain spamming sites that have scraped and mashed the content of my pages simply to feed to Google. If I thought it would do any good, I would be willing to report these sites on a regular basis. Reporting these sites, however, would take a great deal of my time and I don't want to spend my time on doing something that will do no good.
Legitimate websites and Google have the same vested interest in getting rid of the SERP spammers. It would be wonderful if there were a way we could really work together to defeat this problem (or at least keep it in check).
| 3:55 pm on Aug 12, 2006 (gmt 0)|
Thanks googleguy, that is a good point about other search engines. I have flipped the switch for half of our sites so far and it seems to be working. Serps have gotten better for those sites.
| 4:50 pm on Aug 12, 2006 (gmt 0)|
|While I can understand not wanting to provide this all the time, for legitimate webmasters, it would be very helpful; as we could be being penalized for completely innocent reasons. |
But what if you aren't being penalized? What if there's simply a problem with one of Google's algorithms or filters, and you're a victim of collateral damage? Would Google be able to identify and report the problem to you automatically? I'd assume not, since--if they could do that--they probably wouldn't have whacked you in the first place.
I'd guess that most people who are truly penalized deserve to be penalized, and that a high percentage of those site owners or SEOs know the reasons why they might be penalized. Would the quality of search results improve if "black hat" or "grey hat" Webmasters had access to real-time reports of the specific reasons why they've received penalties? It's more likely that the SERPs would deteriorate (at least in my opinion). And let's be realistic: Providing the best possible search results is a higher priority for Google than keeping Webmasters happy or fed, no matter how important the latter might be to us.
| 5:18 pm on Aug 12, 2006 (gmt 0)|
I think a lot of people assume they are being penalized when actually they are not. I have seen a lot of webmasters over the years assuming they are getting penalized, but when you look at their sites (most of which are very good and content rich), they are missing one or two simple things.
I wish google would expand a bit on their "webmaster guidelines" without hurting their algo secrets. Google can simply explain meta tags, title tags, descriptions, etc...
The real good webmasters can query google and easily figure out how many characters long a title tag and description will display on results. Google should publish that on their guidelines.
Simple things like that would help webmasters and not really tip off spammers.
| 9:11 pm on Aug 12, 2006 (gmt 0)|
trinorthlighting, Google has been adding more documentation at [google.com...] and [google.com...] . Things like [google.com...] can be pretty helpful. I'd expect us to keep adding more information over time.
| 9:39 pm on Aug 12, 2006 (gmt 0)|
Good evening GoogleGuy
It would also help us if you posted a short weather report about when you expect the next PR/BL update to happen.
Long time no PR/BL weather reports, you know ;-)
| 9:43 pm on Aug 12, 2006 (gmt 0)|
Wonderful links Googlyguy.... I'll read it all right now.
| 12:12 am on Aug 13, 2006 (gmt 0)|
reseller, getting a new set of PR/BL out hasn't been a high priority, and I wouldn't expect it to be for a while.
When referring to the end of summer when many data centers would have some more changes, that's the true end o' summer, around the autumnal equinox, i.e. roughly the end of this quarter. I haven't heard much new about the 72.#*$! data center with its infrastructure, but I'll ask.
| 2:25 am on Aug 13, 2006 (gmt 0)|
"What if there's simply a problem with one of Google's algorithms or filters, and you're a victim of collateral damage?"
That's the reason Google's policy is horrible. If you know you aren't penalized for cause (mistakenly or genuinely) then you know your issues are either problems on Google's end or problems on your end. Telling some people is helping only some spammers while being deliberately anti-helpful to everyone not being banned/penalized or whatever they call it.
| 3:03 am on Aug 13, 2006 (gmt 0)|
Ok, probably still me that is staggered by Googleguy (and Google's) recent "communication".
First, it takes a plea (started by reseller) and followed up by many for some communication - after months upon months of posts from worried webmasters. Then the communication is basically not actually what anyone was asking for.
What has been happening during this time is the development (software) of the sitemaps stuff. In particular the ability to choose your own canonical root page! Is it just me or does that option not seem really scary in that they can't sort it without you having to tell them - in a program that millions of webmasters will never be aware of or know about. Inktomi had a similar program, it was paid - but you could submit urls and all that stuff and get feedback on it. Everyone said that Google wouldn't do paid inclusion, and they haven't - but they have done inclusion, and that is just as bad.
I am a programmer - and a really crazy way to work out the "canonical root" of a website could start with:
If "mydomain.com" content = "www.mydomain.com" then site = same.
You could even extend that to looking for "default.htm,html.asp,cfm" etc,etc.
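To make the idea in the last few lines concrete, here is a minimal Python sketch of that heuristic. It is only an illustration of the poster's pseudo-code, not how Google works it out. The names `same_site`, `strip_default_document`, `DEFAULT_DOCUMENTS` and the `fetch` callable are all hypothetical, and the list of default file names is an assumption. An exact byte comparison like this will also miss pages whose content changes between the two fetches.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of default document names; real servers vary.
DEFAULT_DOCUMENTS = {
    "index.htm", "index.html", "default.htm",
    "default.html", "default.asp", "index.cfm",
}


def strip_default_document(url):
    """Drop a trailing default document so /index.htm and / compare equal."""
    parts = urlsplit(url)
    head, _, tail = parts.path.rpartition("/")
    path = parts.path
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"
    if not path:
        path = "/"
    # The fragment is dropped; the query string is kept as-is.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def same_site(domain, fetch):
    """Return True if domain and www.domain serve byte-identical home pages.

    `fetch` is any callable that takes a URL and returns the body as bytes
    (an assumption made so the sketch needs no network access).
    """
    bare = fetch("http://" + domain + "/")
    www = fetch("http://www." + domain + "/")
    return hashlib.sha256(bare).digest() == hashlib.sha256(www).digest()
```

With a dictionary standing in for the web, `same_site("example.com", pages.__getitem__)` compares the two home pages, and `strip_default_document("http://www.example.com/index.htm")` reduces the URL to the bare root.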
Even better, provide feedback on your urls in the sitemaps program. But of course not all of them or all the time, because "spammers would know they have been caught".
Hello! So you prevent the feedback and help to millions upon millions of people so that a few spammers don't know they have been caught! Excellent customer service angle - I am sure those spammers didn't realise they were spammers and that you telling them would open the doors to a potential 5 billion page problem..... Well they did that without your help...
Are you saying you can't spot spammers - and if not that you might be giving them help!
Because, if you can spot it - why not give a generic message.
And how many serial spammers are using sitemaps - they would only use it to see if you can spot them (I am quite sure they ain't submitting the 5 billion link farmed stuff to sitemaps), so what you are saying is you are willing to compromise your whole service because you are scared of giving feedback to spammers.
Because you can't spot them.
And that raises the issue of what the hell is happening relating to the indexing and crawling change whereby you don't crawl or index a site as deep anymore - you would only do this if you can't spot spammers.
So your whole index quality, webmaster products become a feeble crappy effort. Only a company that is so much of a market leader could do this.
If you are that worried, why not ask for a bit of cash for guaranteed indexing and full feedback review - at least you could please one part of the puzzle.
New Google Corporation = pants. Why do I use it? My mum says she uses it, everyone does cos they always use Google. And always will.
| 4:18 pm on Aug 13, 2006 (gmt 0)|
Since Vanessa requested we post suggestions of what we'd like to see, here's my list... But just to be clear - I like Webmaster Central / Sitemaps - I like it so much I want to see more of it... One or two of these I've mentioned before, and some have been mentioned by others...
1) It would be nice to have the ability to group domains on the "My Sites" page by user defined categories. (Personally, I'd like to group them by client or client type). Sure it gives no value to Google, but it will help webmasters use Webmaster Central.
At a minimum sort first by domain, and then by host, so secure.domain.com will appear next to www.domain.com, not next to someotherdomain.com.
2) Change "Preferred Domain" so it can take a list of hosts that all have duplicate content (with which one should be primary). For example, all of the following might be dupes...
3) Implement the ability to specify default file names so Google will know that http://www.example.com is the same as http://www.example.com/index.htm
4) If you get a 301 on a robots.txt file, do not follow the 301 if it is on a different host. Instead assume that means all documents are disallowed or all URLs for the domain should be redirected to the other domain. The problem is that if people put in a universal 301 for a site, according to the robots.txt tool on Webmaster Central, you're following the 301s - even for the robots.txt file - and treating the robots.txt file for the new site as if it were the robots.txt for the old site (at least that's how it appears).
5) I'd like to see an option to exclude all URLs for the domain that are not in at least one sitemap, provided there is at least one sitemap and there are no sitemap errors.
6) Or alternatively, a setting that says that URLs not in sitemaps should be treated as if they were submitted with a priority of zero.
7) Related to the last two points - Allow users to remove specific URLs by submitting them via a sitemap and setting their priority to zero (or -1?).
8) Add a parameter to the sitemaps spec so webmasters can specify the old URL for a given new URL if URL mappings change. For example, if all the URLs are changing from .htm to .php, you could specify the old .htm URL for each new .php URL. (Would save a lot of traffic and 301s).
9) A section that shows inbound and outbound links by domain. In other words, what domains does the current domain link to, and which domains link to the current domain? You could think of it as an executive summary of Yahoo!'s Site Explorer.
10) I'd like to see googlebot observe "crawl-delay" settings in robots.txt like the other three major search engines (not exactly a webmaster central feature, but sorta related)
11) Promote Webmaster Central when someone types in site:domain.com to get a broader awareness by people who might be interested in it.
12) Get Blogger to support true sitemaps. After all, they are a division of Google. [While you're at it, get them to allow users to enter a meta description for each page.]
13) Some indication of general characteristics of how you view the site. How easily can you determine the theme/topic of each page? Things like that... Basically, how would you score the site from a technical perspective? The feature that shows common words on the site is a good example of what I'm talking about, though I've noticed it's not there for every site. Knowing why it's not there would be helpful.
14) The ability to define our role with the domain 1) owner, 2) webmaster, 3) consultant, etc. and then (behind the scenes) use that information to determine how trustworthy sites are based on trustworthiness of other sites the person is involved in and their relationship to those other sites. In a way this is more reliable than a link between sites since you've validated that we have a significant role in each of the sites.
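For readers who want to see what items 4 and 10 above would mean in practice, here is a rough Python sketch. It is not a description of Googlebot's behavior. `follow_robots_redirect` is a hypothetical helper that encodes the rule proposed in item 4, and the robots.txt lines are made up. `urllib.robotparser` is Python's standard-library parser, which does read "Crawl-delay".

```python
from urllib.parse import urljoin, urlsplit
from urllib.robotparser import RobotFileParser


def follow_robots_redirect(requested_url, redirect_location):
    """Item 4: only follow a robots.txt redirect that stays on the same host."""
    requested_host = urlsplit(requested_url).hostname
    # Resolve relative Location headers against the requested URL first.
    target = urlsplit(urljoin(requested_url, redirect_location))
    return target.hostname == requested_host


# Item 10: a crawler that honors crawl-delay can read it like this.
EXAMPLE_ROBOTS_TXT = [
    "User-agent: *",
    "Crawl-delay: 10",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT)
delay_seconds = parser.crawl_delay("examplebot")  # None if no delay is set
```

A crawler built along these lines would treat a cross-host 301 on robots.txt as a signal about the whole site rather than silently applying the new host's rules to the old one, and would wait `delay_seconds` between fetches when the value is not `None`.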
| 6:20 pm on Aug 13, 2006 (gmt 0)|
Hi Swanson! Google will try to interpret domain.com and www.domain.com correctly, and in my experience it normally gets it right. If you've got a fast-changing page, any search engine would be less likely to see those pages as dupes, because the page is more likely to have changed between fetches to domain.com and www.domain.com (remember that domain.com and www.domain.com can be completely different documents because they're different urls). So the new feature in Google Webmaster Tools is intended to help the smaller number of people who still have an issue with www vs. non-www.
I think reseller's post was a good reminder that a lot of discussion happens in forums and that we should stop by whenever we can to answer questions. But I also think that Google has been working hard on features/communication for webmasters in the last few months. Matt's site answers questions by web and sometimes video. Google has introduced features like the NOODP tag to give webmasters more control, and we've been working hard to refresh supplemental results as well. Plus the webmaster console has been steadily adding features for months.
If you look at the amount of info that the webmaster console has begun to provide, it's a pretty long list, and the team listens to feedback about what to implement next. The team has added crawl errors (November 2005), a robots.txt checker (February 2006), downloading info in CSV format (March 2006), showing some penalties plus offering a reinclusion request form and allowing site verification via meta tags (April 2006), more crawl errors and query stats (June 2006), and www vs. non-www and even more crawl errors (August 2006). So I'd disagree that we're providing less info/communication than before--I'd say we're doing more. But I would agree that even in a bloggy world, it's important to remember that forums such as WebmasterWorld are very popular, and we should try to answer questions on forums as much as we can.
steveb, I'm really glad that we started alerting some site owners to penalties in April, which augmented our original test of emailing webmasters (those emails still continue too, by the way). I think the situation now in which some legit sites can find out if they have penalties is much better than several months ago when we didn't have the opportunity to offer that. I expect Google will keep looking for ways to alert legit sites to penalties without tipping off spam sites.
P.S. Thanks for the suggestions, jay5r. I liked them a lot. :)
[edited by: GoogleGuy at 6:24 pm (utc) on Aug. 13, 2006]
| 7:44 pm on Aug 13, 2006 (gmt 0)|
> www vs. non-www and even more crawl errors (August 2006).
Yes indeed, one of the 'even more crawl errors' was the 403 forbidden error on directories, which I mentioned before. As of today's brief look at my sitemap-account, these http-errors are not listed any more. Thanks to the crawl team for that, my account looks a bit tidier now, and I again get the impression that our hints here are carefully read.
jay5r's list is quite exhaustive. Additionally, as I mentioned before, the source producing 404-errors would be interesting. I suspect this is currently investigated carefully by the crawl-team.
Instead of flowers I might send pizza-widgets, but I'm afraid the ministry of love won't let them thru by air freight. Reseller, do you have an inner-European address at hand (or would that be viewed as a bribe)?
| 12:39 am on Aug 14, 2006 (gmt 0)|
"I expect Google will keep looking for ways to alert legit sites to penalties without tipping off spam sites."
I hope you stop. There should only be a one way communication, that is, a webmaster can ask if there is a penalty. And most importantly, ANY webmaster should be allowed to ask, not just those who have and admit to violating guidelines and spamming. It is crazy that you allow cheaters privileges that honest webmasters are not allowed access to. The reinclusion request is the sickest thing Google does, but taking that a step further by saying "take off" to honest webmasters while helping cheats in yet ANOTHER way is just rude policy.
Why does Google treat spammers better than non-spammers? Why does Google give spammers tools it deliberately refuses to non-spammers?
| 1:06 am on Aug 14, 2006 (gmt 0)|
steveb, let me talk through some typical cases and see if we still disagree. Suppose there's a legit mom/pop bed and breakfast website that has hidden text, and they're about to leave Google's index. Wouldn't it be nice to drop them a note that says "Here's why your site is gone, and here's how to do a reinclusion request if you decide to change your site"? On the other hand, if there's someone spamming highly coveted terms and they are deliberately doing something pretty bad like cloaking or sneaky redirects, why would you want to alert the spammer that they've been detected?
Just to be clear, anyone can verify that they control a site, and then request to see if they have spam penalties. The first case (the real bed and breakfast site run by a mom/pop) would be more likely to get an affirmative answer than the second case (the deliberate spammer). So it is the case that anyone can ask if they have a penalty, and the more reason we have to believe that a site is legit, the more likely it is that we can give information to the owner of that site.
Let me try to anticipate what else you might be objecting to. If you object to the fact that we provide a reinclusion request form, I think it's important to provide a way for people to ask that they be reincluded in Google. If your objection is that the reinclusion form implies that the webmaster is guilty of spamming, we recently updated that form to include a common case (the domain was acquired by you, and you had nothing to do with the previous behavior of the site)--so that form doesn't automatically imply that the site owner is requesting reinclusion because they were spamming.
On a completely different note: Oliver, I know that the Sitemaps folks do read here and on blogs and groups and respond to suggestions. I'm glad that the 403 on directories isn't reported now. I know that I've personally requested to see the source page that had a broken link causing a 404, so it's on the Sitemaps team's list. I know they've got several things that they want to do though, so I don't know the priority of that particular request.
| 1:32 am on Aug 14, 2006 (gmt 0)|
What do you think about putting a message on sitemaps if google finds no penalties at all on a site?
If webmasters saw that type of message then at that point they would start building content to raise rankings.
| 2:20 am on Aug 14, 2006 (gmt 0)|
"Wouldn't it be nice to drop them a note"
Absolutely not. Why on Earth would you tell a spammer they are about to be punished? Why on Earth do you give spammers a "free shot" to get away with spamming, knowing THEY can get a second shot via a reinclusion request, whereas non-spammers can NOT have that option?
Why does Google help spammers but not help non-spammers? It's a dumb, rude, hugely counterproductive policy.
It's one thing for you to not penalize spammers that you should. It's entirely another, much worse policy, to offer spammers help you refuse to honest webmasters.
"Just to be clear, anyone can verify that they control a site, and then request to see if they have spam penalties."
"If you object to the fact that we provide a reinclusion request form, I think it's important to provide a way for people to ask that they be reincluded in Google."
Not people. Spammers. ONLY spammers. You have no way at all for non-spamming webmasters to ask that they be reincluded in Google. That stinks.
"--so that form doesn't automatically imply that the site owner is requesting reinclusion because they were spamming."
The form only allows a person to use it for sites that have violated your guidelines. That is just sick.
And this isn't news either. People have posted for years on webmasterworld, literally almost crying for a way to get a site problem or penalty reviewed. Honest webmasters are willing to pay for this. But Google refuses to offer this. Instead Google bends over backwards for spammers, offering them a free shot to spam, and offering them multiple tools that Google refuses to honest webmasters.
| 2:41 am on Aug 14, 2006 (gmt 0)|
Cute answer Googleguy.. prove what you have to say.
| 2:53 am on Aug 14, 2006 (gmt 0)|
|The form only allows a person to use it for sites that have violated your guidelines. That is just sick. |
Okay, I'm just a dumb layman, but try this on for size:
As I mentioned earlier, it's easy for Google to know if a manual penalty has been applied. But what if the site has been hurt by a glitch in an algorithm or filter that's been created by a "black box" (as described by Ronburk in his many informative posts about data mining)? In such a case, there probably isn't a simple explanation such as "Oh, yeah, you had hidden text on your home page, so we hit the 'Apply Penalty' button." It's likely that no human even knows why your-squeaky-clean-site.com was affected.
| 5:03 am on Aug 14, 2006 (gmt 0)|
Good morning All
>>Suppose there's a legit mom/pop bed and breakfast website that has hidden text, and they're about to leave Google's index. Wouldn't it be nice to drop them a note that says "Here's why your site is gone, and here's how to do a reinclusion request if you decide to change your site"?<<
Sure it's nice and generous to do so. Because, for example, it could be that the webmaster who takes care of the mom/pop site is spamming without the knowledge of those innocent mom/pop owners.
And sometimes it's the "SEO" firm, which mom/pop hire in good faith, that does the spamming without the knowledge of the site owner. Allow me to recall an example which Matt published on his blog.
SEO Mistakes: crappy doorway pages [mattcutts.com]
[edited by: reseller at 5:33 am (utc) on Aug. 14, 2006]
| 5:13 am on Aug 14, 2006 (gmt 0)|
"It's likely that no human even knows why your-squeaky-clean-site.com was affected."
Fine, then that is the answer sometimes. Just because this is possible is certainly no reason that a question is not allowed to be asked!
| 5:50 am on Aug 14, 2006 (gmt 0)|
Ok I see that the link of the example I posted in my previous message (#:3045247) isn't working properly. Here it is again together with another relevant example:
SEO Mistakes: crappy doorway pages [mattcutts.com]
Notifying webmasters of penalties [mattcutts.com]
| 8:01 am on Aug 14, 2006 (gmt 0)|
IMHO on the topic of spammers being warned that their infringement has been spotted, I feel I fall on the side of Steveb. I think an internet policed by a bunch of do-gooders would result in more spam and not less. You need the threat of total dismissal and the resulting loss of earnings to keep these spammers in line. If a spammer knows he's going to get a second chance, then I'm afraid it's only human nature to run the gauntlet until he is caught.
In the UK today, our crime is out of control, IMHO due to the same liberal policies being utilised ... too many warnings and not enough hard time.
In the bad old days, I used to deliberately pack pages with fully sentenced keyphrases. After 2 years of getting away with it, I would remove the keywords and the pages would sustain their rankings. I'm ashamed to say that some of my past clients are still number one for huge key phrases and they still use my particular technique.
So, for this Google, I'm sorry ... but with your resources you should be better at spotting such obvious spam.
All the Best
| 8:41 am on Aug 14, 2006 (gmt 0)|
I have the impression that the purposes of "Filing a Reinclusion Request" aren't covered as they should be. Maybe because they aren't so clear.
However, webmasters like "Trisha", who don't know what they have done wrong, can also file a reinclusion request.
From Matt's blog:
Filing a reinclusion request [mattcutts.com]
Here is what Trisha asked, together with Matt's reply.
September 18, 2005 @ 12:16 pm
What if you really don't know what it is that you have done wrong? In my case I've got two sites which lost almost all their Google referrals since Bourbon. I've taken down anything I can imagine that might have been a problem (none of it done to spam Google). What will happen if I fill out a reinclusion request but there is still something on the site that Google doesn't like? And will the penalty ever go away without filling out a reinclusion request?
And here is Matt's reply:
September 18, 2005 @ 11:11 pm
Trisha, good question. If there's an algorithmic reason why your site isn't doing well, you can definitely still come back if you change the underlying cause. If a site has been manually reviewed and has been penalized, those penalties do time out eventually, but the time-out period can be very long. It doesn't hurt your site to do a reinclusion request if you're not sure what's wrong or if you've checked carefully and can't find anything wrong.
| 9:19 am on Aug 14, 2006 (gmt 0)|
I've got to agree with the consensus of feeling here about being able to ask a question and get a straight answer in GWC.
trinorthlighting's suggestion would be a good start, and then Google gets webmasters batting for all the right reasons and focus.
I understand it's a fine line between Google's secret algo and full disclosure, but there is so much ambiguity in the interpretation of guidelines that the average Mom & Pop webmaster would likely need some clarification.
If it's a matter of money, people will cover the cost - I'm sure.
If it's a matter of validation to avoid spammers taking advantage of the situation, then I would have thought there would be some steps which could be considered over time to better handle that process.
Again - I have to congratulate you on these steps that you're taking and the things that you have done, but we need a better process to unlock the quality of Q & A communication on penalties through GMC - but it looks like you're onto it - I guess we just need to hear that you're working on it :)
| 11:19 am on Aug 14, 2006 (gmt 0)|
I say just stay quiet about your webmaster requirements, but just make sure you reward the cleanest sites with the top positions. Surely not too hard to program into your next data push?
All the Best
| 1:01 pm on Aug 14, 2006 (gmt 0)|
I think google should show the following for pages when they get crawled:
1. Google has found no penalties.
2. Google found missing or too long title tag
3. Google has found no or too long meta tag
4. Google has found no or too long description tag
5. Google has found duplicate content
6. Google has found broken link
7. Google has found errors on html
8. Google has found hidden text
Those 8 right there would not give any of the secret sauce away and would keep the mom and pop's busy. If google were to specify that per page, the honest people could troubleshoot the page and fix it. Honestly, how many spammers use sitemaps?
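As a rough idea of how a few of those per-page messages could be produced, here is a small Python sketch that checks only the title and description items (2 and 4). It is an illustration under stated assumptions, not anything Google has published: `PageAudit` and `audit` are hypothetical names, and the two length limits are placeholder numbers, since nothing in this thread gives official figures.

```python
from html.parser import HTMLParser

# Placeholder limits; these are assumptions, not documented values.
MAX_TITLE_CHARS = 65
MAX_DESCRIPTION_CHARS = 155


class PageAudit(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(html):
    """Return a list of findings for one page, in the spirit of the list above."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    findings = []
    title = parser.title.strip()
    if not title:
        findings.append("missing title tag")
    elif len(title) > MAX_TITLE_CHARS:
        findings.append("title tag too long")
    description = (parser.description or "").strip()
    if not description:
        findings.append("missing description tag")
    elif len(description) > MAX_DESCRIPTION_CHARS:
        findings.append("description tag too long")
    return findings or ["no problems found"]
```

Checks like duplicate content or hidden text need far more than a single-page parse, so they are left out of this sketch on purpose.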
| 2:20 pm on Aug 14, 2006 (gmt 0)|
|5. Google has found duplicate content |
Excellent idea Tri. I especially like this one as it seems to be an issue nowadays.
|Honestly, how many spammers use sitemaps? |
Ironic that spammers use the same philosophy as G.- "Automate Everything" and therein lies the problem and solutions.
The previous posters are correct, GG. It's silly to deny the cream rising to the top just because a very few people will take advantage.
There's a huge gap between helping quality, well-intentioned sites hold appropriate and natural rankings and spilling your entire algo recipe.
If your algo is as complex as you claim, I'm not exactly sure what you all are worried about at the plex anyways.
The spammers who are succeeding right now certainly aren't ex-G employees with trade secrets.
One would assume that giving white hat owners the appropriate information to rank properly/fix penalties would also lessen the ability for spammers to rank in the first place, yes?
| 2:28 pm on Aug 14, 2006 (gmt 0)|
So it seems that we need to do some more "development work" on the present reinclusion request page (can be accessed through Sitemaps).
At present we have:
Request reinclusion of a site that has violated the webmaster guidelines [?]
IMPORTANT: Please complete this form ONLY if one of the following is true:
* Upon reviewing your site, you found that it violated our webmaster guidelines and you've made changes to your site so that it adheres to the guidelines. [?]
* OR You recently acquired a domain which you suspect may have previously violated our webmaster guidelines.
Maybe we can just delete or modify the line:
"Request reinclusion of a site that has violated the webmaster guidelines [?]"
Then add a third point allowing those "non-spamming webmasters" to file a reinclusion request too.
I.e., we need a text for a third point. Any suggestions?
| 2:53 pm on Aug 14, 2006 (gmt 0)|
I like the duplicate content as well. If that message was flagged to us we would look internally first and clean things up.
Then, we would look externally. We all know spammers scrape content.... So if we all of a sudden were flagged and found the content scraped, and it was one of our sites' unique content protected under the DMCA, we could then let google know that and they could take the appropriate action.
Google could take that information and use it in the "spam wars" as I call it and ban the spam sites. It would be a great tool in the "spam wars" and since google would take action to ban sites, a lot of webmasters would be very happy.
| 3:01 pm on Aug 14, 2006 (gmt 0)|
We have tried everything, including reinclusion requests, url removals, robots.txt entries, 301s, you name it, we have tried it, and yet our site still remains penalized.
This is what really frustrates webmasters.
| This 167 message thread spans 6 pages. |