There are a few options for your situation:
Option #1 - Move that content into images. Google cannot read images and thus will not see the misspellings. A downside is that visually impaired people using your site will likely be affected as well.
Option #2 - Include this information in an iframe that is within a directory blocked from Google. This technique is used by spammers and you might accidentally trigger a Google spam filter.
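For Option #2, it may be worth confirming that the directory really is blocked before relying on it. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `/pronunciations/` directory name are made up for illustration, and the helper `is_blocked` is hypothetical, not something from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /pronunciations/ directory is an assumed name.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /pronunciations/
"""

def is_blocked(robots_txt, user_agent, path):
    """Return True if robots_txt disallows `path` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)

if __name__ == "__main__":
    # The iframe source should be blocked; ordinary pages should not be.
    print(is_blocked(ROBOTS_TXT, "Googlebot", "/pronunciations/word.html"))
    print(is_blocked(ROBOTS_TXT, "Googlebot", "/articles/page.html"))
```

This only checks what the robots.txt rules say. It tells you nothing about how Google treats the iframe technique itself.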
Option #3 - IP delivery, aka cloaking. When Googlebot comes crawling, show it a different version. The downside is that cloaking is not easy to do properly, and Google is very sensitive to it because cloaking has been used for spam.
> I wonder if I am in fact being penalized by taking on these challenges - but having a large amount of misspellings.
Panda seems to dislike them, according to the Google Webmaster Central Blog:
More guidance on building high-quality sites
Friday, May 06, 2011
. . .
"Does this article have spelling, stylistic, or factual errors?"
Wow... come on, Google. There are sites that even teach English grammar and spelling with examples. Don't pandalize them. Don't impose your grammar requirements on the web for everyone.
I would not jump to the conclusion that these types of misspellings will result in Panda penalties.
I am assuming you have significant content on each page and that you only have a handful of intentional misspellings on the page for the user's benefit. If a user came to the page and saw the misspellings, would they spend more time on it and generate a lower bounce rate because the intentional misspellings were helpful (like learning how to pronounce a new word)? If so, I would not worry about it as long as the other paragraphs on the page were all perfect.
People's names are not in a dictionary and will get caught in a misspelling search. I don't see Google penalizing a website simply for having words that are not in a dictionary.
I could see a website being penalized for having many accidental misspellings, since that makes other websites less likely to link to you and visitors less likely to refer people to your site. In general, people won't view you as an authority or spend time on your site.
@goodroi - I like your third option. It seems most viable to block the content from Googlebot. I guess things can't get worse than being 'pandalized', so it might be worth a shot, risking the possible cloaking penalty.
|I would not jump to the conclusion that these types of misspellings will result in Panda penalties. |
|I am assuming you have significant content on each page and that you only have a handful of intentional misspellings on the page for the user benefit. |
Actually, the amount of content that is not misspelled comes close to the amount that is misspelled in many instances. It probably averages 35-45% misspelled (a few pages are upwards of 80%). I think my bounce rate is likely lower than my competitors' on these pages despite the misspellings. BUT mind you, I am no longer ranking up with my competitors. I used to be number 1-2. Now I'm on the 2nd-3rd page.
|People's names are not in a dictionary and will get caught in a misspelling search. I don't see Google penalizing a website simply for having words that are not in a dictionary. |
I have figured that google can't know whether something is a proper name/noun. The thing that delineates proper names/nouns is a capital first letter. In some instances (where feasible) I have capitalized the misspelled words in hopes that google will see it as proper, and therefore ignore the misspelling. Obviously this only 'reduces' the amount of misspellings in google's eyes.
We are working on getting more natural links to us (all are natural, we are just passively requesting more).
@potentialgeek - I did read that (over and over and over again) when it came out, but didn't realize how badly I could be affected by it until I ran a spell check on the site.
@indyank - oh, believe me - I agree. When you penalize a WHOLE site for errors on a couple of pages, some sites might get more of a penalty than they are due.
I've been looking for some kind of mark-up that might work in this situation - making it semantically clear that the content of the element is exceptional. Something like the <ruby> element [w3schools.com] which is used for East Asian pronunciation mark-up would do nicely, but I haven't stumbled on anything yet, not even in the new schema.org vocabulary.
There's a chance that the <dfn> element [w3.org] might serve your needs. At any rate, what I would do is mark-up each of those instances on your pages in some consistent and identifying way - at least with a unique span.class that sets them apart.
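One way to apply a consistent, identifying class to each instance is a small script run over the page text. This is only a sketch: the function name `mark_intentional`, the class name `intentional-misspelling`, and the sample word are assumptions for illustration. It also treats its input as plain text, so running it over existing HTML could touch attribute values.

```python
import re

def mark_intentional(text, misspellings, css_class="intentional-misspelling"):
    """Wrap each listed misspelling in a span carrying a consistent class."""
    if not misspellings:
        return text
    # Longest entries first so a longer entry wins over a shorter prefix.
    alternatives = sorted((re.escape(w) for w in misspellings), key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(alternatives) + r")\b", re.IGNORECASE)

    def wrap(match):
        return '<span class="{}">{}</span>'.format(css_class, match.group(1))

    return pattern.sub(wrap, text)

if __name__ == "__main__":
    print(mark_intentional("Say nuk-lee-er slowly.", ["nuk-lee-er"]))
```

Keeping the word list in one place means the same script could later switch to a different element if a better-suited tag turns up.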
Then a reconsideration request explaining the situation could possibly get some human intervention for the situation.
|The thing that delineates proper names/nouns is a capital first letter. |
No, it's a capital first letter not preceded by a sentence-final period (as distinct from a period after a title such as Dr., Ms. or Lt.) or a paragraph break. In English, that is. I don't know how careful g### is, but if you can do it yourself with a simple RegEx, you'd think they could do the same.
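The capitalization rule discussed in this exchange can be roughed out in a few lines. This sketch uses plain string operations rather than a single RegEx, and the `TITLES` set and the function name `likely_proper_nouns` are assumptions for illustration. It says nothing about what Google actually does, and it will misfire on words like "I" and on abbreviations not in the set.

```python
from typing import List

# Assumed list of title abbreviations whose period does not end a sentence.
TITLES = {"Dr.", "Ms.", "Mr.", "Mrs.", "Lt."}

def likely_proper_nouns(paragraph: str) -> List[str]:
    """Return capitalized words that do not start a sentence or the paragraph."""
    words = paragraph.split()
    found = []
    # Pairing each word with its predecessor skips the paragraph's first word.
    for prev, word in zip(words, words[1:]):
        if word in TITLES:
            continue
        starts_sentence = prev.endswith((".", "!", "?")) and prev not in TITLES
        if word[0].isupper() and not starts_sentence:
            found.append(word.strip(".,;:!?\"'()"))
    return found

if __name__ == "__main__":
    print(likely_proper_nouns("Yesterday Dr. Smith met Alice. Then they left."))
```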
Is it cheating to wrap the whole thing in a div that says "lang=" ... something other than "en"?
I think tedster has the best idea. Cloaking could just get you penalised even more.
I cannot see that any tag is designed for this. <q> or <var> seems nearest.
I know it's not strictly what it's for, but could you wrap the words and phrases in code tags? If g IS spell checking, it must know not to penalise sites for things like variable names in code samples. Although I suppose there is a hypothetical risk that it may end up misclassifying your site as being programming related.
|I've been looking for some kind of mark-up that might work in this situation - making it semantically clear that the content of the element is exceptional. |
In HTML5, the <u> element may be utilized in this scenario.
|The u element represents a span of text offset from its surrounding content without conveying any extra emphasis or importance, and for which the conventional typographic presentation is underlining; for example, a span of text in Chinese that is a proper name (a Chinese proper name mark), or span of text that is known to be misspelled. |
u – offset text conventionally styled with an underline
|Changes in HTML5 - Although previous versions of HTML defined the u element only in presentational terms, the element has now been given the specific semantic purpose of representing text “offset from its surrounding content without conveying any extra emphasis or importance, and for which the conventional typographic presentation is underlining”. |
pageoneresults - that sounds great... and exactly what I need (I think) so long as I switch over to HTML5.
Anyone know if this interpretation is followed by G, and anyone else want to affirm if it is indeed what I should do to get G to ignore spelling errors? I have about 10,000+ static pages that would need updating to HTML5 (I'm XHTML trans right now).
As long as your pages are not rendering in quirks mode, all you need to do to have HTML5 is change the doctype declaration to <!DOCTYPE html>. It automatically triggers Standards Mode in browsers, and it is backwards compatible with previous versions of HTML and XHTML mark-up.
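For a large batch of static pages, the doctype swap could be scripted rather than done by hand. Here is a minimal sketch, where the function name `to_html5_doctype` and the sample page are assumptions. It only replaces the first doctype declaration and leaves everything else, including any XML declaration or `xmlns` attribute, untouched.

```python
import re

# Matches any doctype declaration, e.g. the XHTML 1.0 Transitional one.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s[^>]*>", re.IGNORECASE)

def to_html5_doctype(html: str) -> str:
    """Replace the first doctype declaration with the HTML5 doctype."""
    return DOCTYPE_RE.sub("<!DOCTYPE html>", html, count=1)

if __name__ == "__main__":
    page = (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
        "<html><head><title>Test</title></head><body></body></html>"
    )
    print(to_html5_doctype(page))
```

To cover a whole site, the same function could be applied to each file found with `pathlib.Path.rglob("*.html")`, ideally on a backup copy first.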
Nice find, pageoneresults!
|Anyone know if this interpretation is followed by G |
I certainly don't, but it's quite worth the adventure to find out, IMO.