I was banned by Google last month and have failed to get reincluded since. The first suspected reason for the ban was duplicate content, and as advised by several webmasters here, I modified robots.txt so Googlebot should no longer index that duplicate content.
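The change was along these lines - the folder name here is just a placeholder for wherever the duplicated pages actually live:
User-agent: Googlebot
# hypothetical path - substitute the folders that hold the duplicate content
Disallow: /printable/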
Now the $32,000 question is: should Google ban you for this even though you are telling their crawlers not to spider this content?
Cheers,
mOftary
>>Now the $32,000 question is: should Google ban you for this even though you are telling their crawlers not to spider this content?<<
Of course not. It's Google's fault, not yours, if they keep indexing pages you are asking not to be indexed.
However, I would have also added the following meta tag:
<META NAME="GOOGLEBOT" CONTENT="NOINDEX, FOLLOW">
Just to be sure ;-)
My suspicion is that Google looks for anomalies where a site has SEO features outside the norm: a higher than average level of duplicate content, a disproportionate amount of hidden text in alt tags, keywords in unnecessary meta tags, folders banned to search engines, overuse of Hx tags relative to the total amount of text on the page, and so on. One or two anomalies can be accidental or white hat, but with too many anomalies a site may be considered spammy or over-SEO'd.
Having duplicate content on your site and then taking measures to stop Google indexing it may in itself raise Google's suspicions.
Anyway, I have absolutely no SEO tricks, white hat or black. Nothing but some suspected duplicate content and some site-wide links. I took the site-wide links down, although I don't know why Google would ban a site for this when every major company uses site-wide links to market their websites (check devshed or internet dot com).
>>Having duplicate content on your site and then taking measures to stop Google indexing it may in itself raise Google's suspicions.<<
Isn't it enough to tell Googlebot not to spider the suspected duplicate content?
BTW, the robots.txt modifications were intended to get rid of the ban. They certainly didn't cause the ban, as I had an allow-all robots.txt before.
Please elaborate! Should I remove this content from Google's view altogether?
That decision is completely up to you. Nobody can tell you what Google likes or dislikes with any certainty. I just prefer to play safe. When I lost all my traffic (twice this year) I removed anything that might be considered black hat, including all my print pages. They might have been seen as duplicate content, and if I banned Google from them that might also have aroused suspicions. The traffic is now back, although there is no guarantee that what I did has anything to do with it.
But I prefer to stay squeaky clean and onside with Google - there is no point in building a site for your visitors if the SEs are not driving any traffic to it.
Your print pages could be kept out of the index by putting <meta name="robots" content="noindex"> on each one. It is very disconcerting to click on a search result and have your printer start up before the page has finished loading on screen.
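Something like this in the head of each print page (the title is just a placeholder, of course):
<head>
<title>Widget Article - Printer Friendly</title>
<meta name="robots" content="noindex">
</head>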
>>Your print pages could be kept out of the index by putting <meta name="robots" content="noindex"> on each one<<
Yes and no. Google knows the URL of these pages because it will have followed a link to them. It should follow the noindex rule, but the URL will still lurk somewhere in the depths of Googleland. And from time to time, especially when Google does a rollback, some of the print pages may appear in the SERPs - URL only, without a snippet. I have had print pages pop up in the SERPs after a year has elapsed.
Also just because Google follows the noindex rule, it doesn't necessarily follow that Google hasn't crawled the page.
The way to go is not to have separate print pages but to use a CSS solution.
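A minimal sketch of the CSS route, assuming the site can carry a print stylesheet (the file and id names here are made up):
<link rel="stylesheet" href="screen.css" type="text/css" media="screen">
<link rel="stylesheet" href="print.css" type="text/css" media="print">

/* print.css - hide the screen-only furniture when the page is printed */
#header, #nav, #sidebar, #footer { display: none; }

One page, one URL - the browser applies print.css when printing, so there is no separate print version for Google to find in the first place.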