europeforvisitors - 4:25 pm on Oct 3, 2005 (gmt 0)
That might be true if the pages consisted mostly of duplicate content, but it certainly isn't a true statement in general. In the "bee" example, Google might think (not without reason) that pages were duplicate content, and that the intent was to spam the index, if all five pages consisted of virtually the same text with just a few words changed here and there. E.g.:

Peruvian honeybee: A black-and-yellow bee with a pointy stinger and hairy feet that buzzes in Spanish, generic text generic text generic text...

Brazilian honeybee: A black-and-yellow bee with a pointy stinger and hairy feet that buzzes in Portuguese, generic text generic text generic text...

Irish honeybee: A black-and-yellow bee with a pointy stinger and hairy feet that buzzes in Gaelic, generic text generic text generic text...

Now, there might be legitimate reasons for using this kind of duplicate content, but Google can hardly be faulted for assuming that such patterns are artificial. And if there's a 97% statistical likelihood that such blatant duplication is the result of aggressive SEO and merely clutters the index with boilerplate content, then is it so unreasonable for Google to simply filter all the pages and rely on reinclusion requests to correct the few instances where the duplicate content might be legitimate and in keeping with Google's stated mission?

In situations like the bee examples above, why shouldn't the burden be on the publisher to demonstrate that the duplicate content is legitimate and of value to users?
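To make "virtually the same text with just a few words changed" concrete, here is a rough sketch of one common way to measure near-duplication: compare overlapping three-word sequences (shingles) using Jaccard similarity. This is only an illustration. Nothing in this thread says how Google actually detects duplicate content, and the function names, the shingle size, and the "unrelated" sample sentence are all assumptions made up for the example.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two texts too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


# Two of the bee descriptions from the post, without the trailing generic text.
peruvian = ("Peruvian honeybee: A black-and-yellow bee with a pointy stinger "
            "and hairy feet that buzzes in Spanish")
brazilian = ("Brazilian honeybee: A black-and-yellow bee with a pointy stinger "
             "and hairy feet that buzzes in Portuguese")
# Hypothetical sentence on the same topic but with original wording.
unrelated = ("Beekeepers in cold climates wrap their hives in winter "
             "to keep the colony warm")

print(jaccard(peruvian, brazilian))  # high: only the first and last words differ
print(jaccard(peruvian, unrelated))  # low: same topic, different text
```

The point of the sketch is that several pages on one topic are not a problem by themselves; the boilerplate pages score high against each other, while a page with its own wording on the same topic does not.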
So now if we webmasters post more than one page on a topic like 'bees', then Google thinks our web pages are spam.