Google penalty questions


soccer_star - 2:00 am on Sep 20, 2003 (gmt 0)


Going by my own experiences with Google, I would answer as follows:

1) If you have two completely different domains with identical index.html's that do not link to each other, and googlebot finds both of them through links from other domains, does Google have some way of detecting identical pages like this?


Yes. Don't do it: Googlebot will spot it eventually and penalise you. I have a similar page on each of my sites and disallow Googlebot from all except one of them to eliminate the risk of a duplicate penalty.
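For anyone wanting to try the "disallow Googlebot from all except one" approach, here is a minimal sketch of how such a rule might look and how to sanity-check it before uploading. The robots.txt contents, the page name shared-page.html and the domain www.example.com are all made up for illustration; the check uses Python's standard urllib.robotparser, which only approximates how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for one of the secondary sites: Googlebot is kept
# away from the shared page, every other crawler is left unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /shared-page.html

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed example URL; substitute the real address of the duplicated page.
SHARED_URL = "http://www.example.com/shared-page.html"

googlebot_ok = is_allowed(ROBOTS_TXT, "Googlebot", SHARED_URL)
other_bot_ok = is_allowed(ROBOTS_TXT, "SomeOtherBot", SHARED_URL)

print("Googlebot may fetch shared page:", googlebot_ok)
print("Other bots may fetch shared page:", other_bot_ok)
```

The site that is meant to keep the page in the index would simply omit the Googlebot rule.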

2) Does the .php extension give any penalty? I've heard it doesn't, but I changed my www.domain/index.html to index.php after it was already indexed and now my page seems to have vanished from Google (by vanished I mean that I can't find my site even when I search for a string unique to my site). However, besides changing the extension (and adding a simple counter and tracking system) I did change the site quite a bit - could this have caused that change in my Google listing (perhaps by being considered a dynamic web page)?

Although I'm no expert on PHP pages, I don't think they carry any penalty at all. From what I've read, it's sometimes harder for Googlebot to follow links on dynamic pages than on a static HTML page, but I doubt that's the reason your page has disappeared. If it's a fairly new site (less than 8 weeks old), it's quite common for it to keep disappearing and reappearing.

3) How different does a site have to be so that it's not considered duplicate?

Ahhh, the 64 million dollar question. :) Nobody really knows. The figures I've read suggest at least a 20% difference in content, plus a slightly different linking structure, etc. Basically, you are usually safe if there is a logical reason for your pages to be similar, as opposed to them being similar just to target similar keyphrases.

For example, if you are selling 'apples' on apple.htm you can say how good fruit is for you, how cheap an apple is, how tasty apples are etc. Then if you are selling 'pears' on pears.htm it is still valid to talk about how good fruit is for you, how cheap a pear is, how tasty pears are etc.

But if you create another page for 'juicy apples' on juicy-apples.htm just to target the keyphrase 'juicy apples' and end up with a very similar page to 'apples', you will be in trouble because there is no real need for the 'juicy apples' page as you've already covered the virtues of apples on your 'apples' page. That's when you start getting into dangerous territory because even though Googlebot may not pick up on it, a competitor may report you and a Google employee may decide to penalise you.
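To get a rough feel for how close two of your own pages are, you could compare their visible text yourself. The sketch below is only an assumption about how one might estimate a "percentage difference"; nobody outside Google knows how it actually measures duplication, and the function name percent_difference and the sample sentences are invented for the example. It uses difflib.SequenceMatcher from the Python standard library on word lists.

```python
from difflib import SequenceMatcher


def percent_difference(text_a: str, text_b: str) -> float:
    """Estimate how different two texts are, as a percentage from 0 to 100.

    This is a crude word-level comparison, not Google's method.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # ratio() is 1.0 for identical word sequences and 0.0 when nothing matches.
    similarity = SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()
    return (1.0 - similarity) * 100.0


# Invented sample text standing in for the 'apples' and 'juicy apples' pages.
apples_page = "apples are tasty and cheap"
juicy_apples_page = "juicy apples are tasty and cheap"

difference = percent_difference(apples_page, juicy_apples_page)
print(f"Estimated difference: {difference:.1f}%")
```

A low figure here would at least flag pages worth rewriting or merging, whatever threshold Google really uses.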

If another common name for widgets is thingamajigs, and you have one page optimized for "widgets" keywords, and then you create another page optimized for "thingamajigs" keywords by simply replacing "widgets" with "thingamajigs", will Google determine they are duplicate pages?

I did exactly that with two different sites. Even though I rearranged the pages so they were cosmetically different, with different colors and fonts, and altered the link structure slightly, it was essentially the same site with the same content, and I must confess I did it purely to harvest extra keyphrases.

Google kicked the duplicate site out of the index within two months.

Lesson learnt: I now realise it is more prudent to play by the rules and err on the side of caution if you're in this for the long haul.

4) Do sites get de-listed from Google often? If so, for what kinds of offenses?

All the time - this post is already too long to even scratch the surface of what types of penalties there are. There are loads of techniques people use to try and hoodwink Google (search this forum for cloaking, crosslinking, duplicate sites/pages, hidden text and redirects for starters).

The golden rule is to question whether you are doing something for the good of the surfer or for the good of your rankings. It's possible to achieve both (heck, that's what good SEO is!) but if you're ignoring the surfer's needs and doing something just to boost your PR or ranking, you run the risk of a penalty.


Thread source: http://www.webmasterworld.com/google_archive/17076.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com