Forum Moderators: Robert Charlton & goodroi


Duplicate content from "index.html"? - is it a Panda issue?

         

FlyOcean

6:56 am on May 12, 2012 (gmt 0)

10+ Year Member



My homepage is indexed by Google as both
mysite.com/ and mysite.com/index.html

Is this duplicate content? (Google Panda update)

tedster

1:55 pm on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello FlyOcean, and welcome to the forums.

Yes, those two URLs are a kind of duplicate content, something Google calls a "canonical URL problem". The best practice for the canonical issue you mentioned is to 301 redirect all URLs that end in index.html (or default.aspx or whatever your server uses) to the directory-level or root-level URL without the "index.html".
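To make the redirect rule concrete, here is a minimal Python sketch of the decision logic only, not a server configuration. The function name and the list of index file names are assumptions for illustration; adjust the list to whatever your server actually serves as a directory index.

```python
from typing import Optional

# Index file names to strip. This list is an assumption, not a standard.
INDEX_NAMES = ("index.html", "index.htm", "index.php", "default.aspx")


def redirect_target(path: str) -> Optional[str]:
    """Return the path to 301 redirect to, or None if no redirect is needed.

    `path` is the URL path only (no scheme, host, or query string).
    """
    # Split off the last path segment: "/blog/index.html" -> "/blog", "index.html"
    directory, _, filename = path.rpartition("/")
    if filename.lower() in INDEX_NAMES:
        return directory + "/"
    return None
```

For example, redirect_target("/index.html") gives "/" and redirect_target("/blog/index.html") gives "/blog/", while "/" and "/about.html" need no redirect.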

Here's a reference thread that lists many kinds of URL problems, including yours:
Canonical URL Issues - including some new ones [webmasterworld.com]

It's better to make sure your server doesn't generate ANY canonical problems at all. Here's a thread about how to do that on the Apache server:
A guide to fixing duplicate content & URL issues on Apache [webmasterworld.com]

Another fix, especially useful when you don't have good server access, is to use the canonical link element (rel="canonical") in the page's head section:
Search Engines Agree on "Canonical tag" [webmasterworld.com]
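As a rough illustration of what that element looks like, here is a small Python sketch that builds it for a given URL. The helper name is hypothetical, and the "http://" scheme in the example is an assumption, since the original question only gives "mysite.com/".

```python
from html import escape


def canonical_link(url: str) -> str:
    """Build a rel="canonical" link element for the <head> of a page."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


# Both mysite.com/ and mysite.com/index.html would carry the same element,
# pointing at the one URL you want indexed.
print(canonical_link("http://mysite.com/"))
```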

And finally, I personally don't think this kind of canonical URL issue has anything to do with Panda at all. Google is pretty good at handling this kind of canonical error today and not seeing it as a spam attempt. But it's still best not to put that responsibility anywhere else but on your own shoulders :)

g1smd

2:10 pm on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Make sure the site links only to "/" and not to "/index.html".
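One way to check this is to scan your pages for anchors that still point at an index file. The following is only a sketch using Python's standard html.parser module; the class and function names are made up for illustration, and it only looks for "index.html" in href attributes of a tags.

```python
from html.parser import HTMLParser


class IndexLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at an index.html URL."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Ignore any fragment or query string when testing the path.
                path = value.split("#", 1)[0].split("?", 1)[0].lower()
                if path == "index.html" or path.endswith("/index.html"):
                    self.hits.append(value)


def find_index_links(html_text):
    """Return every link in html_text that should be changed to end in "/"."""
    finder = IndexLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hits
```

Run it over each page's HTML; any link it returns should be edited to point at the "/" form instead.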

Add the rel="canonical" data to the page, or set up the 301 redirect in the site configuration.

It will take Google at least a month to fully factor in this change. Patience is required.