Thanks to the folks on this site, I think I have done almost everything to take care of my duplicate content and canonicalization problems. This is what I have done so far:
1. Got rid of cruft (old, unused files) in the web directory that might be causing duplicate content penalties
2. Set up a robots.txt file to disallow folders that I don't want spiders indexing
3. Set the preferred domain in Google to use www
4. Put 301 redirects in my .htaccess file that first redirect index.php to /, and then redirect http://example.com/ to http://www.example.com/
5. Tested server headers with an online tool to make sure the redirects work properly
6. Used Link Sleuth to check for broken links
7. Created and uploaded a sitemap to Google.
Two sitemap generators, including Link Sleuth, included both
http://www.example.com/
http://www.example.com/index.php
in their sitemaps, so I listed both as well. But is this right? Is this going to tell Google to index and list them separately? There really is only one index file on my site.
Also, I do have a lot of internal links pointing to index.php. If I have already done the redirect above, is it important to change these? If so, would I change this:
<a href="index.php">link text</a>
to this:
<a href="/">link text</a>?
Don't ever mention "index.php" again on your site; having both that and "/" is duplicate content on what is probably your most important page. Do not confuse the spiders...
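On the sitemap question, one option is to clean up the generator output before uploading it, so that only the "/" form of each directory URL is listed. Here is a minimal Python sketch of that idea. The function names `canonical_url` and `dedupe_sitemap` are made up for illustration, and it assumes index.php is the only directory index file on the site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with a trailing index.php removed from its path."""
    parts = urlsplit(url)
    path = parts.path
    # Assumption: index.php is the only directory index file in use.
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def dedupe_sitemap(urls):
    """Canonicalize each URL and drop repeats, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        canon = canonical_url(url)
        if canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result


if __name__ == "__main__":
    generated = [
        "http://www.example.com/",
        "http://www.example.com/index.php",
    ]
    for url in dedupe_sitemap(generated):
        print(url)
```

Run against the two URLs from the generated sitemap, this leaves a single entry for the home page. The same `canonical_url` helper could be used to spot internal links that still point at index.php.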
The wording here is ambiguous. We'll hope you have:
4. Put in 301 redirects in my .htaccess file that redirect index.php to www.example.com/, and also redirect example.com/<anything> to www.example.com/<anything>
Jim
Yes, the wording was ambiguous. I didn't want this forum to get hit with a duplicate content penalty : )
I got this from several places in these forums.
_______ .htaccess file ________________
RewriteEngine On
# redirect index.php to / (should be before the non-www redirect to avoid recursion)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]
# redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]
# below are dupe content pages that have significant age that I want to redirect to new pages
Redirect 301 /file-old.php http://www.example.com/file-new.php
... more
_______________________________________
Does this look right? Will Google ever post this kind of information on their site?
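One way to sanity-check the rule order without touching the server is to mirror the two mod_rewrite rules in a short script and see which redirect each request would get. The Python sketch below is only a model, not Apache itself. It copies the index.php patterns from the listing above, but the target of the non-www rule is cut off in the post, so the script assumes it sends example.com/<anything> to www.example.com/<anything> as described earlier in the thread. The function name `redirect_target` is made up, query strings are ignored, and the `Redirect 301` lines (mod_alias) are not modeled.

```python
import re

# Mirrors the RewriteCond on THE_REQUEST: only a request line that
# literally asks for index.php matches.
THE_REQUEST = re.compile(r"^[A-Z]{3,9} /([^/]+/)*index\.php HTTP/")
# Mirrors the RewriteRule pattern; in .htaccess the path is matched
# without its leading slash.
INDEX_RULE = re.compile(r"^(([^/]+/)*)index\.php$")
# Mirrors the RewriteCond on HTTP_HOST (negated in the rule below).
WWW_HOST = re.compile(r"^www\.")


def redirect_target(host, path, method="GET"):
    """Return the 301 target for a request, or None if no rule fires."""
    request_line = f"{method} {path} HTTP/1.1"
    rel = path[1:] if path.startswith("/") else path

    # Rule 1: index.php -> directory URL on the www host.
    match = INDEX_RULE.match(rel)
    if match and THE_REQUEST.match(request_line):
        return "http://www.example.com/" + match.group(1)

    # Rule 2: non-www host -> www host.
    # ASSUMPTION: the target is truncated in the post; this follows the
    # description "example.com/<anything> to www.example.com/<anything>".
    if not WWW_HOST.search(host):
        return f"http://www.{host}/{rel}"

    return None


if __name__ == "__main__":
    for host, path in [
        ("example.com", "/index.php"),
        ("www.example.com", "/shop/index.php"),
        ("example.com", "/file-new.php"),
        ("www.example.com", "/"),
    ]:
        print(host, path, "->", redirect_target(host, path))
```

Because the index.php rule comes first and already points at the www host, a request for example.com/index.php is answered with a single redirect instead of a chain of two. It is still worth confirming the real behavior with a server header checker, since this model only covers the two rewrite rules.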