Forum Moderators: Robert Charlton & goodroi

Should I list index.php or "/" in the sitemap?

         

momo

12:42 pm on May 12, 2008 (gmt 0)

10+ Year Member



I am wondering if I should list index.php in my sitemap, or just the / url.

Thanks to the folks on this site, I think I have done almost everything to take care of my duplicate content and canonicalization problems. This is what I have done so far:

1. Got rid of cruft (old, unused files) in the web directory that might be causing duplicate content penalties
2. Set up a robots.txt file to disallow folders that I don't want spiders indexing
3. Set the preferred domain in Google to use www
4. Put 301 redirects in my .htaccess file that first redirect index.php to /, and then redirect http://example.com/ to http://www.example.com/
5. Tested server headers with an online tool to make sure the redirects work properly
6. Used Link Sleuth to check for broken links
7. Created and uploaded a sitemap to Google

Two sitemap generators, including Link Sleuth, included both
http://www.example.com/
http://www.example.com/index.php

in their sitemaps, so I used both as well. But is this right? Will this tell Google to index and list them separately? There really is only one index file on my site.
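
As a rough illustration of why only one of those two URLs belongs in the sitemap, here is a minimal Python sketch that collapses index.php entries onto their directory URL and drops the resulting duplicates. The function names `canonical_url` and `dedupe_sitemap_urls` and the `index_names` parameter are made up for this example; they are not part of any sitemap generator.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, index_names=("index.php",)):
    """Map .../index.php onto its directory URL (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host as "/"
    head, _, tail = path.rpartition("/")
    if tail in index_names:
        path = head + "/"  # keep the directory, drop the index file name
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def dedupe_sitemap_urls(urls):
    """Return canonical URLs in their original order, without duplicates."""
    seen = set()
    result = []
    for url in urls:
        canonical = canonical_url(url)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


if __name__ == "__main__":
    sitemap = [
        "http://www.example.com/",
        "http://www.example.com/index.php",
    ]
    print(dedupe_sitemap_urls(sitemap))
```

This only tidies the list of URLs. The generators will keep finding index.php for as long as pages on the site still link to it.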

Also, I do have a lot of internal links pointing to index.php. If I have already done the redirect above, is it important to change these? If so, would I change this:
<a href="index.php">link text</a>

to this:
<a href="/">link text</a>?

htdawg

1:20 pm on May 12, 2008 (gmt 0)

10+ Year Member



It's best to link to / instead of index.php, and to keep all your internal and external links consistent: use either / or index.php, not both.

jdMorgan

1:27 pm on May 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The sitemap generators mention "/" and "index.php" simply because you still have incorrect links remaining to "index.php." Get rid of them, and as you surmise, replace them with "/".

Don't ever mention "index.php" again on your site; having both that and "/" is duplicate content on what is probably your most important page. Do not confuse the spiders...

The wording here is ambiguous. We'll hope you have:
4. Put in 301 redirects in my .htaccess file that redirect index.php to www.example.com/, and also redirect example.com/<anything> to www.example.com/<anything>

Jim

momo

3:11 pm on May 13, 2008 (gmt 0)

10+ Year Member



OK, it makes sense that both "/" and "index.php" showed up in the sitemap, since index.php was referenced in my HTML. I replaced all instances of index.php with /, reran the sitemap generator, and it included only the /.

Yes, the wording was ambiguous. I didn't want this forum to get hit with a duplicate content penalty : )
I got this from several places in these forums.

_______ .htaccess file ________________
RewriteEngine On

# redirect index.php to / (should be before the non-www redirect to avoid recursion)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]

# redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]

# below are dupe content pages that have significant age that I want to redirect to new pages
Redirect 301 /file-old.php http://www.example.com/file-new.php
... more
_______________________________________

Does this look right? Will Google ever post this kind of information on their site?
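
One way to sanity-check the index.php rule offline is to replay its two patterns in Python. This is only a sketch: it assumes Python's `re` module treats these particular patterns the same way Apache's regex engine does, `index_redirect` is a hypothetical helper, and `path` stands for the per-directory path that mod_rewrite matches in .htaccess (no leading slash). It does not model the non-www rule.

```python
import re

# Patterns copied from the RewriteCond and RewriteRule for index.php above.
THE_REQUEST_RE = re.compile(r"^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/")
RULE_RE = re.compile(r"^(([^/]+/)*)index\.php$")


def index_redirect(request_line, path):
    """Return the 301 target the index.php rule would produce, or None.

    request_line: the original client request line (THE_REQUEST).
    path: the URL path as seen in .htaccess, without a leading slash.
    """
    if not THE_REQUEST_RE.search(request_line):
        return None  # the client did not literally ask for index.php
    match = RULE_RE.match(path)
    if match is None:
        return None
    # $1 in the rule is the directory prefix captured by the first group.
    return "http://www.example.com/" + match.group(1)


if __name__ == "__main__":
    print(index_redirect("GET /index.php HTTP/1.1", "index.php"))
    print(index_redirect("GET /blog/index.php HTTP/1.1", "blog/index.php"))
    # Internal mapping of "/" to index.php: the request line is still "GET /".
    print(index_redirect("GET / HTTP/1.1", "index.php"))
```

The last case is the one the THE_REQUEST condition exists for. When the server internally serves index.php for a request to /, the original request line is still `GET / HTTP/1.1`, so the rule does not fire and no redirect loop occurs.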