Forum Moderators: DixonJones
P.S. I use exact URLs in my site navigation, e.g., xyz.com/blue-widgets/index.html, if that makes any difference.
If you see two entries for the same client request, and the first one indicates a 301, 302, 303, or 307 redirect response, while the second one shows a 200-OK or a 304-Not Modified response, then that indicates that you've got a redirect on your server from the first-requested URL-path to the second.
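If it helps, here is a rough Python sketch of that check. It assumes your server writes Apache-style common/combined log lines; the sample entries, the IP address, and the function name find_redirect_pairs are made up for illustration, so adjust the pattern to whatever your log format really is.

```python
import re

# Assumed layout: Apache common/combined log format, e.g.
# client ident user [timestamp] "METHOD /path PROTOCOL" status bytes ...
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})')

REDIRECTS = {301, 302, 303, 307}   # first entry of a redirect pair
SERVED = {200, 304}                # second entry of a redirect pair


def find_redirect_pairs(lines):
    """Return (client, first_path, second_path, redirect_status) tuples for
    consecutive entries from one client where a redirect response is
    followed by a 200 or 304 response."""
    last_seen = {}   # client -> (path, status) of that client's previous entry
    pairs = []
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        client, path, status = match.group(1), match.group(3), int(match.group(4))
        previous = last_seen.get(client)
        if previous and previous[1] in REDIRECTS and status in SERVED:
            pairs.append((client, previous[0], path, previous[1]))
        last_seen[client] = (path, status)
    return pairs


# Hypothetical sample entries, not taken from anyone's real log.
sample = [
    '192.0.2.1 - - [10/Oct/2008:13:55:36 -0700] "GET /blue-widgets/index.html HTTP/1.1" 301 245',
    '192.0.2.1 - - [10/Oct/2008:13:55:36 -0700] "GET /blue-widgets/ HTTP/1.1" 200 5120',
]

for pair in find_redirect_pairs(sample):
    print(pair)
```

This only pairs up back-to-back entries per client, so treat the results as hints to look at by hand, not as proof of a redirect.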
If you're saying there are two different 'kinds' of log entry for the same 'home page' and you only want to see one 'style' or the other, then that is something you need to fix in your linking.
BTW, it is considered 'best practice' to omit "/index.html" from all links, and simply link to "/". Defining index.html as your DirectoryIndex obviates the need to link to /index.html.
If none of the above is helpful, please post a small number of relevant samples from your log file, so we can discuss things more specifically.
Jim
If both of them work and directly serve content, then you have a Duplicate Content issue.
You should link to "/" and you should set up a 301 redirect so that requests for "/index.html" are redirected to "/".
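For the linking half of that, a small script can do the tedious part. This is only a sketch, assuming your links are plain double-quoted href attributes that end in "/index.html"; the names strip_index and rewrite_links are mine. It does not create the 301 redirect, which still has to be configured on the server.

```python
import re

INDEX_NAME = "index.html"  # assumption: this is the site's DirectoryIndex file


def strip_index(path):
    """Turn '/dir/index.html' into '/dir/'; leave every other path alone."""
    if path.endswith("/" + INDEX_NAME):
        return path[: -len(INDEX_NAME)]
    return path


# Matches href="...something/index.html" and captures everything up to the slash.
HREF_RE = re.compile(r'href="([^"]*/)index\.html"')


def rewrite_links(html):
    """Rewrite double-quoted hrefs ending in /index.html to end in / instead."""
    return HREF_RE.sub(r'href="\1"', html)


print(strip_index("/blue-widgets/index.html"))
print(rewrite_links('<a href="/blue-widgets/index.html">Blue widgets</a>'))
```

Relative links written as a bare href="index.html" (no slash) are deliberately left alone here, since the right replacement depends on the page they appear in. Run it on a copy of the site first and diff the result.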
This has opened up a whole new can of worms for me. You are both right as all my internal linking is done as dir/index.html as opposed to just dir/. As it turns out, this is exactly why I'm getting duplicate entries. My site has been online since 1998 and I can't believe this issue is just coming up now as I've always done my internal linking this way.
I'm contemplating making changes, e.g., removing .html on all directory links. I'm a little afraid of what this change will do to my SERPs (specifically on "G"). Do either of you think this change will have an adverse effect on them? Could you give me a few pointers on the best way to make these changes without screwing up my SERPs? E.g., should I just strip the .html, or also do a 301 for the links I change?
Also, on another note, my site map is set up the same way with dir/index.html. Should I also change it to just dir/? Sorry to sound like such an idiot but this has really thrown me for a loop.
Thanks again for the help.
Do a search here on WebmasterWorld (link at top of every page) for subjects such as "canonical domain", "canonical URL", "change /index to /" (or root), "duplicate content", and the similar phrases you'll find in these threads -- There are many threads on these subjects here, some of them very recent [webmasterworld.com], but older ones of value as well. The Library for each forum can be of help as well. Some forums also have a "hot topics" thread pinned at the top of their individual thread lists, and their Forum Charters may contain links to useful references.
Reviewing these resources will help you get up to speed on these issues, and allow an informed decision.
That said, if this were my site, I would start by changing "/xyz/index.html" links to "/xyz/" on a small percentage (say 10 to 20%) of your lowest-level pages, and adding specific redirects for each, working up toward the top of your site's structure. The idea is that changing the lower-level page URLs won't "hurt" as much if they lose ranking temporarily.
Once the URLs for the lowest-level pages have all been changed, re-indexed, and return to normal ranking, then work up to the next level. By the time you get to the top-level pages, the lower-level pages will then provide a solid linking foundation to "support" the changes on the higher-level page URLs and the home page.
After all URLs have been updated, you can replace the (probably numerous) individual and/or group redirect directives with a single site-wide index-to-slash redirect.
It is critical that these changes be implemented correctly. When creating 301 redirects, install and use the "Live HTTP Headers" add-on for Firefox/Mozilla to verify that your redirects return a proper 301-Moved Permanently redirect HTTP response header, and that any "error" in a requested URL is "corrected" with a single 301 redirect to the proper URL.
For example, if redirecting "www.example.com/xyz/index.html" to "www.example.com/xyz/", then a request for "example.com/xyz/index.html" (no 'www') should also be redirected to "www.example.com/xyz/" by the same single redirect -- correcting both the domain and the URL-path at the same time. Not only must your individual redirect directives be coded correctly, but they must be in the correct order to make this happen. Test, test, test, and then test again... :)
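To make the "single redirect" idea concrete, here is a minimal Python sketch of the rule and a way to spot-check it. It is not server configuration, and it uses Python's standard http.client in place of the Live HTTP Headers add-on. CANONICAL_HOST and the function names are assumptions for illustration.

```python
import http.client
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # assumption: the hostname you want indexed
INDEX_NAME = "index.html"           # assumption: the DirectoryIndex file name


def canonical_target(url):
    """Return the one URL a request should be 301-redirected to, fixing the
    hostname and the /index.html path together, or None if the URL is
    already canonical."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/" + INDEX_NAME):
        path = path[: -len(INDEX_NAME)]
    if parts.netloc.lower() == CANONICAL_HOST and path == (parts.path or "/"):
        return None
    return urlunsplit((parts.scheme, CANONICAL_HOST, path, parts.query, ""))


def fetch_status_and_location(host, path):
    """Send one HEAD request without following redirects and return the
    status code and Location header the server actually sent."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


print(canonical_target("http://example.com/xyz/index.html"))
print(canonical_target("http://www.example.com/xyz/index.html"))
print(canonical_target("http://www.example.com/xyz/"))
```

To test a live server, call fetch_status_and_location("example.com", "/xyz/index.html") and confirm that the status is 301 and that the Location header equals what canonical_target gives for the same URL. If the Location points at an intermediate URL that itself redirects, the directives are in the wrong order.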
Jim
Your advice is very sound. I've already started to scour WW for the threads you mentioned. I think I will take a long-term approach to this, as my traffic/income is great right now and gathering all the info I need from WW threads will take time. Bottom-up is a great idea, especially since I can track results in the SERPs on some of those pages.
P.S. Any thoughts on what to do with my sitemap links? Should I try the same bottom-to-top fix on that too?
Thanks again for your help!