For example, instead of this URL: [mydomain.com...]
you would use this URL: [mydomain.com...]
You keep the file extension (such as .html) on the file (e.g. about.html), but refer to the Web resource without it.
A couple of weeks ago, I set up Apache to do content negotiation (via .htaccess). Everything works fine: I can access my pages with or without the filename extension.
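For anyone following along, a setup like the one described is often just one line. This is a minimal sketch, not necessarily the poster's actual file, and it assumes the server's main config permits option overrides in .htaccess (AllowOverride Options or All):

```apache
# .htaccess sketch (assumption: mod_negotiation is loaded and
# AllowOverride permits "Options" for this directory).
# With MultiViews on, a request for /about is matched against
# about.* in the same directory, so about.html gets served.
Options +MultiViews
```

With this in place, both /about and /about.html return the same file, which is exactly why the duplicate-content question below comes up.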
Now there seem to be two problems:
1. In my logfiles, there is no trace of Googlebot, so I wonder whether Google will index this type of page.
2. Is it considered duplicate content when people refer to a page both with and without the filename extension?
(To me it's similar to using [mydomain.com...] or [mydomain.com...])
it's possible to view your files both with and without the (html) extension?
Yes, no problems.
also, the people already linking to you - are they using urls with extensions?
Until now they have used URLs with extensions, because I only changed this a few weeks ago and my site navigation still shows URLs with extensions.
To test this feature, I submitted only one particular page to Google in order to see whether Googlebot would index it. If Google won't index this page, it's useless to proceed with this content negotiation thing...
are you sure you need to have your URLs without an extension?
the article you mention also says how important it is not to change your urls.
if you are already ranked for .html and have incoming links which go to .html pages then why bother taking the extension off?
i've been through exactly that situation ;-) and with scripting languages and .htaccess you can run any language (html, php, asp, etc) with any extension. so if you go dynamic you can keep the same urls.
just have a good think about if you really need the urls to change :-)
p.s. i kept the old urls and boy did that save me stress with SE-stuff (asking for link updates, resubmitting, 301 redirects, etc ;-)
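To illustrate the point about running any language under any extension: a rough sketch, assuming PHP is installed as an Apache module (mod_php) and that AllowOverride FileInfo is permitted. The handler name differs for CGI or FastCGI setups, so check your host's documentation before relying on it:

```apache
# .htaccess sketch (assumption: mod_php is the PHP handler on this server).
# Tells Apache to pass files ending in .html through PHP,
# so existing .html URLs can stay unchanged after going dynamic.
<FilesMatch "\.html$">
    SetHandler application/x-httpd-php
</FilesMatch>
```

The URLs people already link to keep working, and only the way the server processes the files changes.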
From the article:
"It the the duty of a Webmaster to allocate URIs which you will be able to stand by in 2 years, in 20 years, in 200 years. This needs thought, and organization, and commitment."
Right! Sounds great! Even the grammar errors are cool!
While the article has some great ideas, it assumes a great deal and I found it to be like saying:
"Why should you move to a different State/Country? It just shows you don't know where you want to live."
or
"You want to change careers? Too bad, you should have thought of that in the womb!"
In theory, the article is great. In reality and in REAL LIFE...it doesn't work that way.
(Yes- the article ticked me off. Nothing personal, it just rubbed me the wrong way.)
Your incoming links are another issue. If you have many deep incoming links and two ways of reaching the same page, it's just like having two domain names when it comes to linking; this might get you a double entry in Google (which is a duplicate) and the PR of that page will be different for the ".html" and "non-html" version, as you will have a different set of links pointing to each.
It might be an idea to set up a 301 redirect from ".html" to "non-html" if you want to be sure that each page is treated as a single URL, and to get the duplicates removed from the SEs. Or the other way round, that's up to you; consistency is more important than which way the redirect goes, afaik.
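A minimal sketch of the ".html" to "non-html" direction, assuming mod_rewrite is available and the extensionless URLs already resolve (for example via content negotiation). Treat it as a starting point and test it on one page first:

```apache
# .htaccess sketch (assumption: mod_rewrite is enabled and
# AllowOverride FileInfo is permitted).
RewriteEngine On

# Match only what the client literally requested, e.g.
# "GET /about.html HTTP/1.1". Checking THE_REQUEST rather than the
# current URI avoids a redirect loop when the server internally
# maps /about back to about.html.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^?\ ]+)\.html[?\ ]

# Send a permanent redirect to the same path without the extension.
# Any query string is carried over automatically.
RewriteRule ^ /%1 [R=301,L]
```

For the opposite direction you would leave content negotiation off and keep linking to the ".html" URLs; either way, pick one form and use it in your own navigation too.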
/claus