/page.html and /page

gives same page, no .htaccess

         

smallcompany

6:42 pm on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I just found that requesting /page serves the content of page.html on some of my sites (all on the same host).

I deleted .htaccess to check whether it was the cause, and nothing changed.

It's a VPS, so before I go (with my limited knowledge) and start playing with the main config file, I wanted to ask: what could make a site behave like this if it's not .htaccess?

Thanks

lammert

7:09 pm on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If there is no .htaccess, it may be caused by settings in the global httpd.conf. This kind of behavior is caused by content negotiation. You will probably find an Options line that includes MultiViews somewhere in httpd.conf.

smallcompany

10:34 pm on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just checked, and found two sections in the config file that contain MultiViews:

Options FollowSymLinks MultiViews Includes
AddHandler cgi-script .cgi .pl .py .sh

and

Alias /icons/ "/var/www/icons/"
<Directory "/var/www/icons">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>

I couldn't figure out how either of the two above would make /page show the content of /page.html

...but then... I saw this:

# DirectoryIndex: sets the file that Apache will serve if a directory
# is requested.
#
# The index.html.var file (a type-map) is used to deliver content-
# negotiated documents. The MultiViews Option can be used for the
# same purpose, but it is much slower.
#
DirectoryIndex index.html index.htm index.shtml index.php index.php4 index.php3 index.phtml index.cgi index.asp default.asp index.mv default.htm index.pl index.py index.cf

Ta naaaa... I thought this was it, but then I figured it applies to directories only, and only to index file names.

So what is it then?

Finally, my only worry about this is duplicate content.

I even have a redirect that says:

RewriteRule ^page$ http://www.example.com/page.html [R=301,L]

and yet I still end up with http://www.example.com/page if I enter it into the browser.

:s

encyclo

11:18 pm on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's this one:

Options FollowSymLinks MultiViews Includes

Remove MultiViews from that line :)

(See [httpd.apache.org...] for an explanation)
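A minimal sketch of that edit in httpd.conf. The thread does not show which block the line sits in, so the Directory path below is hypothetical; use whatever section actually contains the line on your server:

```apache
# httpd.conf (sketch; "/var/www/html" is a hypothetical document root)
<Directory "/var/www/html">
    # Before: Options FollowSymLinks MultiViews Includes
    # After:  MultiViews removed, so /page is no longer negotiated to page.html
    Options FollowSymLinks Includes
</Directory>
```

Apache has to be restarted or reloaded before a change to the main configuration takes effect.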

If you absolutely don't want to touch the main configuration file, you can always add

Options -MultiViews
(note the minus sign) in your root-level .htaccess. You do absolutely have to fix this, as MultiViews can cause havoc with duplicate content, and it interferes with the use of mod_rewrite.
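A minimal sketch of the .htaccess variant, reusing the redirect quoted earlier in the thread. It assumes the main configuration lets .htaccess set Options and mod_rewrite directives (via AllowOverride). The RewriteEngine line is also an assumption, since the original file was not posted:

```apache
# Root-level .htaccess (sketch)

# Turn off content negotiation so /page is no longer mapped to page.html
Options -MultiViews

# Assumed: mod_rewrite is loaded and has not been switched on elsewhere
RewriteEngine On

# The redirect from the earlier post, unchanged
RewriteRule ^page$ http://www.example.com/page.html [R=301,L]
```

With MultiViews off, the pattern ^page$ gets to see the request for /page. With it on, the request has probably been negotiated to page.html before the rule is tested, which would explain why the redirect never fired.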

smallcompany

1:08 am on Jan 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You do absolutely have to fix this

Assuming everything you say is right (and I trust you), why the heck would this host do that?
And this is not just any host, but one of those mentioned here quite a bit as being widely used.

encyclo

2:08 am on Jan 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



why the heck would this host do that?

There's nothing intrinsically wrong with the default setup by the hosting company. It's just that if you are interested in search engine ranking and you are not specifically using content negotiation, the supplied default is sub-optimal.

MultiViews / content negotiation is a cool tool and the Apache implementation is excellent. Its biggest weakness is that it does not fit the way search engines handle duplicate content: the same content is available via a multitude of URLs and, conversely, different content (e.g. language negotiation) can appear under the same URL.

It can help reduce not-found errors and some sites inadvertently depend on it, so often you'll find default VPS setups with it turned on. It probably saves a few support calls to leave a bunch of Apache modules enabled by default and ready for use.

So, I don't blame the hosting company. You should really make a copy of the httpd.conf file and read up on what everything means, and whether the defaults suit you. The Apache configuration is comprehensive and well written, and the config file usually has copious explanatory notes for each setting.

One of the advantages of a dedicated or virtual private server is that you can tweak Apache to get the best setup for your site, and disable modules which are not required.

smallcompany

3:06 am on Jan 17, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thank you, and thank you lammert for pointing to the right one.

I was confused by the other Options entries, as I did not even know that you can list them all on one line.