Redirecting IP Address

Duplicate content issue

         

jinxed

9:54 pm on Feb 3, 2011 (gmt 0)

10+ Year Member



I think I may have discovered a potentially damaging issue with one of my websites: a number of pages have been indexed under the site's IP address as well as its domain name, i.e. duplicate content.

To remedy this, I have added the following code to the top-level .htaccess file:

Options +FollowSymLinks
RewriteEngine on

RewriteCond %{HTTP_HOST} ^example\.co\.uk [NC,OR]
RewriteCond %{HTTP_HOST} ^123\.45\.678\.90
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]


As a result, when I go to:

http://123.45.678.90/folder-one/folder-two/key-word.html


I am being directed to:

http://www.example.co.uk/folder-one/show_drink.php?url_filename=key-word


Instead of the desired:

http://www.example.co.uk/folder-one/folder-two/key-word.html


So, the code is being read and the site is no longer served under the IP address, but the new issue is that the code to turn the dynamic URL into the static URL is no longer being applied.

Am I missing something simple here? There is a separate .htaccess file in /folder-two/ which rewrites the URL to the static alternative, i.e. http://www.example.co.uk/folder-one/folder-two/key-word.html

Any guidance would be very much appreciated as I think this could turn into a search engine nightmare.

jinxed

12:37 pm on Feb 5, 2011 (gmt 0)

10+ Year Member



Update:

I think the code above works fine and the problem is with the code in the .htaccess file located within /folder-two/

g1smd

8:51 am on Feb 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The problem is likely that the redirect is being invoked after the internal rewrite. That will always expose the internal filepath.

The solution is to list the redirect code before the rewrite code and to make sure that all redirects and rewrites are coded using RewriteRule syntax (not Redirect or RedirectMatch).

Another part of the problem may be "where" the code has been placed. It might be in the wrong folder. It might need to be moved to a higher level folder.
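
As a minimal sketch of the ordering described above, assuming both rule sets are moved into the root .htaccess file: the external redirect is listed first, followed by the internal rewrite. The rewrite pattern here is hypothetical, since the actual rule from /folder-two/ was not posted. Only the show_drink.php target and the url_filename parameter come from the URLs quoted earlier in the thread.

```apache
RewriteEngine on

# 1. External redirect first: non-canonical hosts go to the canonical hostname.
RewriteCond %{HTTP_HOST} ^example\.co\.uk [NC,OR]
RewriteCond %{HTTP_HOST} ^123\.45\.678\.90
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]

# 2. Internal rewrite second (hypothetical pattern, not the poster's actual rule):
#    map the static URL onto the script without changing the visible address.
RewriteRule ^folder-one/folder-two/([^/.]+)\.html$ /folder-one/show_drink.php?url_filename=$1 [L]
```

By default, mod_rewrite rules in a subfolder's .htaccess file replace those of a parent folder rather than add to them (unless RewriteOptions Inherit is used), which is one reason keeping both rule sets in one file makes the order easier to control.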

jinxed

5:27 pm on Feb 8, 2011 (gmt 0)

10+ Year Member



Thanks g1smd

Unfortunately this issue is totally different from what I first thought.

Originally I thought that this new code for the IP redirection was the reason for the parameters being exposed, but it now turns out that the 301 redirect rules in all my folders (except the root .htaccess file) are not working, hence the problem.

I have set up a temporary fix for this by adding PHP code that sends a 301 header redirect (if the code detects that the requested page is the dynamic URL) while I figure out the correct code for the .htaccess files in these folders. Would this approach be a problem? Could this code just be left in?

wilderness

6:04 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pardon my intrusion.

Here's one issue:

RewriteCond %{HTTP_HOST} ^123\.45\.678\.90


An IP range is NOT a host, rather an address.
Should read:

RewriteCond %{REMOTE_ADDR} ^123\.45\.678\.90

g1smd

6:18 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, HTTP_HOST will contain an IP address and not the domain name if the IP of the server is what the user requested.

The correct remedy is indeed to redirect such requests to the canonical domain name, testing the value of HTTP_HOST to do so.
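
To illustrate the distinction, a brief sketch: HTTP_HOST holds the value of the Host header the client sent (a hostname or, if the site was requested by IP, the server's IP address), whereas REMOTE_ADDR holds the IP address of the visitor making the request. The addresses are the placeholder values used in this thread.

```apache
# Matches when the site was requested by the server's own IP address,
# e.g. http://123.45.678.90/page.html
RewriteCond %{HTTP_HOST} ^123\.45\.678\.90
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]

# By contrast, a condition like the following tests the visitor's address,
# so it would only match requests coming from that one client:
# RewriteCond %{REMOTE_ADDR} ^123\.45\.678\.90
```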

jinxed

6:55 pm on Feb 8, 2011 (gmt 0)

10+ Year Member



All comments welcome :)

Unfortunately, substituting the HTTP_HOST code with REMOTE_ADDR no longer redirects the IP to the domain name. Any obvious reasons why?

The new code was:

RewriteCond %{HTTP_HOST} ^example\.co\.uk [NC,OR]
RewriteCond %{ REMOTE_ADDR } ^123\.45\.67\.89
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]


Also, would temporarily using the PHP header code instead of the .htaccess file be a problem? (A check with Live HTTP Headers confirmed the correct 301 status code is being produced).
Again, thank you for the advice.

g1smd

8:36 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As I explained immediately above, when testing the IP address HTTP_HOST is what you need, not REMOTE_ADDR.

jdMorgan

6:41 pm on Feb 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You might find it far simpler to redirect *all* non-blank non-canonical hostname requests to the canonical hostname:

# Externally redirect all non-blank non-canonical hostname requests to canonical host
RewriteCond %{HTTP_HOST} !^(www\.example\.co\.uk)?$
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]

Also, do NOT feel free to add spaces "just anywhere" in your code. Unless required by the documented syntax, spaces will cause server errors. mod_rewrite is not at all "free-form code" -- You must use the syntax exactly as specified in the documentation.
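
For example, compare the condition posted earlier in this thread, which has spaces inside the variable braces, with the documented form. This is only an illustration of the spacing point, and the addresses are the thread's placeholders.

```apache
# Not the documented form: spaces inside %{ } (as posted earlier)
# RewriteCond %{ REMOTE_ADDR } ^123\.45\.67\.89

# Documented form: no spaces inside the braces; whitespace only separates
# the directive, the test string, the pattern and the flags
RewriteCond %{HTTP_HOST} ^123\.45\.67\.89
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]
```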

Jim