.htaccess duplicate content redirect problem

         

lobas

2:06 pm on Feb 12, 2011 (gmt 0)

10+ Year Member



Hi,

Google is now indexing unwanted URLs on my site, which is causing duplicate content issues because my .htaccess is not redirecting them properly.

Google has cached URLs such as the following:

http://www.somesite.co.uk/?http://www.somesite.co.uk/


Options +FollowSymlinks
RewriteEngine on



# Externally redirect direct client requests for script filepaths back to "friendly" extensionless URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]*/)*[^.]+\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]*/)*[^.]+)\.php$ /$1/ [R=301,L]

# Externally redirect to add missing trailing slashes to extensionless URL requests
RewriteCond $1 !\.[a-z]+[0-9]?$
RewriteRule ^(.*[^/])$ http://www.somesite.co.uk/$1/ [R=301,L]

# Externally redirect all non-canonical hostname requests to canonical hostname
RewriteCond %{HTTP_HOST} !^www\.somesite\.co\.uk$
RewriteRule ^(.*)$ http://www.somesite.co.uk/$1 [R=301,L]

# Internally rewrite all requests ending with slash to php scripts
RewriteRule ^(.*)/$ /$1.php [L]

g1smd

3:37 pm on Feb 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You need to include the hostname in the redirect target for the first ruleset; otherwise, you will get an unwanted multi-step redirection chain for some requests.

I don't see any provision in your current ruleset to handle the URLs that you say are causing problems.

You'll either need another rule in the .htaccess to handle this, or it should be handled by your PHP script returning a 404 status for this invalid request.

lobas

12:47 pm on Feb 13, 2011 (gmt 0)

10+ Year Member



Hi, thanks for replying. So how would I do that? Like below?


# Externally redirect direct client requests for script filepaths back to "friendly" extensionless URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]*/)*[^.]+\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]*/)*[^.]+)\.php$ http://www.somesite.co.uk/$1/ [R=301,L]

lobas

1:55 pm on Feb 14, 2011 (gmt 0)

10+ Year Member



Can anybody help, please?

lobas

8:46 pm on Feb 15, 2011 (gmt 0)

10+ Year Member



I have also tried RewriteRule ^(.*)/\?$ [somesite.co.uk...] [R=301,L]
with no luck.

g1smd

10:42 pm on Feb 15, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That rule can never work, because the path part of the URL request can never "contain" a question mark.

The RewriteRule pattern can match ONLY the path part of the request.

lobas

10:55 pm on Feb 15, 2011 (gmt 0)

10+ Year Member



Right, so it's impossible with .htaccess? I've tried

Redirect 301 http://www.somesite.co.uk/?http://www.somesite.co.uk http://www.somesite.co.uk

and it does not work.

Can you show me an example, please? I'm useless with .htaccess.

jdMorgan

3:11 am on Feb 18, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




# Externally redirect requests for the URL-path
# /<anything-or-nothing>?http://www.somesite.co.uk<something_or_blank> to
# http://www.somesite.co.uk/<anything-or-nothing>
RewriteCond %{QUERY_STRING} ^http://www\.somesite\.co\.uk
RewriteRule ^(.*)$ http://www.somesite.co.uk/$1? [R=301,L]

Note that there is nothing in the rules that you posted above that would cause Google to start indexing URLs like "http://www.somesite.co.uk/?http://www.somesite.co.uk/".

Your rules don't do that, so look for that problem elsewhere.

Jim
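
For reference, the advice in this thread could be combined into one file roughly as follows. This is only a sketch, not a tested configuration: it takes the original ruleset, adds the hostname to the first redirect target as suggested earlier, and adds the query-string rule. Putting the query-string rule first is an assumption on my part; the thread does not say where it should go, and some requests (for example, one that also lacks a trailing slash) may still be redirected twice.

```apache
Options +FollowSymLinks
RewriteEngine on

# Externally redirect requests whose query string starts with the site's own URL.
# The trailing "?" on the target clears the query string.
# (Placement as the first rule is an assumption, not stated in the thread.)
RewriteCond %{QUERY_STRING} ^http://www\.somesite\.co\.uk
RewriteRule ^(.*)$ http://www.somesite.co.uk/$1? [R=301,L]

# Externally redirect direct client requests for script filepaths back to
# "friendly" extensionless URLs, with the hostname in the target
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]*/)*[^.]+\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]*/)*[^.]+)\.php$ http://www.somesite.co.uk/$1/ [R=301,L]

# Externally redirect to add missing trailing slashes to extensionless URL requests
RewriteCond $1 !\.[a-z]+[0-9]?$
RewriteRule ^(.*[^/])$ http://www.somesite.co.uk/$1/ [R=301,L]

# Externally redirect all non-canonical hostname requests to canonical hostname
RewriteCond %{HTTP_HOST} !^www\.somesite\.co\.uk$
RewriteRule ^(.*)$ http://www.somesite.co.uk/$1 [R=301,L]

# Internally rewrite all requests ending with slash to php scripts
RewriteRule ^(.*)/$ /$1.php [L]
```

The earlier attempts in the thread fail for the same underlying reason: the RewriteRule pattern (and the first argument of mod_alias's Redirect, which expects a URL-path rather than a full URL) never sees the query string, so the "?..." part has to be tested with RewriteCond %{QUERY_STRING}.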