Rewrite permanent 301 when URL has // double slash

Double Slash 301 error

         

simon_a6

4:12 pm on Dec 10, 2014 (gmt 0)

10+ Year Member



Hi. I'm new here, but not so new to PHP.
I am however a bit of a novice with htaccess.

We have a large number of errors in Google Webmaster Tools, most of them for short URLs that contain //. For one reason or another, Google cached these URLs while the ID numbers were missing.

Since we also cannot point someone to the right "next available page", how do I point someone with a // in the url (apart from the one in the domain part, //www....) to a new URL, like "/error"?

If it can be done generically, that would be fine: any URL that has a // in it after the first one would be caught.

A URL like this is the problem:
http://www.example.com/product//LARGE-XL//SHIRT.
But there are examples with three sets of // in the URL, from the missing subcategory ID, category ID and product ID.

[edited by: Ocean10000 at 5:19 pm (utc) on Dec 10, 2014]
[edit reason] examplfied [/edit]

not2easy

5:43 pm on Dec 10, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hi simon_a6, welcome to the forums. Before looking at how to handle the unwanted URLs, is there a way you could insert some identifying code when a field would otherwise be empty? That would let you generate a URL that is easier to manage, whether the placeholder is a generic "CAT51" when the "CAT" field is empty or a unique identifier based on another field in the table, such as "COLOR" or "SKU".

lucy24

7:34 pm on Dec 10, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"someone with a // in the url"

I've just done some experimenting, and made an unexpected discovery which someone else will have to explain. (Disclaimer: It may not apply on all servers. But it isn't browser-specific.)

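Multiple slashes are not detected by the "pattern" of a RewriteRule, where they would be expressed as
(^|/)/

They can only be seen in a RewriteCond. REQUEST_URI always begins with a single slash, so there the test has to be for two slashes in a row. So this
RewriteCond %{REQUEST_URI} //

will work, but this
RewriteRule (^|/)/
etcetera
will not.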

Try this on your own site with a made-up URL. If you find the same thing, it's bad news because it means your server will have to backtrack and evaluate conditions on every single page request. It's an exception to the general rule of "anything involving a positive URI match belongs in the body of the Rule".
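A minimal sketch of that test, assuming an htaccess file in the site root with mod_rewrite available. The two target paths are made-up names whose only job is to show which rule fired, and the redirects are temporary (302) so nothing gets cached while experimenting.

```apache
RewriteEngine On

# Variant 1: look for the doubled slash in a condition.
# REQUEST_URI always begins with "/", so test for two slashes in a row.
RewriteCond %{REQUEST_URI} //
RewriteRule ^ /slash-seen-by-cond [R=302,L]

# Variant 2: look for the doubled slash in the rule pattern itself.
# In htaccess the leading slash is stripped before the pattern is
# matched, hence the (^|/)/ form.
RewriteRule (^|/)/ /slash-seen-by-rule [R=302,L]
```

Requesting something like /aaa//bbb and seeing where the browser lands shows which variant detected the slashes. If neither fires, the server may be collapsing repeated slashes before mod_rewrite sees the path; %{THE_REQUEST}, which holds the raw request line, is another place to look in that case.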

Now, it can all be done in htaccess using a series of separate rules. For example

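# Only act on requests whose path contains a doubled slash
RewriteCond %{REQUEST_URI} //
# Capture the four path segments, skipping any repeated slashes
RewriteCond %{REQUEST_URI} ^/*([^/]+)/+([^/]+)/+([^/]+)/+([^/]+)$
RewriteRule (^|/|php)$ http://www.example.com/%1/%2/%3/%4 [R=301,L]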

and then repeat the same rule for shorter paths. Replace
(^|/|php)$

with a pattern that will match all page requests on your own site, while excluding supporting files. You do not need to say anything about the query string.
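As an illustration of "repeat the same rule for shorter paths", here is a sketch of a three-segment version that would cover a URL shaped like /product//LARGE-XL//SHIRT. The bare ^ pattern is an assumption: it matches every request and leaves the filtering to the conditions, so narrow it if supporting files should be skipped. RewriteEngine On is assumed to be set already.

```apache
# Three-segment variant, e.g. /product//LARGE-XL//SHIRT
# First condition: the path contains a doubled slash.
RewriteCond %{REQUEST_URI} //
# Second condition: capture the three segments as %1, %2, %3.
RewriteCond %{REQUEST_URI} ^/*([^/]+)/+([^/]+)/+([^/]+)$
RewriteRule ^ http://www.example.com/%1/%2/%3 [R=301,L]
```

This only collapses the slashes. Whether the resulting URL is a real page depends on how the site builds its links, so it may still need to go somewhere else when an ID really is missing.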

But if you've got a lot of different possible patterns, it may be better to route everything to a php file which will perform any required substitutions and issue a redirect.
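The htaccess side of that approach could be as small as the sketch below. The script name fix-slashes.php is hypothetical; the idea is that the script reads the originally requested URL (in PHP, $_SERVER['REQUEST_URI']), works out the right target and sends the 301 itself.

```apache
# Hand every request containing a doubled slash to one script.
# fix-slashes.php is a made-up name for illustration.
RewriteCond %{REQUEST_URI} //
RewriteRule ^ /fix-slashes.php [L]
```

Because this is an internal rewrite rather than a redirect, the visitor only sees whatever redirect the script decides to issue.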

simon_a6

9:36 am on Dec 11, 2014 (gmt 0)

10+ Year Member



Hi All.
The issue is where a category ID or product ID was missing from an a href link, and Google's bot saw it before I had the chance to repair it.

It has now cached it. I wish there was a way to say to Google "this isn't even like that anymore, so ignore it". But to do that, I think you have to do a 301.

So:
1) can this be done URL by URL (i.e. if any one of them can be pointed to its TRUE URL, then do it), or
2) can a generic rule be applied so that if a // is found after the domain part, the request is just routed to a /error page?

lucy24

10:52 am on Dec 11, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



1) Yes, if you like.
2) Yes, if you like.
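
For reference, both options might look something like the sketch below in htaccess. The /error target comes from the question; the "true" URL in the first rule, including its ID numbers, is invented purely for illustration. The specific rule goes first so that it wins over the generic one.

```apache
RewriteEngine On

# 1) One known bad URL sent to its true address.
#    The target path and its IDs are made up for illustration.
RewriteCond %{REQUEST_URI} ^/product//LARGE-XL//SHIRT$
RewriteRule ^ http://www.example.com/product/123/LARGE-XL/456/SHIRT [R=301,L]

# 2) Anything else with a doubled slash in the path goes to /error.
RewriteCond %{REQUEST_URI} //
RewriteRule ^ http://www.example.com/error [R=301,L]
```

The // in http:// is not part of the path, so it never triggers either condition.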