Multiple 301s result of passing forward slashes

Request for URL goes 301 twice because I'm trying to track origin

         

mrjohncory

7:50 pm on Apr 28, 2016 (gmt 0)

10+ Year Member



My company just merged. I had two sites: Site1.tld and Site2.tld

Now I only have Site3.tld and want to move people there but use Google Analytics to monitor where they came from. Unfortunately sites 1 + 2 did not have the same structures, so I couldn't just do 1:1 address matches.

I tried to use RedirectMatch 301 and $ to do something like this on Site1, for example:

RedirectMatch 301 ^/$ http://site3.com/?utm_source=site_1\&utm_medium=redirect\&utm_term=/$1\&utm_campaign=merger_2016


Result:

http://site1.com/one/two/three/

http://site3.com/?utm_source=site_1&utm_medium=redirect&utm_term=%2Fone%2Ftwo%2Fthree%2F&utm_campaign=merger_2016

This makes me nervous.

The forward slashes pass through the first 301 intact, but the receiving site then 301s again to re-encode them as %2F. So that's two 301s for one request. Two 301s seem bad.

(Off-topic) Google doesn't like this "soft 404" stuff. Sorry, but sending people to a hard 404 for request of the two old sites is not OK with the CEO.

Am I right that this seems bad? Am I going about this all wrong?

Thank you!

not2easy

5:30 pm on Apr 30, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Just an idea - using mod_rewrite might work better than mod_alias for your 301s. I don't know if these are in combination with previous or subsequent rules that are using mod_rewrite but that might also be an issue.

Andy Langton

5:38 pm on Apr 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google doesn't like this "soft 404" stuff. Sorry, but sending people to a hard 404 for request of the two old sites is not OK with the CEO.


There are two separate things: one is whether you redirect, and the other is what users see. You can show them the homepage content but send a "hard" 404 header to Google. However, both this and your soft 404s lose any value in the previous URLs. Is the content gone from the old sites?
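
One way to show a friendly page while still returning a real 404 status in Apache is the ErrorDocument directive. This is only a minimal sketch: it assumes the rule sits in the old site's .htaccess, and /merger-notice.html is a hypothetical page name that does not appear in the thread.

```apache
# Old site's .htaccess (assumed location).
# A local path keeps the 404 status; the visitor still sees the page content.
# /merger-notice.html is a hypothetical file, not something from the thread.
ErrorDocument 404 /merger-notice.html
```

Pointing ErrorDocument at a full http:// URL instead would make Apache issue a redirect, so the client would no longer receive the original 404 status.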

wilderness

7:14 pm on Apr 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I tried to use RedirectMatch 301 and $ to do something like this on Site1, for example:

RedirectMatch 301 ^/$ http://example3.com/?utm_source=site_1\&utm_medium=redirect\&utm_term=/$1\&utm_campaign=merger_2016


Does this mean that you actually have access to site1 and are placing the redirect in site1's htaccess, which is where the redirect (using mod_rewrite) should be placed?
It also appears that you're escaping the ampersand with a backslash (not a forward slash), and that is what is causing your encoding of the ampersand.

Result:
http://example1.com/one/two/three/

http://example3.com/?utm_source=site_1&utm_medium=redirect&utm_term=%2Fone%2Ftwo%2Fthree%2F&utm_campaign=merger_2016

This makes me nervous.


lucy will get here eventually, however her activity the past day or so has been little.
She has stressed over and over the difficulties of mixing mod_alias with mod_rewrite, which not2easy also mentions. lucy answers these RedirectMatch questions again and again; in fact there's a similar thread just a few threads down titled 'RewriteRule and redirect not working in htaccess'.

If you're using the site3 htaccess and attempting to do redirects for site1 & site2, then your procedures are in error! In that instance you'd need to use the refer syntax.

lucy24

10:11 pm on Apr 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



however her activity the past day or so has been little

Well, it's not MY fault I slept all day
:: whine ::
Agreeing with everyone else: Yes, you need to get rid of that RedirectMatch business and translate everything to mod_rewrite syntax (RewriteRule). Redirects using mod_alias are fine if #1 you don't use mod_rewrite at all (this seems unlikely, since only mod_rewrite can do a domain-name-canonicalization redirect, which everyone should have), and #2 you never need to look at query strings, which only mod_rewrite can do, and #3 if you don't require the [NE] flag which may-- again, may-- be needed here.

NE means "no escape", i.e. send all special characters back to the requestor in their original form, without percent-encoding. For example, if you're redirecting to an in-page fragment link with # you have to use [NE] so the # doesn't get changed to %23. (This is the example used in the Apache docs, and also happens to be the only situation where I've used the flag, but there are others.)
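
A minimal sketch of the fragment case, modeled on the example in the Apache docs. The paths (anchor/ and /bigpage.html) are placeholders, and the rule assumes it lives in .htaccess, so the pattern has no leading slash.

```apache
RewriteEngine On
# Without [NE] the # would be sent back as %23 and the fragment would break.
# Both paths are illustrative placeholders, not taken from the thread.
RewriteRule ^anchor/(.+)$ /bigpage.html#$1 [NE,R=301,L]
```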

From your first post:
RedirectMatch 301 ^/$ http://example.com/?utm_source=site_1\&utm_medium=redirect\&utm_term=/$1\&utm_campaign=merger_2016
The form \& isn't needed. I assume your object is to escape literal ampersands-- but nothing in the target needs to be escaped anyway. (If escaping were necessary, you would also need to escape the ? mark.)

:: detour to rarely visited mod_alias docs to ensure I'm not talking through my hat ::

The $1 means "use something you captured from the pattern", except there was no capture-- in fact there was nothing to capture, since the rule is written to apply only to requests for the root. This makes me suspect that the rule was cut-and-pasted from some other source, and we need to figure out what you're really trying to do. Is the object to store information about the originally requested URL, whatever it was? If so, this specific rule can simply omit the "$1" since its content will be null anyway.

Whether in RedirectMatch or RewriteRule, there's never a reason to capture-and-reuse if the content of the capture will always be the same. Just write out whatever the literal text is.
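
To make the capture-or-literal point concrete, here is a rough sketch of what the site1 rules might look like in mod_rewrite. It assumes the rules go in site1's root .htaccess and that the goal is to record the originally requested path in utm_term; neither assumption is confirmed in the thread.

```apache
RewriteEngine On

# Root request: nothing to capture, so the term is written out literally.
RewriteRule ^$ http://site3.com/?utm_source=site_1&utm_medium=redirect&utm_term=/&utm_campaign=merger_2016 [R=301,L]

# Everything else: capture the requested path and reuse it as $1.
# In .htaccess the leading slash is stripped before matching, so it is added back by hand.
RewriteRule ^(.+)$ http://site3.com/?utm_source=site_1&utm_medium=redirect&utm_term=/$1&utm_campaign=merger_2016 [R=301,L]
```

The ampersands are written without backslashes. Because the target carries its own query string, any query string on the original request is dropped unless [QSA] is added. If the redirect comes out percent-encoded, adding NE to the flags is the thing to try.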

utm_term=%2Fone%2Ftwo%2Fthree%2F

Where does this come from? The rule is obviously plugging something into that "$1" but what? Does the original request have "/one/two/three/" and you're now looking at the result of some other redirect from a different rule applying to a different original URL? This is confusing.

Do you really need all those separate utm_ elements in the query string? Is it built into GA? Otherwise I'd think you could compress everything into a single term, like "tracker=2016_$1" where $1 is the originally requested URL. And then only if, in fact, something was captured. Otherwise just "tracker=2016_/"
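
A sketch of that single-parameter variant, assuming site1's root .htaccess. The parameter name "tracker" is just the illustrative name used above, not anything GA knows about.

```apache
RewriteEngine On
# (.*) also matches the empty path of a root request, so $1 is empty there
# and the parameter comes out as tracker=2016_/
RewriteRule ^(.*)$ http://site3.com/?tracker=2016_/$1 [R=301,L]
```

With the slash written literally before $1, one rule covers both the root and every deeper URL.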

Incidentally, you could replace
^/$
with
^/(index\.htm|$)
to kill two birds with one stone. (Note position of $ closing anchor.) And the same thing with any request for a directory, mutatis mutandis. Omit the leading / in mod_rewrite unless the rule is lying loose in the config file, which I don't think is the case.
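
In mod_rewrite form, and assuming the rule sits in site1's root .htaccess (so no leading slash), the combined root-or-index pattern might be sketched like this:

```apache
RewriteEngine On
# Matches the root (empty path) or any path starting with index.htm,
# which includes index.html because nothing anchors the end after "htm".
RewriteRule ^(index\.htm|$) http://site3.com/?utm_source=site_1&utm_medium=redirect&utm_term=/&utm_campaign=merger_2016 [R=301,L]
```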

wilderness

10:29 pm on Apr 30, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Didn't believe SuperGirl required sleep ;)

Many thanks.