Drop everything after ? from directory index

AdWords auto-tagging problem

         

Mike521

4:01 pm on Feb 13, 2008 (gmt 0)

10+ Year Member



My .htaccess file has a DirectoryIndex that actually points to a page generated by our shopping cart, Miva Merchant. The line reads:

DirectoryIndex /mm5/merchant.mvc?Screen=CTGY&Category_Code=homepage index.php index.html

I just realized today that it fails when a query string is tacked onto the URL, such as:

/?gclid=bunch-of-random-stuff

It probably expands to:

/mm5/merchant.mvc?Screen=CTGY&Category_Code=homepage?gclid=bunch-of-random-stuff

and the second question mark kills it. We end up with a header, navigation, and footer, but no internal content.

So I need to drop everything after the second question mark, or at least hide it from the system somehow. Is that possible?

Ideally, the gclid data would remain so that the JavaScript picks it up (it has been so far, despite the double question mark problem), but the internal Miva system wouldn't know about it.

Alternatively, could we redirect them to the expanded URL and tack the query string on the end? I'd only want to do this if they came in with a query string. Anyone else should just get .com/ with nothing else, no redirection.

Last ditch options are to drop the query string completely, or to switch to a static homepage. Right now we've switched to a static homepage.

Last note -- we never know what gclid will be; it's random stuff generated by AdWords.

Thanks in advance, all.

jdMorgan

5:47 pm on Feb 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can fix this generally, by replacing the second (stray) occurrence of "?" with "&".

Use RewriteCond in mod_rewrite to examine the query string and detect that second question mark (possibly encoded as %3F) and then use RewriteRule to substitute an ampersand in all appropriate URLs.

Jim

Mike521

5:53 pm on Feb 13, 2008 (gmt 0)

10+ Year Member



Thanks JD. Can you help me out a bit more? Unfortunately I'm terrible with rewrite rules.

Maybe you know of a good tutorial for beginners? A lot of the stuff I've read in the past has left me hanging: not enough examples or explanations, etc.

Mike521

4:08 pm on Feb 14, 2008 (gmt 0)

10+ Year Member



I read through the regex tutorial linked from the charter here [etext.lib.virginia.edu...]

Can you tell me if I'm heading in the right direction?

RewriteCond ^\?Screen=$1\&Category_Code=$2(.*)\?$3
RewriteRule ^(.*) [thesite.com...]

g1smd

1:17 am on Feb 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What are you testing in the rewrite cond?

%{QUERY_STRING} or %{THE_REQUEST} or what?

jdMorgan

2:04 am on Feb 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're on the right track. Probably something more like this:

RewriteCond %{QUERY_STRING} ^([^?]*)\?(.*)$
RewriteRule ^merchant\.mvc$ http://www.example.com/mm5/merchant.mvc?%1&%2 [R=301,L]

That is, if the requested URL is merchant.mvc in the current directory, and the query string contains a "?", replace it with an ampersand.

The [^?]* pattern means, "Match any number of characters that are not a question mark" -- or equivalently, "Match everything up to the next question mark."

Note that the "?" is part of neither the URL nor its attached query string. It is a delimiter between the two. So normally, %{QUERY_STRING} will never contain a question mark.
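As a rough worked trace of the rule above, here is how the pattern would split the query string Mike guessed at in his first post. This is only a sketch: it assumes the code sits in /mm5/.htaccess, that the query string really arrives in that form, and that RewriteEngine is not already switched on elsewhere (the RewriteEngine line is an addition, not part of the original rule).

```apache
# Assumed incoming query string (Mike's guess at the DirectoryIndex expansion):
#   Screen=CTGY&Category_Code=homepage?gclid=bunch-of-random-stuff
#
# ^([^?]*)  captures everything before the stray "?" into %1:
#             Screen=CTGY&Category_Code=homepage
# \?        matches the stray "?" itself (not captured)
# (.*)$     captures the remainder into %2:
#             gclid=bunch-of-random-stuff

# Assumption: rewriting is not yet enabled for this directory
RewriteEngine On

RewriteCond %{QUERY_STRING} ^([^?]*)\?(.*)$
RewriteRule ^merchant\.mvc$ http://www.example.com/mm5/merchant.mvc?%1&%2 [R=301,L]

# Redirect target built from %1 and %2:
#   http://www.example.com/mm5/merchant.mvc?Screen=CTGY&Category_Code=homepage&gclid=bunch-of-random-stuff
```

If the query string contains no "?", the RewriteCond fails and the rule is skipped, so ordinary requests are left alone.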

Jim

Mike521

2:54 pm on Feb 15, 2008 (gmt 0)

10+ Year Member



Thanks g1 and jd --

I tried it, but something's not quite right. Can you tell me if I understand the lines properly?

%{QUERY_STRING} -- means we're checking the query string
^ -- means match anything that came previously
([^?]*) -- match anything that isn't a "?", for as long as it takes
\? -- match a "?"
(.*)$ -- match anything that comes after the "?", and store it in a variable

line two:
^merchant\.mvc$ -- I'm not sure what this does..

http://www.example.com/mm5/merchant.mvc -- redirect to this URL
?%1&%2 -- where is 1 coming from? and shouldn't these be $ instead of % signs? I changed it to $ and it didn't work though..
[R=301,L] -- redirect code is 301 (permanent), what about the L, what does that mean?

jdMorgan

3:25 pm on Feb 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"^" at the beginning of a pattern means "match only strings starting with" the following characters.
^merchant\.mvc$ is a pattern intended to match all requested URLs to which this rule should be applied. The RewriteRule pattern should be as restrictive as possible, so that for example, the rule won't be applied to robots.txt and logo.gif, which would be a waste of CPU time.

The pattern in the RewriteRule depends on the location of this code. The pattern should match the entire requested URL-path with the exception of the path to the current directory in which this code resides. This is because the URL-paths 'seen' by RewriteRule in .htaccess are stripped of the path to the current directory.
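To illustrate the point about location, here is a sketch of the same rule as it might look in two different places. The two file locations are assumptions for illustration; the thread does not say where the code was actually installed, and only one variant would be used at a time.

```apache
RewriteEngine On

# Variant A: code placed in /mm5/.htaccess.
# A request for /mm5/merchant.mvc is seen by RewriteRule as "merchant.mvc",
# because the path to the current directory (/mm5/) has been stripped.
RewriteCond %{QUERY_STRING} ^([^?]*)\?(.*)$
RewriteRule ^merchant\.mvc$ http://www.example.com/mm5/merchant.mvc?%1&%2 [R=301,L]

# Variant B: code placed in the site root .htaccess instead.
# The same request is seen by RewriteRule as "mm5/merchant.mvc",
# so the pattern has to include the subdirectory.
# (Commented out here; use it in place of Variant A, not alongside it.)
# RewriteCond %{QUERY_STRING} ^([^?]*)\?(.*)$
# RewriteRule ^mm5/merchant\.mvc$ http://www.example.com/mm5/merchant.mvc?%1&%2 [R=301,L]
```

In neither case does the pattern start with a slash; in .htaccess the leading slash is part of what gets stripped.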

The use of %1 and %2 was quite intentional. See the Apache mod_rewrite documentation, particularly the discussions of "back-references". The documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com] may also prove useful.
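For what it's worth, the difference between the two kinds of back-reference can be sketched like this. The parentheses around merchant\.mvc and the use of $1 are added purely for illustration and are not part of the rule posted earlier; the example also assumes the code is in /mm5/.htaccess.

```apache
RewriteEngine On

# %N refers to groups captured by the last matched RewriteCond pattern.
# $N refers to groups captured by the RewriteRule pattern.
RewriteCond %{QUERY_STRING} ^([^?]*)\?(.*)$
RewriteRule ^(merchant\.mvc)$ http://www.example.com/mm5/$1?%1&%2 [R=301,L]

# $1 = "merchant.mvc"                 (from the RewriteRule pattern)
# %1 = query text before the stray "?" (from the RewriteCond pattern)
# %2 = query text after the stray "?"  (from the RewriteCond pattern)
#
# R=301 sends an external permanent redirect to the browser.
# L ("last") stops mod_rewrite from processing any further rules in this pass.
```

This is also why swapping % for $ in the earlier rule would not work: that RewriteRule pattern has no parentheses, so $1 and $2 would be empty.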

Jim

[edited by: jdMorgan at 3:26 pm (utc) on Feb. 15, 2008]