
Apache Web Server Forum

Simple 301 redirect and Wordpress htaccess rules
agneslesage



 
Msg#: 4182512 posted 3:04 pm on Aug 5, 2010 (gmt 0)

Hi

I am moving a site from ASP to WordPress, whose friendly URLs I truly enjoy :-)

However, I am having trouble getting my 301 redirects to work...

I've read the useful help in this post:
[webmasterworld.com...]
I tried and tested various things, but without success. So I hope jdMorgan reads this, as he seems to be the master :)

My WordPress site has these instructions in .htaccess, and if I remove them the site does not work (except the homepage). I guess they are used to handle the friendly URLs:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress


Now, I am trying to add a list of 301 redirects like:

Redirect 301 /template.asp?page=qui_nous&titre=qui_nous http://www.mysite.com/qui-sommes-nous/

I've tried putting it before, inside, and at various places within the existing WP code... the instruction just seems to be ignored.

Any magic idea?

Thanks
Agnes

 

agneslesage



 
Msg#: 4182512 posted 3:08 pm on Aug 5, 2010 (gmt 0)

NB: when I say ignored, I mean that I only get a 404 error page, not the page I redirect to.

agneslesage



 
Msg#: 4182512 posted 3:23 pm on Aug 5, 2010 (gmt 0)

One more thing, since I see in other posts that it seems to matter:

My site is hosted at 1and1 in a user directory
[mydir.onlinehome.fr...]

I have www.mysite.com as an external domain pointed at that www-mysite-com directory, and WordPress is installed directly in there.
The .htaccess file I am talking about is at the root of the WordPress site / web site.

Go60Guy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4182512 posted 3:37 pm on Aug 5, 2010 (gmt 0)

I'm not as savvy as I'd like to be on doing redirects using .htaccess, but, FWIW, there's a free redirection plugin for WordPress. Just do a search on Google for "wordpress redirection plugin" and you'll find it. Hope this helps.

jdMorgan

WebmasterWorld Senior Member jdmorgan us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4182512 posted 3:41 pm on Aug 5, 2010 (gmt 0)

Don't mix "Redirect" or "RedirectMatch" directives from mod_alias with RewriteRule directives from mod_rewrite if you want to control the order of execution. Directives are processed by each Apache module in turn, and not strictly in the order that they appear in your .htaccess code.

Without testing, there is no way to know whether mod_alias will process your .htaccess file first, or whether mod_rewrite will process it first. There is also no guarantee that this execution order won't change if your server is upgraded, or if the host modifies your server configuration.
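
To illustrate the point about keeping all redirects in one module, here is a minimal sketch of a path-only redirect written both ways. The path /old-page.asp and the target /new-page/ are hypothetical examples, not URLs from this site. The two forms are only roughly equivalent, because Redirect matches on a URL-path prefix while the RewriteRule pattern below is anchored at both ends.

```apache
# mod_alias form (hypothetical paths). Its execution order relative to
# the mod_rewrite rules in the same .htaccess file is not under your control:
# Redirect 301 /old-page.asp http://www.example.com/new-page/

# Roughly equivalent mod_rewrite form. It runs in file order together
# with the other RewriteRule directives:
RewriteEngine On
RewriteRule ^old-page\.asp$ http://www.example.com/new-page/ [R=301,L]
```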

I suggest making the following modifications to your code for noticeably-improved speed and correct function:

RewriteEngine On
RewriteBase /
#
# Redirect specific .asp URLs to WP+SEF-format URLs
RewriteCond %{QUERY_STRING} ^page=qui_nous&titre=qui_nous$
RewriteRule ^template\.asp$ http://www.example.com/nous-sommes-la-meme/? [R=301,L]
#
# Redirect direct client requests for URL-path /index.php to / to avoid duplicate content
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^index\.php$ http://www.example.com/ [R=301,L]
#
# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
# BEGIN WordPress
# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

The query strings appended to URLs are not 'visible' to the Redirect or RewriteRule directives, and must be tested with a RewriteCond as shown.

For best performance, it is best to avoid executing the "file exists" checks in the WP rule if in fact the URL indicates a request for a filetype that WP cannot create or generate. It is also good to avoid these checks if the path has already been rewritten to /index.php (which was part of the original code, but slightly modified here).

The 'list' of filetypes in the second RewriteCond need not be complete. The purpose is only to avoid executing the very-slow file- and directory-exists checks in the third and fourth RewriteConds unless it is really necessary. It is important only to avoid these checks for the *majority* of object requests that cannot be handled by WP.

You could add more filetypes, such as pdf, doc, mp3, swf, flv, wmv, avi, mov, mpeg, xml, etc. But again, it is only important to avoid the exists-checks for *most* requests to the server that cannot be handled by WP. As most of these requests will be for .gif and .jpg (or .jpeg) image files on a typical Web site, those two filetypes should almost always go first.
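
As a sketch of how the exclusion list might be extended, assume the site also serves a fair number of PDF, Flash, and MP3 files (an assumption; pick whatever static types your own logs show most often). One side effect worth knowing: a request for a missing file of an excluded type is no longer rewritten to WordPress, so it gets the server's own 404 response rather than the WordPress 404 page.

```apache
# Same WordPress rule as above, with a longer (still incomplete) list of
# static filetypes that skip the slow file-exists and directory-exists checks.
# The added types pdf, swf, flv and mp3 are illustrative assumptions.
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js|pdf|swf|flv|mp3)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
```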

I deleted the <IfModule> container, since its only purpose is to make this code fail silently if mod_rewrite is not available. In most cases, the Webmaster would actually *want* an error message to be logged if mod_rewrite wasn't working...

I added the index and hostname canonicalization rules to demonstrate the correct order of rules (which is very important).

Jim

[edit] Corrected as noted below. [/edit]

[edited by: jdMorgan at 5:59 pm (utc) on Aug 6, 2010]

agneslesage



 
Msg#: 4182512 posted 8:30 am on Aug 6, 2010 (gmt 0)

Wow, thanks Jim!

I see the specific command for my specific redirect is:

# Redirect specific .asp URLs to WP+SEF-format URLs
RewriteCond %{QUERY_STRING} ^page=qui_nous&titre=qui_nous$
RewriteRule ^template\.asp$ http://www.example.com/nous-sommes-la-meme/ [R=301,L]

I will have to replicate it for some 40 of them, so I'll try to make sure it is right first...

Now, I see you added something special for the part after the "?":
RewriteCond %{QUERY_STRING} ^page=qui_nous&titre=qui_nous$

But do you think this will work if some Google tags are appended?
For example, there may be either Google Analytics tags: &utm_medium=xyv
Or Google AdWords tags: &gclid=zuv
Those tags are generated by Google and I can't foresee them all!

Thanks again for your expertise.
I am going to try this.

Agnes

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4182512 posted 9:47 am on Aug 6, 2010 (gmt 0)

Remove the $ end-anchor symbol from the RewriteCond pattern if other parameters may be appended to the request.
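
A minimal sketch of that change, using the rule from earlier in the thread (the target path /qui-sommes-nous/ on www.example.com is taken from the original question and is otherwise an assumption). Note that simply dropping the "$" turns the pattern into a prefix match, so it would also match a value such as titre=qui_nous_2; the second variant requires the value to end at "&" or at the end of the query string.

```apache
# Variant 1: no end anchor. Matches the two known parameters followed by anything.
RewriteCond %{QUERY_STRING} ^page=qui_nous&titre=qui_nous
RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/? [R=301,L]

# Variant 2 (use instead of variant 1): the value must be followed by "&" or
# by the end of the query string, so extra tracking parameters are accepted
# but longer values are not.
# RewriteCond %{QUERY_STRING} ^page=qui_nous&titre=qui_nous(&|$)
# RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/? [R=301,L]
```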

jdMorgan




 
Msg#: 4182512 posted 3:48 pm on Aug 6, 2010 (gmt 0)

This is a basic regular-expressions problem:

Query string starts with "page=qui_nous" : ^page=qui_nous
Query string ends with "page=qui_nous" : page=qui_nous$
Query string is exactly "page=qui_nous" : ^page=qui_nous$
Query string contains name/value pair "page=qui_nous" : ^([^&]*&)*page=qui_nous(&.*)?$

Jim
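
Building on the "contains name/value pair" pattern above, here is a sketch for the case where two parameters must both be present but may arrive in any order, possibly mixed with tracking parameters. Consecutive RewriteCond lines are ANDed by default. The target URL is an assumption based on the original question.

```apache
# Both conditions must match (implicit AND). Each one looks for a complete
# name=value pair anywhere in the query string.
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=qui_nous(&.*)?$
RewriteCond %{QUERY_STRING} ^([^&]*&)*titre=qui_nous(&.*)?$
RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/? [R=301,L]
```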

agneslesage



 
Msg#: 4182512 posted 4:28 pm on Aug 6, 2010 (gmt 0)

Hi

This forum is magic... great people!

Now, I've put in the code with the "query string contains" option.
And it seems to work: my 301 redirect instruction is no longer ignored. But for some reason the original query string is appended, as it redirects to:
[atable.com...]

That may not really be a problem, though I'd rather remove it.

On the other hand, if there are other parameters from Google Analytics and AdWords, I am rather happy to keep the string...

So I wonder if there is a way to remove that original URL string (which can be "page=qui_nous" or "page=qui_nous&titre=qui_nous") and keep any other parameters. Then I'd be 110% satisfied.

If not, I am still very happy with the above solution.
Thanks!
Agnes

##### New code ####
#
RewriteEngine On
RewriteBase /
#
# Redirect specific .asp URLs to WP+SEF-format URLs
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=qui_nous(&.*)?$
RewriteRule ^template\.asp$ http://www.atable.com/qui-sommes-nous/ [R=301,L]
#
# Redirect direct client requests for URL-path /index.php to / to avoid duplicate content
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^index\.php$ http://www.atable.com/ [R=301,L]
#
# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1 [R=301,L]
#
# BEGIN WordPress
# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

[edited by: jdMorgan at 5:58 pm (utc) on Aug 6, 2010]
[edit reason] Added code tags for formatting. [/edit]

g1smd




 
Msg#: 4182512 posted 5:49 pm on Aug 6, 2010 (gmt 0)

Add a question mark to the end of the target URL of the redirect to clear the query string.

Use [ code ] ... [ / code ] tags around your example code to stop smilies and URL auto-linking.
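
The question-mark technique can be sketched as follows. By default mod_rewrite passes the original query string through to the redirect target; a target that ends in a bare "?" supplies an empty query string instead. The URLs are the ones used as examples in this thread. On Apache 2.4 and later the QSD flag has the same effect, which matters only if your host runs that version (an assumption to check).

```apache
# Without "?": /template.asp?page=qui_nous redirects to
# /qui-sommes-nous/?page=qui_nous (query string carried over)
# RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/ [R=301,L]

# With "?": the query string is cleared and the redirect goes to
# /qui-sommes-nous/
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=qui_nous(&.*)?$
RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/? [R=301,L]

# Apache 2.4+ alternative to the trailing "?":
# RewriteRule ^template\.asp$ http://www.example.com/qui-sommes-nous/ [R=301,L,QSD]
```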

jdMorgan




 
Msg#: 4182512 posted 6:01 pm on Aug 6, 2010 (gmt 0)

Sorry, I always forget that question mark operator... :(
I corrected the code in my post above to prevent further propagation of that error.

Jim

agneslesage



 
Msg#: 4182512 posted 9:47 am on Aug 9, 2010 (gmt 0)

I feel stupid asking, but... how do I write a simple redirect for my homepage?

<code>
http://www.example.com/default.asp to http://www.example.com </code>

Can I just write:

<code>
# Redirect default.asp URLs to WP+SEF-format URLs
RewriteRule ^default\.asp$ http://www.example.com/ R=301,L]
</code>

agneslesage



 
Msg#: 4182512 posted 9:50 am on Aug 9, 2010 (gmt 0)

Oops... that seems to cause a 500 server error... gulp

g1smd




 
Msg#: 4182512 posted 10:57 am on Aug 9, 2010 (gmt 0)

There's a [ missing in your code.

Additionally, the forum uses [ code ] not < code > tags.

agneslesage



 
Msg#: 4182512 posted 1:13 pm on Aug 9, 2010 (gmt 0)

OK guys
You've helped me so much, and so efficiently.

Just for the record, I'll post the code (if someone ever searches for something similar, they may as well get it right)



###### PHP compatibility ######

AddType x-mapp-php5 .php

###### WordPress original code #####

# BEGIN WordPress
# <IfModule mod_rewrite.c>
# RewriteEngine On
# RewriteBase /
# RewriteRule ^index\.php$ - [L]
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteCond %{REQUEST_FILENAME} !-d
# RewriteRule . /index.php [L]
# </IfModule>
# END WordPress

##### Wordpress Optimized code with my redirects ####

# Intro

RewriteEngine On
RewriteBase /

# My specific Redirect default.asp to WP Homepage
RewriteRule ^default\.asp$ http://www.example.com/ [R=301,L]

# My Specific Redirects from old site (.asp URLs containing ?string to WP+SEF-format URLs)
# Redirects template.asp requests whose query string contains "page=mystring" (plus any other parameters) to the specified new URL www.example.com/mynewurl
# One may remove the "?" at the end of the target URL to keep parameters in the string.

RewriteCond %{QUERY_STRING} ^([^&]*&)*page=mystring(&.*)?$
RewriteRule ^template\.asp$ http://www.example.com/mynewurl/? [R=301,L]

# WP optimized code

# Redirect direct client requests for URL-path /index.php to / to avoid duplicate content
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php([?#][^\ ]*)?\ HTTP/
RewriteRule ^index\.php$ http://www.example.com/ [R=301,L]

# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

g1smd




 
Msg#: 4182512 posted 1:50 pm on Aug 9, 2010 (gmt 0)

I'd also add a question mark to the end of the RewriteRule target in the very first rule. This prevents any parameters attached to default.asp requests from being carried over to the new URL when the request is redirected to the root.

This also solves the problem that such a request would result in an unwanted redirection chain, a double redirect, as the next rule would have had to have been invoked to remove the parameters.

In fact, I would add a question mark to all of the redirect rules to be sure there is no way for parameters to carry through on any request - just in case the order of the rules is changed at some point in the future.

One problem still to solve: requests for index.php with attached parameters still produce a double redirect.

Rule ordering and scope is very important here.

Personally, I'd exclude both index and default from having their parameters stripped within the current parameter-stripping rule (or I would place it after the rule(s) dealing with default and index requests) as that avoids a double redirect for those requests.

I would modify the index redirect to handle both index and default requests, and make sure that rule also forces www and removes parameters for those requests, at the same time as stripping the index or default filename from the URL.

The non-www to www redirect would be the last of the redirects, as ever.


# My specific redirects from the old site (.asp URLs with a query string
# to WP+SEF-format URLs). Redirects template.asp requests whose query
# string contains "page=mystring" (plus any other parameters) to the
# specified new URL www.example.com/mynewurl. One may remove the "?" at
# the end of the target URL to keep parameters in the string.
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=mystring(&.*)?$
RewriteRule ^template\.asp$ http://www.example.com/mynewurl/? [R=301,L]

# WP optimized code

# Redirect direct client requests for URL-path /index.php to / and
# for /folder/index.php to /folder/. Likewise for default.php direct
# requests in root or in folder. Force www and strip parameters
# at the same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.example.com/$1? [R=301,L]

# I'd add an additional redirect here to strip all parameters and force www
# for all requests with parameters that have made it thus far.

# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

agneslesage



 
Msg#: 4182512 posted 2:44 pm on Aug 9, 2010 (gmt 0)

Phew...
I am not sure I got it all right...

1> I've added a question mark at the end of the first redirect, for default.asp to the homepage

RewriteRule ^default\.asp$ http://www.atable.com/? [R=301,L]


2> I've changed the index and default rule to what you said

# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.example.com/$1? [R=301,L]


3> Now for this part of yours, I don't know if I am supposed to do more...

"One problem still to solve: requests for index.php with attached parameters still produce a double redirect.
Rule ordering and scope is very important here.
Personally, I'd exclude both index and default from having their parameters stripped within the current parameter-stripping rule (or I would place it after the rule(s) dealing with default and index requests) as that avoids a double redirect for those requests.
I would modify the index redirect to handle both index and default requests, and make sure that rule also forces www and remove parameters for those requests, at the same time as stripping the index or default filename from the URL.
"

=> So the final code comes:

##### Wordpress Optimized code with my redirects ####

# Intro

RewriteEngine On
RewriteBase /

# My specific Redirect default.asp to WP Homepage
RewriteRule ^default\.asp$ http://www.atable.com/? [R=301,L]

# My Specific Redirects from old site (.asp URLs to WP+SEF-format URLs)
# One may remove the "?" at the end of the target URL to keep parameters in the string.

RewriteCond %{QUERY_STRING} ^([^&]*&)*page=qui_nous(&.*)?$
RewriteRule ^template\.asp$ http://www.atable.com/qui-sommes-nous/? [R=301,L]

# WP optimized code #

# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]


# I'd add an additional redirect here to strip all parameters and force www
# for all requests with parameters that have made it thus far.

# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1? [R=301,L]

# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

agneslesage



 
Msg#: 4182512 posted 2:46 pm on Aug 9, 2010 (gmt 0)

Plus, I also wonder:

When you do


# I'd add an additional redirect here to strip all parameters and force www
# for all requests with parameters that have made it thus far.

# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1? [R=301,L]



Aren't we removing the Google Analytics and AdWords parameters (which I'd like to keep, at least for those URLs that are not from the old site)?

jdMorgan




 
Msg#: 4182512 posted 3:42 pm on Aug 10, 2010 (gmt 0)

Not clear... This removes parameters on incoming requests to your server. It does not affect links on your pages which reference Analytics or Adwords tracking parameters.

If you need to internally track incoming URL requests with appended tracking parameters, then exclude those URL-paths or specific parameters from the "remove query" rules using a negative-match RewriteCond.

Jim
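
A rough sketch of the negative-match RewriteCond idea, applied to the index/default redirect from this thread. The parameter names gclid and utm_* are the ones mentioned by the original poster, and the pattern that detects them is an assumption, not tested code. With this condition in place, a request that carries one of those parameters is simply not redirected by this rule, so /folder/index.php?gclid=abc would be served as requested rather than canonicalized.

```apache
# Skip this redirect when the query string contains a gclid or utm_* parameter
# (the "!" negates the match).
RewriteCond %{QUERY_STRING} !(^|&)(gclid|utm_[a-z]+)=
# Only act on direct client requests for index or default files
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
# Strip the filename and the query string, and force the canonical hostname
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.example.com/$1? [R=301,L]
```

The cheap query-string test is placed first so that the longer THE_REQUEST pattern is evaluated only when it is needed.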

agneslesage



 
Msg#: 4182512 posted 10:27 am on Aug 11, 2010 (gmt 0)

Hi

Yes, I need to keep the AdWords ?gclid parameter as well as some Analytics ?utm_something parameters on most URLs.
As for using a negative RewriteCond, I don't know how...
But I don't really care if other parameters are kept as well; it does not matter.

On the other hand, I wrote:

# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]


And still, when I type:

http://www.atable.com/traiteur/coffrets-repas/index.php?utm_source=test

The page comes up without rewriting...
Agnes

agneslesage



 
Msg#: 4182512 posted 10:37 am on Aug 11, 2010 (gmt 0)

Another question

I've found that some inbound links go to a wrong URL:
http://www.atable.com/atable_traiteur/

How would I redirect from this wrong directory?

# Another specific Redirect to WP Homepage
RewriteRule ^atable_traiteur$ http://www.atable.com/? [R=301,L]

jdMorgan




 
Msg#: 4182512 posted 2:10 pm on Aug 11, 2010 (gmt 0)


# Redirect direct client requests for URL-path /<anyfolder>/<index or default>.<php, html, htm, or asp>
# to /<anyfolder>/ and force www and strip the query string to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]

"And still, when I type:

http://www.atable.com/traiteur/coffrets-repas/index.php?utm_source=test

The page comes up without rewriting..."

This code is correct, assuming that your rules are ordered correctly and that this code is located in a .htaccess file in the same directory as your robots.txt and sitemap.xml files and the "home page" of your site.

"I've found that some inbound links go to a wrong URL: [atable.com...] How would I redirect from this wrong directory?"

Modified to accept optional trailing slash:

# Another specific Redirect to WP Homepage
RewriteRule ^atable_traiteur/?$ http://www.atable.com/? [R=301,L]

Jim

agneslesage



 
Msg#: 4182512 posted 8:10 am on Aug 12, 2010 (gmt 0)

Thanks for the answer. I've added the new redirect (2nd question)... and it works fine!

Now the rewrite with index.php seems to rewrite without parameters as stated, which is fine as well. Maybe I did something wrong during testing...

But in fact, I am worried about losing my Google Analytics (?utm_something=) and AdWords (?gclid=) parameters somewhere along the way.
So maybe I should remove the question mark on this one, no?
I don't think I have a risk of duplicate content; I don't understand the point...


# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]


My whole code comes:


###### PHP compatibility ######

AddType x-mapp-php5 .php

###### WordPress original code #####

# BEGIN WordPress
# <IfModule mod_rewrite.c>
# RewriteEngine On
# RewriteBase /
# RewriteRule ^index\.php$ - [L]
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteCond %{REQUEST_FILENAME} !-d
# RewriteRule . /index.php [L]
# </IfModule>
# END WordPress

##### Wordpress Optimized code with my redirects ####
# code by http://www.webmasterworld.com/apache/4182512.htm

# Intro

RewriteEngine On
RewriteBase /

# My specific Redirect default.asp to WP Homepage
RewriteRule ^default\.asp$ http://www.atable.com/? [R=301,L]

# Another specific Redirect to WP Homepage
RewriteRule ^atable_traiteur/?$ http://www.atable.com/? [R=301,L]

# My Specific Redirects from old site (.asp URLs to WP+SEF-format URLs)
# One may remove the "?" at the end of the target URL to keep parameters in the string.
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=qui_nous(&.*)?$
RewriteRule ^template\.asp$ http://www.atable.com/qui-sommes-nous/? [R=301,L]
# and many more alike

# WP optimized code

# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
# NB: should I really?
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]

# I'd add an additional redirect here to strip all parameters and force www
# for all requests with parameters that have made it thus far.

# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
# NB: without removing the parameters in string so no final "?"
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1 [R=301,L]

# Except for requests for /index.php and for the most-frequently-requested
# filetypes that WP cannot generate, rewrite all URL requests which do not
# resolve to an existing file or directory to the WordPress script filepath
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php [L]
# END WordPress

jdMorgan




 
Msg#: 4182512 posted 3:11 pm on Aug 12, 2010 (gmt 0)

Please provide more specific examples of Google Analytics and AdWords URLs which represent all possible requests for which you might want to keep the query string. What do these URLs look like? Are the links always to just "pages"? The reason I ask is that you do not want to waste CPU time checking for specific query strings when it is not really necessary.

The problem is almost never writing the code. The problem is almost always related to defining the requirements completely and correctly. By defining the requirements in terms of variables (character-strings in the hostnames, URL-paths, filetypes, query strings, etc.) that mod_rewrite can check, the solution often becomes fairly obvious.

Jim

agneslesage



 
Msg#: 4182512 posted 10:08 am on Aug 13, 2010 (gmt 0)

Well, "The problem is almost never writing the code" ... for you!

Now let me try to specify what I need...

1> I no longer need the old query strings attached to my old URLs.
They were all attached to "template.asp", and the parameter names were "page=" (always there) and "titre=" (sometimes there).
All those old URLs are being redirected to new ones, as we did: without their parameters, which is fine.

2> On my new site's URLs, I want to accept certain query strings:
- AdWords strings look like "?gclid=" and the value can be anything, generated by AdWords for each click
- Google Analytics strings: "utm_source=", "utm_medium=", "utm_term=", "utm_content=", "utm_campaign=", with values defined on the go during campaigns.
- MSN Adcenter strings: these seem quite complex to define systematically, as one may name the parameters any way one wants...
- ... what else in the future?
It is probably easier to exclude what I don't want than to include what I use.
The above strings may be attached to any page of the site, so any URL that WP has generated with a nice URL ending in "/name-of-page-or-category" or possibly "/name-of-page-or-category/" (or even "/name-of-page-or-category/index.php"?). That covers the homepage, WP pages, and also e-commerce categories and products.

3> Additionally, I could try to avoid displaying "product pages" that are not meant to be displayed (I normally display only "subcategories") but are generated by the e-commerce plugin and do exist.
For example I display
http://www.atable.com/traiteur/petits-dejeuners/brunchs/ with 3 products inside,
and should not display an individual product:
http://www.atable.com/traiteur/petits-dejeuners/brunchs/lintegral

With WP, all my pages are like:
http://www.atable.com/(name-of-page)
All my subcategories follow this style:
http://www.atable.com/traiteur/(name-of-category)/(name-of-subcategory)
Except one that goes 3 directories down:
http://www.atable.com/traiteur/buffets-et-cocktails/(name-of-subcategory)/(name-of-subsubcategory)
Meanwhile, I have discovered that the e-commerce plugin lets the products display (which it should not) under:
http://www.atable.com/traiteur/(name-of-category)/(name-of-subcategory)/name-of-product
So a rule that redirects a request for a 3rd directory level after "/traiteur/" to its parent directory, except when it is under "/traiteur/buffets-et-cocktails", would do.

But that is a little bit too much... and a different issue from the strings above.
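
The product-page requirement described in point 3 could be sketched roughly as below. This is an untested illustration built only from the URL shapes given in the post, and it assumes that nothing legitimate (images, other files) lives exactly three levels below /traiteur/ outside the buffets-et-cocktails branch. It would need to sit after the index/default redirect and before the WordPress rewrite.

```apache
# $1 captures "category/subcategory". Skip the redirect when the category is
# buffets-et-cocktails, where a third level is a valid sub-subcategory.
RewriteCond $1 !^buffets-et-cocktails/
# /traiteur/<category>/<subcategory>/<product> (optional trailing slash)
# goes to /traiteur/<category>/<subcategory>/
RewriteRule ^traiteur/([^/]+/[^/]+)/[^/]+/?$ http://www.atable.com/traiteur/$1/ [R=301,L]
```

No "?" is appended to the target here, so any tracking query string on the product URL would be carried over to the parent category URL.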

jdMorgan




 
Msg#: 4182512 posted 6:46 pm on Aug 13, 2010 (gmt 0)

I would need to know what URLs the Google queries are meant to be attached to. Without that knowledge, the code may be extremely inefficient...

Encore une fois -- des exemples, s'il vous plaît... :) (Once again: examples, please.)

Jim

agneslesage



 
Msg#: 4182512 posted 8:50 am on Aug 16, 2010 (gmt 0)

Oh, I did not know you could do it in French, monsieur :)

Well, the Google AdWords queries may be attached to many (if not all) URLs of the site, as we have an elaborate campaign with deep links.

So the only strings I want to get rid of (1> above) are those on
www.atable.com/template.asp?page=xyz

and the ones I want to keep (2>) are "?gclid=xyz" and "utm_medium=xyz", "utm_source=xyz", "utm_content=xyz", "utm_campaign=xyz" on all possible URLs ending in "/name-of-page-or-category" or possibly "/name-of-page-or-category/" (or even "/name-of-page-or-category/index.php"?).
Examples:
http://www.atable.com/qui-sommes-nous/?gclid=xyz
http://www.atable.com/traiteur/nouveautes?gclid=xyz
http://www.atable.com/traiteur/coffrets-repas/?gclid=xyz
http://www.atable.com/traiteur/coffrets-repas/le-lunch-bag-sac-pique-nique/?utm_source=test
http://www.atable.com/traiteur/buffets-et-cocktails/buffets/buffet-sandwich-tarte-salade-froid?utm_campaign=test&utm_medium=mail


Again, I don't know, with these WP URLs and how search engines deal with them, whether they may end with "name-of-the-page", "name-of-the-page/" or "name-of-the-page/index.php".

Have a nice Monday, if that is possible!
Agnes

jdMorgan




 
Msg#: 4182512 posted 2:40 am on Aug 19, 2010 (gmt 0)

If you cannot track the adwords campaigns before they are redirected to remove the tracking query strings, then you'll have to get rid of the "remove query strings" function here. As a result, you will risk having duplicate content in those search engines which do not recognize these query strings for what they are.

A better approach for the long term might be to move your tracking function into a script, and then add code to call that script before redirecting to remove the query string. The script itself could do the redirects, or you could call the script from .htaccess if you have server configuration access and can use RewriteMaps.

With the current set-up, there is really no good way to resolve the duplicate content problem and preserve tracking strings at the same time.

Jim

agneslesage



 
Msg#: 4182512 posted 6:20 am on Aug 19, 2010 (gmt 0)

Thanks Jim!

I would rather stay with the first option (remove the strings), as otherwise this is getting too complicated for me.
I happen not to have cross content on this site... and I don't think, the way the campaigns go, that there is much chance robots will follow those links and find duplicate content, so it should not be too bad.

Now, I have another one or two related issues.
In fact, this may cause trouble for other people applying the optimized instructions provided, so it is good to have it on record: it seems one of the above rules is causing a problem in the WP admin menu of the e-commerce plugin (for editing categories).
Here is the guilty section:


# Redirect direct client requests for URL-path /index.php to / and for /folder/index.php to /folder/
# Likewise for default.php direct requests in root or in folder,
# and force www and strip parameters at same time, to avoid duplicate content.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]


I see 3 options:

1> remove the whole rule
Unless it affects something else in the sequence of instructions? I don't think so. It is probably not an indispensable rule...?

2> remove the question mark
as it seems to be the stripping of the parameters that messes up the admin

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1 [R=301,L]

Does this still make sense? I think so...
And that way I also solve my issue with Google campaign tracking. Please confirm this is OK.

3> change the condition
so that this rule does not apply in the wp-admin directory
But this is something only an expert like you knows how to do...!
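
A minimal sketch of what option 3 might look like, assuming the WordPress admin area lives under /wp-admin/ (the default location, not confirmed in this thread). An extra RewriteCond placed ahead of the existing one exempts that directory from the redirect; the remaining two lines are the rule exactly as quoted above.

```apache
# Assumption: the admin area is served from /wp-admin/ (WordPress default).
# Skip the index/default redirect for anything under that directory.
RewriteCond %{REQUEST_URI} !^/wp-admin/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1? [R=301,L]
```

Consecutive RewriteCond lines are ANDed, so the redirect fires only when both conditions hold. If the plugin's script actually requests /index.php with parameters at the site root rather than a file under /wp-admin/, this exclusion would not catch it, and the condition would have to test something else.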

+> Another issue:
This htaccess now causes problems if I leave it at the root of my local version of the site, as the above rule puts "www.atable.com" in the target, as does this one:


# Redirect requests for non-blank, non-canonical hostnames to canonical hostname
# do not remove the parameters in string, so skip the original final "?" on target URL
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1 [R=301,L]


I wonder if there is a way to "relativize" them (it is always a source of problems to have absolute paths and different versions to manage).
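
One hedged way to make these rules portable, assuming the local copy is reached as localhost or 127.0.0.1 (a guess; substitute the real local hostname). The idea is to exempt that host from the canonical-hostname redirect, and to build the index redirect target from %{HTTP_HOST} instead of a hard-coded domain.

```apache
# Canonical-hostname redirect, skipped on the assumed local hostnames
# (localhost / 127.0.0.1, with or without a port).
RewriteCond %{HTTP_HOST} !^(localhost|127\.0\.0\.1)(:[0-9]+)?$ [NC]
RewriteCond %{HTTP_HOST} !^(www\.atable\.com)?$
RewriteRule ^(.*)$ http://www.atable.com/$1 [R=301,L]

# Index/default redirect that reuses whatever hostname was requested,
# so the same file behaves the same locally and on the live site.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://%{HTTP_HOST}/$1? [R=301,L]
```

The trade-off is that the index rule no longer forces www in the same step: a request for a non-canonical hostname plus /index.php may take two redirects (one per rule) instead of one.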

agneslesage



 
Msg#: 4182512 posted 6:32 am on Aug 19, 2010 (gmt 0)

PS: ARG, I see my preferred option above (#2) does not work; I still have a problem in the e-commerce plugin admin when editing a category.
On

/wp-admin/admin.php?page=wpsc-edit-groups

a link

<a href='#' onclick='return showedit_categorisation_form()'>

no longer works properly.
If I comment out the rule

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*(index|default)\.(php|html?|asp)([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)(index|default)\.(php|html?|asp)$ http://www.atable.com/$1 [R=301,L]

Then it works again.
So I am left with solution #1, and I am not sure what I lose... or whether there is something better.

Also, I am still interested if you have a suggestion for relativizing the rules.

I hope I am not abusing your time!?
Thanks
