I've got a query string that keeps getting appended onto some URLs: ?start=0&postdays=0&postorder=asc&highlight=
I'd like to use mod_rewrite to drop it until I can find out where in the code it's being generated.
I'd like to take the URL from being "topic-vt123.html?start=0&postdays=0&postorder=asc&highlight=" to just being "topic-vt123.html".
I tried this code:
RewriteRule ^forums/(.+vt[0-9]+)\.html?start=0&postdays=0&postorder=asc&highlight=$ http://www.example.com/forums/$1.html [R=301,L]
But it didn't work.
Can anyone help me out with the code above?
Thanks!
RewriteCond %{QUERY_STRING} ^start=0&postdays=0&postorder=asc&highlight=
RewriteRule ^forums/([^\-]+-vt[0-9]+)\.html$ http://www.example.com/forums/$1.html? [R=301,L]
Jim
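For anyone wondering why the first attempt never matched: mod_rewrite tests a RewriteRule pattern against the URL-path only, and the query string has to be checked separately through %{QUERY_STRING}. The rough Python sketch below mimics that split. It is only an illustration: Python's `re` module stands in for Apache's regex engine, `redirect_target` is a hypothetical helper, and the rules are assumed to live in a .htaccess file at the site root.

```python
import re
from urllib.parse import urlsplit

# The first attempt: the query string is written into the rule pattern.
ORIGINAL = re.compile(
    r"^forums/(.+vt[0-9]+)\.html?start=0&postdays=0&postorder=asc&highlight=$"
)

# The suggested pair: one pattern for the query string, one for the path.
COND = re.compile(r"^start=0&postdays=0&postorder=asc&highlight=")
RULE = re.compile(r"^forums/([^\-]+-vt[0-9]+)\.html$")


def redirect_target(url):
    """Hypothetical stand-in for the RewriteCond/RewriteRule pair."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")      # the rule pattern only ever sees this
    if not COND.search(parts.query):   # RewriteCond %{QUERY_STRING} ...
        return None
    m = RULE.match(path)
    if m is None:
        return None
    # The trailing "?" on the substitution means "send an empty query string".
    return "http://www.example.com/forums/" + m.group(1) + ".html"


url = ("http://www.example.com/forums/topic-vt123.html"
       "?start=0&postdays=0&postorder=asc&highlight=")

# The original pattern is compared with the path alone, so it cannot match.
print(ORIGINAL.match(urlsplit(url).path.lstrip("/")))
print(redirect_target(url))
print(redirect_target("http://www.example.com/forums/topic-vt123.html"))
```

The second call returns nothing because the clean URL carries no query string, which is also what keeps the real redirect from looping.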
redirect 301 /old/old.shtml http://www.example.com/new/new.shtml
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^start=0&postdays=0&postorder=asc&highlight=
RewriteRule ^sub/([^\-]+-vt[0-9]+)\.html$ http://www.example.com/sub/$1.html? [R=301,L]
RewriteRule ^sub/(.+vt[0-9]+)_([0-9]+)\.htm$ http://www.example.com/sub/$1-$2.html [R=301,L]
RewriteRule ^sub/([^_]+)_([^.]+)\.html$ http://www.example.com/sub/$1-$2.html [R=301,L]
RewriteRule ^sub/([^_]+)\.htm$ http://www.example.com/sub/$1.html [R=301,L]
RewriteRule ^sub/.*-vp([0-9]+)\.html$ http://www.example.com/sub/post$1.html [R=301,L]
RewriteRule ^sub/.+/([^/]+\.html)$ http://www.example.com/sub/index.php [R=301,L]
RewriteCond %{HTTP_HOST} !^www.example\.com [NC]
RewriteRule ^(.*) http://www.example.com/$1 [QSA,R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /wp-index.php [L]
RewriteRule ^sub/.+-vc([0-9]+)\.html$ /sub/index.php?c=$1 [QSA,L]
RewriteRule ^sub/.+-vf([0-9]+)-([0-9]+)\.html$ /sub/viewforum.php?f=$1&start=$2 [QSA,L]
RewriteRule ^sub/.+-vf([0-9]+)\.html$ /sub/viewforum.php?f=$1 [QSA,L]
RewriteRule ^sub/.+-vt([0-9]+)-([0-9]+)\.html$ /sub/viewtopic.php?t=$1&start=$2 [QSA,L]
RewriteRule ^sub/.+-vt([0-9]+)\.html$ /sub/viewtopic.php?t=$1 [QSA,L]
RewriteRule ^sub/post([0-9]+)\.html$ /sub/viewtopic.php?p=$1 [QSA,L]
RewriteRule ^sub/member([0-9]+)\.html$ /sub/profile.php?mode=viewprofile&u=$1 [QSA,L]
RewriteRule ^sub/sitemaps\.html$ /sub/sitemaps.php [QSA,L]
RewriteRule ^sub/mx-map\.html$ /sub/sitemaps.php?mx [QSA,L]
RewriteRule ^sub/forum-map\.html$ /sub/sitemaps.php?fim [QSA,L]
RewriteRule ^sub/.+-fmp([0-9]+)-([0-9]+)\.html$ /sub/sitemaps.php?fmp=$1&start=$2 [QSA,L]
RewriteRule ^sub/.+-fmp([0-9]+)\.html$ /sub/sitemaps.php?fmp=$1 [QSA,L]
RewriteRule ^sub/.+-sc([0-9]+)\.html$ /sub/sitemaps.php?c=$1 [QSA,L]
RewriteRule ^sub/rss-?(l|s)?-?(m)?\.([xml|xml\.gz]+)$ /sub/rss.php?$1&$2 [L]
RewriteRule ^sub/sub-rss-?(l|s)?-?(m)?\.([xml|xml\.gz]+)$ /sub/rss.php?forum&c&$1&$2 [L]
RewriteRule ^sub/([a-z]+)-rss([0-9]*)-?(l|s)?-?(m)?\.([xml|xml\.gz]+)$ /sub/rss.php?$1=$2&$3&$4 [L]
RewriteRule ^sub/.+-rf([0-9]+)-?(l|s)?-?(m)?\.([xml|xml\.gz]+)$ /sub/rss.php?forum=$1&$2&$3 [L]
RewriteRule ^sub/sitemaps\.([xml|xml\.gz]+)$ /sub/sitemap.php [L]
RewriteRule ^sub/([a-z]+)-sitemap\.([xml|xml\.gz]+)$ /sub/sitemap.php?$1 [L]
RewriteRule ^sub/.+-gf([0-9]+)\.([xml|xml\.gz]+)$ /sub/sitemap.php?forum=$1 [L]
RewriteRule ^sub/urllist\.([txt|txt\.gz]+)$ /sub/urllist.php [L]
# start mod_gzip
mod_gzip_on Yes
mod_gzip_can_negotiate Yes
mod_gzip_static_suffix .gz
AddEncoding gzip .gz
mod_gzip_update_static No
mod_gzip_command_version '/mod_gzip_status'
mod_gzip_minimum_file_size 500
mod_gzip_maximum_file_size 500000
mod_gzip_maximum_inmem_size 60000
mod_gzip_min_http 1000
mod_gzip_dechunk Yes
mod_gzip_add_header_count Yes
mod_gzip_send_vary Yes
# mod_gzip_temp_dir /tmp
# mod_gzip_keep_workfiles No
# not implemented yet, compression_level, maybe next version
# mod_gzip_compression_level9
mod_gzip_handle_methods GET POST
mod_gzip_item_exclude reqheader "User-agent: Mozilla/4.0[678]"
mod_gzip_item_exclude mime ^image/
mod_gzip_item_include file \.html$
mod_gzip_item_include file \.shtml$
mod_gzip_item_include file \.htm$
mod_gzip_item_include file \.shtm$
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.phtml$
mod_gzip_item_include file \.js$
mod_gzip_item_include file \.css$
mod_gzip_item_include file \.pl$
mod_gzip_item_include handler ^cgi-script$
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include mime ^text/plain$
mod_gzip_item_include mime ^httpd/unix-directory$
AddHandler application/x-httpd-php .php .shtm .shtml .htm .html .tpl .xml .txt
Options +FollowSymlinks -Indexes
<Files .htaccess>
deny from all
</Files>
Notes: 1) I put the code you recommended earlier for removing the query string in as the first rewrite rule. 2) I added the line breaks to make things easier to read.
I tried re-ordering the .htaccess file a little and the query string rewrite still doesn't work.
Any obvious fixes in there?
So the rule pattern needs to change:
RewriteCond %{QUERY_STRING} ^start=0&postdays=0&postorder=asc&highlight=
RewriteRule ^forums/(([^\-]+-)+vt[0-9]+)\.html$ http://www.example.com/forums/$1.html? [R=301,L]
This new pattern will now match the semi-anonymized example URL-path and query you provided (via stickymail), "/forums/bull-running-in-spain-vt1306.html?start=0&postdays=0&postorder=asc&highlight="
Despite its apparent complexity, this is a far more efficient pattern than the ".+-vt" that occurs in several of your other rules, because "([^\-]+-)+" allows the requested URL to be matched in a single left-to-right pass, unlike the ambiguous ".+-", which forces multiple retry attempts to find a match.
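As a quick sanity check of the new pattern, here is a minimal Python sketch. Python's `re` module stands in for Apache's regex engine (an assumption, though the two agree on these simple constructs), and the `sub/` rule at the end is a hypothetical adaptation of one of the internal rules, not something taken from the posted file. It shows that both styles capture the same text from the example path, and that the repeated group takes over a back-reference number when it is placed in front of an existing capture.

```python
import re

path = "forums/bull-running-in-spain-vt1306.html"

# New style: each "word-" segment is consumed exactly once, so there is
# only one way to split the path.
segmented = re.compile(r"^forums/(([^\-]+-)+vt[0-9]+)\.html$")

# Ambiguous style: ".+" first swallows the rest of the path and then has
# to back off until "-vt" fits.
greedy = re.compile(r"^forums/(.+-vt[0-9]+)\.html$")

m_seg = segmented.match(path)
m_greedy = greedy.match(path)

print(m_seg.group(1))     # bull-running-in-spain-vt1306
print(m_seg.group(2))     # spain-  (last repetition of the inner group)
print(m_greedy.group(1))  # bull-running-in-spain-vt1306

# Hypothetical adaptation of an internal rule: the topic id is now the
# second group, so the substitution would need $2 where it used $1.
internal = re.compile(r"^sub/([^\-]+-)+vt([0-9]+)\.html$")
topic_id = internal.match("sub/bull-running-in-spain-vt1306.html").group(2)
print(topic_id)           # 1306
```

In the redirect rule itself nothing changes, because the outer parentheses are still group 1; the renumbering only matters where the repeated group is put in front of a capture that a substitution already uses.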
There is a second (and potentially major) problem that recurs several times in your code. You should replace the improperly coded subpattern "([xml|xml\.gz]+)" with "(xml(\.gz)?)" wherever you find it. However, you should then re-check the back-references ($1-$9) and make sure that the additional set of parentheses doesn't require you to change a back-reference number in the URL substitution. Because the "xml.gz" occurs at the end of the requested URL-path, I did not find any cases where that would be necessary, but keep this in mind.
The reason that the "([xml|xml\.gz]+)" pattern is improperly coded is that characters in square brackets have no position dependency; the square brackets enclose a group of alternate characters, any of which will be accepted as a match. Therefore, the pattern [abc] will match "a", "b", or "c", and so will the pattern [cba] or [cab].
So, "[xml|xml\.gz]+" will match any string consisting only of one or more of the characters "x", "m", "l", "g", "z", "." or "|" in any order, which is obviously not what was intended. Try requesting a URL ending in "zg.lmx" (or even just one of any of those characters) and see what happens; if the rest of the URL matches one of your rules' URL-patterns, it will be rewritten!
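The difference is easy to see outside Apache as well. The sketch below is only an illustration in Python, whose `re` module is assumed to treat these constructs the same way Apache's regex engine does; it runs the posted sitemap pattern and the suggested replacement against a few paths, including the scrambled "zg.lmx" ending.

```python
import re

# Pattern as posted (the separator is written here as a plain "|" pipe):
# the square brackets turn everything inside into a set of single characters.
char_class = re.compile(r"^sub/sitemaps\.([xml|xml\.gz]+)$")

# Suggested replacement: literal "xml", optionally followed by ".gz".
alternation = re.compile(r"^sub/sitemaps\.(xml(\.gz)?)$")

paths = [
    "sub/sitemaps.xml",
    "sub/sitemaps.xml.gz",
    "sub/sitemaps.zg.lmx",
    "sub/sitemaps.z",
]

# For each path: (matched by the posted pattern, matched by the replacement)
results = {p: (bool(char_class.match(p)), bool(alternation.match(p)))
           for p in paths}

for p, (posted, suggested) in results.items():
    print(p, posted, suggested)

# The replacement adds one nested group, here numbered 2, after the
# existing one; this is why the back-references need a second look.
print(alternation.match("sub/sitemaps.xml.gz").groups())
```

The posted pattern accepts all four paths, while the replacement accepts only the two real file names.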
Jim
If you're at the next pubcon, I owe some drinks.
It worked perfectly, and my site definitely sped up thanks to optimizing some of those rules.
Hopefully it'll help take some pages out of supplemental now.
Thanks so much!