Trouble with one Rewrite

apache rewrite

         

PartyPooper

11:04 pm on Jan 2, 2009 (gmt 0)




Hello,

Please bear with me, I'm kinda new at this. I've read the documentation and several tutorials. I have written several rewrites and they all work without problems, with the exception of this one. It never rewrites.

Here's a sample URI -
http://www.example.com/news.php?current=viewrss&rssid=2&linkid=5

And the Rewrite -
RewriteRule ^(.*)-rid-([0-9]+)-link-([0-9]+).html$ news.php?current=viewrss&rssid=$2&linkid=$3 [L]

I have a similar one that works:
Here's a sample URI -
http://www.example.com/news.php?current=viewrss&rssid=2

And the working Rewrite -
RewriteRule ^(.*)-rid-([0-9]+).html$ news.php?current=viewrss&rssid=$2 [L]

Any help is appreciated.

g1smd

11:34 pm on Jan 2, 2009 (gmt 0)




That's a very dangerous rewrite: the $1 part isn't verified, so an unlimited number of different URLs will return the same page, creating infinite Duplicate Content.

In this context it is also very inefficient.

Your sample URI is one that directly accesses the script using parameters.

You didn't supply an example URI for use with the rewrite. That is, your example URL doesn't use the rewrite.

jdMorgan

11:37 pm on Jan 2, 2009 (gmt 0)




We need to see the URLs you are requesting as well as these query-string filepaths that you expect them to be rewritten to. What does a "rid-number-link-number.html" URL actually look like?

You also need to be aware that "(.*)" is going to match anything and everything, and you may need a much-more-specific subpattern everywhere you tried to use that.
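To make that concrete, here is a minimal sketch that runs the posted pattern through a regex engine. It uses Python's re module as a stand-in for mod_rewrite's matching, which is an assumption that should hold for simple patterns like this one. The sample paths and the helper name loose_matches are made up for illustration.

```python
import re

# The pattern from the working rule, copied as posted (note the unescaped dot).
LOOSE = re.compile(r"^(.*)-rid-([0-9]+).html$")

# Hypothetical request paths, written the way a rule in a root .htaccess
# would see them (no leading slash).
paths = [
    "California-Politics-rid-2.html",
    "anything/at/all-rid-2.html",
    "-rid-2.html",
    "California-Politics-rid-2xhtml",
]


def loose_matches(path):
    """Return (title, rssid) if the loose pattern matches, else None."""
    m = LOOSE.match(path)
    return (m.group(1), m.group(2)) if m else None


for p in paths:
    print(p, "->", loose_matches(p))
```

All four sample paths match: the "(.*)" accepts any prefix (even an empty one or one containing slashes), and the unescaped dot before "html" accepts any single character.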

Also, the code's behavior will depend on the order in which these two rules (and any other rules) occur. Don't be shy -- post them.

Jim

PartyPooper

12:14 am on Jan 3, 2009 (gmt 0)




An example URL for the rule that is working:

RewriteRule ^(.*)-rid-([0-9]+).html$ news.php?current=viewrss&rssid=$2 [L]

If the RSS newsfeed were called "California Politics", the URL would be: http://www.example.com/California-Politics-rid-2.html

RSSID is the unique id in the RSS db.

I have a feeling of impending doom by posting this, but here is the entire rewrite:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.example.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*home.php\ HTTP/
RewriteRule ^(([^/]+/)*)home\.php$ http://www.example.com/$1 [R=301,L]
RewriteRule ^All-News-view.html$ news.php?current=cat [L]
RewriteRule ^(.*)-nid-([0-9]*).html$ news.php?current=view&nid=$2 [L]
RewriteRule ^(.*)-rid-([0-9]+)-([0-9]+).html$ news.php?current=viewrss&rssid=$2&linkid=$3 [L]
RewriteRule ^(.*)-rid-([0-9]+).html$ news.php?current=viewrss&rssid=$2 [L]
</IfModule>

g1smd

12:32 am on Jan 3, 2009 (gmt 0)





RewriteBase / 
is the default and so it isn't needed.


RewriteCond %{HTTP_HOST} !^$
--- "is not blank"

can be simplified to

RewriteCond %{HTTP_HOST} .
--- "contains something". The dot is vital.
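As a quick sanity check of that equivalence, here is a small sketch using Python's re module in place of Apache's regex engine (an assumption; the two agree for patterns this simple). The function names and sample hostnames are hypothetical.

```python
import re


def not_blank_negated(host):
    """Mimic '!^$': true when the pattern ^$ does NOT match."""
    return re.search(r"^$", host) is None


def not_blank_dot(host):
    """Mimic '.': true when at least one character is present."""
    return re.search(r".", host) is not None


# Hypothetical Host header values, including the blank case.
for host in ["", "www.example.com", "example.com"]:
    print(repr(host), not_blank_negated(host), not_blank_dot(host))
```

Both tests give the same answer for every sample value; the single-dot form simply avoids the negation.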


Do yourself a huge favour and add a blank line after each Rule.

Then, add a # comment before each block to describe what the next block [(rule) or (condition and rule)] actually does.

g1smd

12:47 am on Jan 3, 2009 (gmt 0)




The (.*) notation is dangerously inefficient, but there is another problem. For your URL:

http://www.example.com/California-Politics-rid-2.html

I could request:

http://www.example.com/Communists-In-The-White-House-Ban-All-Politics-rid-2.html

and your site would return the exact same content, because you don't validate the value in $1 against the real title for that content.

The only way to do that is to pass the $1 value to the script and make it look up the real title in the database.

If the title is correct, then serve the content.

If the title is incorrect, then send a 301 redirect to the correct URL with the correct title included within that new URL.
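The check described above could look roughly like the sketch below. It is written in Python only to show the logic; the real check would live in news.php. The FEED_TITLES dictionary is a hypothetical stand-in for the RSS database, and the slug and resolve helpers are invented names, not anything from the poster's script.

```python
import re

# Hypothetical stand-in for the RSS database: rssid -> real feed title.
FEED_TITLES = {2: "California Politics"}


def slug(title):
    """Turn a title into its URL form: 'California Politics' -> 'California-Politics'."""
    return re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-")


def resolve(requested_title, rssid):
    """Return (status, location) for the title and id taken from the URL."""
    real_title = FEED_TITLES.get(rssid)
    if real_title is None:
        # Unknown id: nothing to serve.
        return 404, None
    canonical = slug(real_title)
    if requested_title == canonical:
        # Title matches the database: serve the content.
        return 200, None
    # Title is wrong: redirect to the URL that carries the real title.
    return 301, f"http://www.example.com/{canonical}-rid-{rssid}.html"


print(resolve("California-Politics", 2))
print(resolve("Communists-In-The-White-House-Ban-All-Politics", 2))
```

For this to work, the rewrite rule would also have to pass $1 to the script as an extra query-string parameter, whose name is left to the site owner.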

jdMorgan

3:22 am on Jan 3, 2009 (gmt 0)




Doom: Removing the unnecessary <IfModule> and RewriteBase, reordering the rules for correct function, and making several performance and functional tweaks and improvements, I'd recommend:

RewriteEngine on
#
# Externally redirect direct client requests for <anydir>/home.php
# to http://www.example.com/<anydir>/ except for SSL requests
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*home\.php\ HTTP/
RewriteRule ^(([^/]+/)*)home\.php$ http://www.example.com/$1 [R=301,L]
#
# Externally redirect to canonical domain if the requested
# hostname is not *exactly* www.example.com or blank
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# Internally rewrite SEO-friendly URLs to news.php script
RewriteRule ^All-News-view\.html$ news.php?current=cat [L]
RewriteRule ^([^\-]+-)+rid-([0-9]+)-([0-9]+)\.html$ news.php?current=viewrss&rssid=$2&linkid=$3 [L]
RewriteRule ^([^\-]+-)+rid-([0-9]+)\.html$ news.php?current=viewrss&rssid=$2 [L]
RewriteRule ^([^\-]+-)+nid-([0-9]+)\.html$ news.php?current=view&nid=$2 [L]

Even with all that, it is not clear why your "rssid linkid" rule didn't work, since the example URL you provided was for the other rule. I can't tie it all together without the requested URL, the target filepath, and the rule. It should work fine for a URL like "www.example.com/California-Politics-rid-22-33.html" and rewrite that specific URL to the script at the filepath /news.php with a query string of "current=viewrss&rssid=22&linkid=33".

This assumes that the script knows what to do with the additional linkid parameter, but even if it didn't, the rewrite itself should still work.
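The expected behavior of the four internal rewrite rules can be checked offline with a rough simulation. The sketch below uses Python's re module in place of mod_rewrite (an assumption that holds for these simple patterns) and only mimics first-match-wins with $N back-references; it ignores everything else mod_rewrite does. The rewrite helper and the sample paths are hypothetical.

```python
import re

# (pattern, substitution) pairs, in the same order as the rules above.
RULES = [
    (r"^All-News-view\.html$", "news.php?current=cat"),
    (r"^([^\-]+-)+rid-([0-9]+)-([0-9]+)\.html$",
     "news.php?current=viewrss&rssid=$2&linkid=$3"),
    (r"^([^\-]+-)+rid-([0-9]+)\.html$", "news.php?current=viewrss&rssid=$2"),
    (r"^([^\-]+-)+nid-([0-9]+)\.html$", "news.php?current=view&nid=$2"),
]


def rewrite(path):
    """Return the target of the first matching rule, or None if none match."""
    for pattern, target in RULES:
        m = re.match(pattern, path)
        if m:
            # Expand $1..$9 with the captured groups (empty if unset).
            return re.sub(r"\$([0-9])",
                          lambda ref: m.group(int(ref.group(1))) or "",
                          target)
    return None


for p in ["California-Politics-rid-22-33.html",
          "California-Politics-rid-2.html",
          "Some-Story-nid-7.html",
          "rid-2.html"]:
    print(p, "->", rewrite(p))
```

The last sample path has no title segment in front of "rid-", so none of the rules match it; the "([^\-]+-)+" subpattern requires at least one hyphen-terminated word.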

The comments g1smd made about the unchecked "title" string at the beginning of these URLs are right on target. This is a duplicate-content vulnerability that is quite dangerous. The suggested solution of passing that string to the script, having the script validate it against the database, and either rejecting the request with a 404 or 301-redirecting it if the "title" doesn't validate, is indeed the best practice.

One more thing: It appears that the site supports HTTPS, in which case the second redirect rule should be modified to preserve the HTTP/HTTPS protocol:


# Externally redirect to canonical domain if the requested hostname is not
# *exactly* www.example.com or blank, preserving the HTTP/HTTPS protocol
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteCond %{SERVER_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule (.*) http%2://www.example.com/$1 [R=301,L]

If you don't use SSL, then you don't need this, and you can just remove the first RewriteCond in the main code block above -- the one that looks for port 443.
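The port trick in that RewriteCond is easier to follow when run by hand. The sketch below reproduces it with Python's re module (an assumption: Python and Apache agree on a pattern this simple), using a solid pipe in the alternation. The helper name scheme_for_port is hypothetical. In Apache an unset %2 expands to an empty string, which the sketch imitates explicitly.

```python
import re

# Same pattern as the RewriteCond: the test string is the port with a
# literal "s" appended, and group 2 captures that "s" only for port 443.
PORT_PATTERN = re.compile(r"^(443(s)|[0-9]+s)$")


def scheme_for_port(server_port):
    """Mimic 'http%2' for a given SERVER_PORT value."""
    m = PORT_PATTERN.match(f"{server_port}s")
    if m is None:
        return None
    # Group 2 is unset unless the first alternative matched.
    return "http" + (m.group(2) or "")


for port in ["80", "443", "8080"]:
    print(port, "->", scheme_for_port(port))
```

Port 443 yields "https" and every other numeric port yields "http", so the redirect keeps whichever protocol the request arrived on.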

Jim