
Apache Web Server Forum

    
AHHH! Rewrite Rule results in a 400 Bad Request
Can't redirect a URL request
miketheman




msg:4012085
 8:29 am on Oct 23, 2009 (gmt 0)

Hi everyone. Can I get anyone's thoughts on why this line of code isn't working, please?

::Code Snippet::
#Redirect old page requests
Options +FollowSymLinks
RewriteEngine on
RewriteRule posting-id-(.*)-keywords-(.*)\.nai$ posting.nai?id=$1&keywords=$2

::Result::
Bad Request

Your browser sent a request that this server could not understand.

I do have mod_rewrite and mod_alias enabled, hmmmm. I'll keep reading the manuals to troubleshoot, but anyone's input is highly appreciated. It might be some earlier code or something; I can't even rewrite a URL request that comes in.

::example::
RewriteRule ^(iqc)(.*) [domain.com...] [NC]

Doesn't work....

 

jdMorgan




msg:4012195
 2:35 pm on Oct 23, 2009 (gmt 0)

Use the [L] flag on every rule, unless you can justify not doing so... 99% of all rules should have an [L] flag.

Be careful of the distinction between an internal rewrite and an external client redirect. The former is a URL-to-filepath translation occurring inside the server, while the latter is a URL-to-URL translation which requires the client to cooperate in handling a redirection response from your server. The mod_rewrite syntax for these two functions is quite different.
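
As a minimal sketch of that distinction, the two rules below differ only in their substitution and flags. The file names and www.example.com are hypothetical placeholders, not taken from this thread, and the patterns assume a per-directory context (.htaccess or a <Directory> section), where the path is matched without a leading slash.

```apache
RewriteEngine on

# Internal rewrite: the server maps the requested URL-path to a different
# file. The client never sees the new path and makes no second request.
# (old-page.html and new-page.html are hypothetical names.)
RewriteRule ^old-page\.html$ /new-page.html [L]

# External redirect: the server answers with a 301 status and a Location
# header, and the client must issue a new request for that URL.
# (www.example.com stands in for your own hostname.)
RewriteRule ^moved-page\.html$ http://www.example.com/new-page.html [R=301,L]
```

The [R=301] flag together with a full URL in the substitution is what turns the second rule into a redirect; without them the rule stays internal.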

Test with a very simple, correctly-formed rule, such as

RewriteRule ^foo\.html$ http://www.google.com/ [R=301,L]

and then examine your server error log file to see what error is being reported and why.

The next step is to redirect or rewrite to a page on your own site, but make it a static HTML page to eliminate potential scripting problems. Finally, test again using your script as the target.

If you find that your script is producing invalid redirect responses, the "Live HTTP Headers" add-on for Firefox/Mozilla browsers may prove useful in examining the HTTP transactions between your test client and the server.

Once those test rules are working, I recommend that you address the terribly-inefficient regular-expressions patterns in your original rule. Your server may run noticeably faster if you use the [L] flag and much-more specific subpatterns such as

RewriteRule ^posting-id-([^-]+)-keywords-([^.]+)\.nai$ /posting.nai?id=$1&keywords=$2 [L]

Also, in your last example, there is no need for the first set of parentheses; they can be removed and the back-reference changed to $1 instead of $2.

The several errors and inefficiencies in your code snippets indicate that you may benefit from studying the resources cited in our Forum Charter. As illustrated by my comments above, very tiny changes in mod_rewrite code can have huge effects on the performance and reliability of your server. The smallest of errors can take down your server immediately (if you're lucky) or sit quietly for years, slowly destroying your search engine rankings. So, 'guessing and experimentation' is not advisable. It is best to proceed from a solid base of knowledge.

Jim

[edited by: jdMorgan at 8:52 pm (utc) on Oct. 23, 2009]

miketheman




msg:4012406
 8:31 pm on Oct 23, 2009 (gmt 0)

Thank you again, Jim. I think I really need to do some studying on regex patterns and rewrite structures.

I did get some guidance from this thread that you guys created (thank you by the way).

[webmasterworld.com...]

miketheman




msg:4012409
 8:45 pm on Oct 23, 2009 (gmt 0)

I get this error in my log:
[Fri Oct 23 13:39:22 2009] [error] [client nn.nn.nn.nn] File does not exist: C:/Program Files/Apache Software Foundation/Apache2.2/htdocs/foo.html

I think a module may not be enabled?

I tested this example:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^foo\.html$ [google.com...] [R=301,L]

jdMorgan




msg:4012417
 8:51 pm on Oct 23, 2009 (gmt 0)

Where is this test code located? -- In a server config file or in a .htaccess file?

If in a config file, is it inside a <Directory> section or not? If so, what directory-path is declared?
If in a .htaccess file, what directory is it in, relative to the root ('home page' directory) of your web site?

Jim

miketheman




msg:4012418
 8:54 pm on Oct 23, 2009 (gmt 0)

It's in a config file..... Man, I can't believe I forgot to wrap the rule in the correct directory tags..... IT'S WORKING NOW!

Thank you Jim.

miketheman




msg:4012420
 8:58 pm on Oct 23, 2009 (gmt 0)

Yeah, it was just that simple. Since I was pulling my old code from my .htaccess file, which was in the proper directory, and integrating it into my config file, I forgot to specify which directory the rule applied to.

I felt it was something simple like this too.

Thank you again.
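
For readers who hit the same problem, here is a sketch of what the fix described above probably looks like. The directory path is the one shown in the error log quoted earlier in this thread; the rest of the surrounding config is an assumption. Outside a <Directory> section, a RewriteRule in the server config is matched against the URL-path including its leading slash (/foo.html), so a pattern written for .htaccess such as ^foo\.html$ never matches, and Apache falls through to looking for a real foo.html file.

```apache
# Directory path taken from the error log quoted in this thread;
# adjust it to your own DocumentRoot.
<Directory "C:/Program Files/Apache Software Foundation/Apache2.2/htdocs">
    Options +FollowSymLinks
    RewriteEngine on
    # Inside <Directory>, the directory prefix is stripped before matching,
    # so the pattern sees "foo.html", just as it would in .htaccess.
    RewriteRule ^foo\.html$ http://www.google.com/ [R=301,L]
</Directory>
```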

miketheman




msg:4012516
 1:57 am on Oct 24, 2009 (gmt 0)

Hey Jim or anyone. I've been studying up on RewriteRule and regex. I understand that anything in square brackets is a range; like [0-9] is any number 0-9 and [a-z] is any lowercase letter......so my question.

In the example you gave Jim,
RewriteRule ^posting-id-([^-]+)-keywords-([^.]+)\.nai$ /posting.nai?id=$1&keywords=$2 [L]

I want to try to understand each part, so I can cut down on my questions here and not ruin my server like Jim said.

the ^ ::is the start of the string::
the () ::don't know, I think Jim said it was whitespace acceptance::
the [] ::range::
THIS THOUGH ^- ::what range is this defining?::

Thank you anyone, there's nothing wrong with the code of course, just trying to understand it.

jdMorgan




msg:4012541
 3:12 am on Oct 24, 2009 (gmt 0)

Parentheses define a back-reference to a subpattern match for later use, or can be used with a quantifier to indicate that the subpattern within the preceding parentheses should be matched a certain number of times.

Square brackets define an alternate-character group, any member of which should be accepted as a match.

Escaping rules and token meanings change within groups.

Groups may contain lists of characters, or alphabetic or numeric string ranges.

Any character within the group is considered an acceptable match.

Everything in a group is considered to be a character, not a string. i.e. [abc] matches "a" or "b" or "c" not "abc". Example: "[bcdfghj-np-tv-z]" means, "Match any consonant." (You could also write it as "[b-df-hj-np-tv-z]" which is somewhat less efficient.)

Grouped alternates may be quantified, that is, a regex quantifier such as "?" or "*" or "+" or "{1,3}" may follow the group, and will define how many members of that group are required to be present at this position in the string being matched.
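
A couple of quantified groups in rule form may help here. These rules are hypothetical illustrations (page.php, article.php and the URL shapes are invented for the example), not code from this thread.

```apache
# [0-9]{1,3} : one to three digits, e.g. "page-7.html" or "page-123.html"
RewriteRule ^page-([0-9]{1,3})\.html$ /page.php?n=$1 [L]

# [a-z]+ : one or more lowercase letters.
# "s?" makes the preceding "s" optional, so both "article/" and "articles/" match.
RewriteRule ^articles?/([a-z]+)$ /article.php?slug=$1 [L]
```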

Take a look at the regular expressions tutorial cited in our Forum Charter -- You should have picked up on most of this stuff from any decent regex guide.

---

"[^-]+" means "Match one or more characters not a hyphen," or equivalently, "Match until you find the next hyphen." As the first character within a group "^" means "NOT". The parentheses around that subpattern mean that we want to store the matched part of the input string for later use as a back-reference, e.g. as "$1" in your substitution path.
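
Broken down piece by piece against a sample request, the rule reads as follows. The request path is a made-up example; the rule itself is the one given earlier in this thread.

```apache
# Hypothetical request: /posting-id-123-keywords-red-widgets.nai
#
#   ^posting-id-   literal text at the start of the path
#   ([^-]+)        one or more non-hyphen characters   -> $1 = "123"
#   -keywords-     literal text
#   ([^.]+)        one or more non-period characters   -> $2 = "red-widgets"
#   \.nai$         literal ".nai" at the end of the path
#
# Resulting internal rewrite: /posting.nai?id=123&keywords=red-widgets
RewriteRule ^posting-id-([^-]+)-keywords-([^.]+)\.nai$ /posting.nai?id=$1&keywords=$2 [L]
```

Note that the second group excludes periods rather than hyphens, which is why a keyword string containing hyphens is still captured whole.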

This negative-match pattern is much more efficient than using ".*" which means, "Match zero or more of any character(s), and match as many characters as possible." If that pattern is used, it will 'consume' the entire input string -- all the way to the end, and if additional subpatterns follow that ".*" then they will initially be 'starved' and the match will fail. The matching engine will then have to 'back off' from the end of the input string one character at a time, trying again and again to find a match for the whole pattern.

If you use multiple ".*" subpatterns you get nested backtracking (each earlier ".*" gives back one character, and every later ".*" then starts over), and potentially force dozens, hundreds, thousands, or even tens of thousands of 'back off and re-try' matching attempts, depending on how many 'promiscuous and greedy' ".*" subpatterns are present in the pattern and on the length of the input string to be matched.

By contrast, using negative-match patterns allows the string to be matched in a single left-to-right pass... :)
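
To make the contrast concrete, here is a rough trace of the greedy pattern from the original question on one hypothetical path. The steps are simplified; a real regex engine applies optimizations that this sketch ignores.

```apache
# Hypothetical path: posting-id-123-keywords-red-widgets.nai
#
# Greedy version (commented out, for comparison only):
#   posting-id-(.*)-keywords-(.*)\.nai$
#   1. The first (.*) consumes "123-keywords-red-widgets.nai", to the end.
#   2. Nothing is left for "-keywords-", so the engine backs off one
#      character at a time until "-keywords-" can match.
#   3. The second (.*) then consumes "red-widgets.nai", to the end, and
#      has to back off again until "\.nai" can match.
#
# Negated-class version: each subpattern stops by itself at its delimiter,
# so a matching path is consumed left to right without backing off.
RewriteRule ^posting-id-([^-]+)-keywords-([^.]+)\.nai$ /posting.nai?id=$1&keywords=$2 [L]
```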

You will note that in the Apache mod_rewrite documentation and in the accompanying Apache URL Rewriting Guide, there is an almost utter lack of use of the ".*" pattern... and this is a very strong hint that it should be very rarely used. However, because most people don't/won't "read the book," and because the match-anything ".*" is an easy-to-remember one-size-fits-all subpattern, you'll find truly awful (grossly inefficient) code posted all over the Web.

Until recently, most of the 'stock mod_rewrite code' that came with WP and Joomla and many control panels met this description -- we're working on that problem by example here, and I've seen signs that some of them may be reading... :)

How important is this? Sometimes not very, but possibly critical; I've seen sites which were about to be forced into major server upgrades 'saved' by simple corrections to their (previously-awful) regular-expressions patterns. The sites went from being agonizingly slow and unusable to being speedy and responsive simply by making the change discussed here (albeit in several rules, not just one). In short, writing the most efficient regex patterns possible is just a very good habit to develop and adhere to. This applies whether you're using regex in mod_rewrite, other Apache modules, CGI-SSI, Perl, PHP, 'C', JavaScript, or any other configuration, programming, or scripting 'languages' that make use of regular-expressions pattern-matching.

Jim

miketheman




msg:4012558
 3:58 am on Oct 24, 2009 (gmt 0)

WOW, I never knew regex was like a double-edged sword. Really powerful if used right, yet it can destroy my site if used improperly.

Definitely will avoid .* in my patterns. Well, instead of playing Halo tonight, it looks like I'm going to go through my scripts ([lol]+)

I do feel these are more efficient in their places (PLEASE ANYONE CORRECT ME IF I'M WRONG).
[0-9]+ ::one or more characters that are digits::
[A-Z]+ ::one or more characters that are capital letters::
[a-z]+ ::one or more characters that are lowercase letters::
[^-]+ ::one or more characters that are NOT (^) the character -, then quit looking when one is found (is that right?)::

Then, to match exactly one character, remove the + sign.
To store the match for later use, surround it in parentheses ().

I'll be putting up my new regex rewrite rule for positive criticism. I can say, Jim, you made a new wrinkle in my brain =)

I think the best thing about the negative-match pattern is that it is one-way and will not read back over itself. So, as you said, I'm guessing that results in increased performance, and given that this is my server config, if it is poorly written I can expect slow pages on every load (because the rules are evaluated on every request).

miketheman




msg:4012576
 5:44 am on Oct 24, 2009 (gmt 0)

I guess I'm still having an issue trying to match a special character literally: ?
So far I have...
RewriteRule ^buy\.php$ /posting_$1 [R=301]

The above works: it passes the header (301 code) and the variables after .php

but I'm trying to match only exact cases of
RewriteRule ^buy\.php\?listingid=$ /posting_$1 [R=301]

to result in /posting_######

Isn't the \ supposed to escape special characters like ?

miketheman




msg:4012580
 6:03 am on Oct 24, 2009 (gmt 0)

Man, this doesn't work either... *rips hair out*
::request:
domain.com/buy.php?listingid=542354

RewriteRule ^buy\.php[^=]([0-9]+)$ /posting$1 [R=301]
RewriteRule ^posting([0-9]+)(.*)$ /posting.nai?id=$1 [L]

::result should be::
domain.com/posting542354

miketheman




msg:4012600
 6:33 am on Oct 24, 2009 (gmt 0)

Tried the ASCII character code... nope

RewriteRule ^buy\.php[&#63;]listingid=$ /posting$1 [R=301]
RewriteRule ^posting([0-9]+)(.*)$ /posting.nai?id=$1 [L]

miketheman




msg:4012601
 6:34 am on Oct 24, 2009 (gmt 0)

and escaping \?

g1smd




msg:4012699
 1:13 pm on Oct 24, 2009 (gmt 0)

RewriteRule cannot see query strings.

You need a RewriteCond looking at QUERY_STRING for that.
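
A minimal sketch of that approach for the buy.php case asked about above, assuming the query string consists of nothing but listingid and a number, and assuming a per-directory context on Apache 2.2 (the version shown in the error log earlier in this thread):

```apache
RewriteEngine on

# RewriteRule only sees the URL-path ("buy.php"), so the query string is
# tested separately. The parentheses here are captured as %1, not $1.
RewriteCond %{QUERY_STRING} ^listingid=([0-9]+)$
# The trailing "?" on the substitution drops the original query string, so
# the client is redirected to /posting542354 rather than
# /posting542354?listingid=542354.
RewriteRule ^buy\.php$ /posting%1? [R=301,L]

# Internal rewrite of the new-style URL to the real script.
RewriteRule ^posting([0-9]+)$ /posting.nai?id=$1 [L]
```

If buy.php can be requested with additional parameters, the RewriteCond pattern would need to be loosened accordingly; that case is not covered by this sketch.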


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved