findme

rewrite conditional as regex, no rewriteCond

         

sergiozambrano

10:23 pm on Oct 5, 2011 (gmt 0)

10+ Year Member



I wrote a regex for a rewrite rule that includes a negative lookahead to replace the rewriteCond line, because WordPress only accepts two values: pattern -> substitution.

It should always find findme.html in _here, regardless of where it is requested:
mydomain.com/_here/findme.html

(Sorry, I can't modify the swf which will request findme.html in the wrong places)

So, given that findme.html could be requested at, e.g.:

mydomain.com/findme.html
mydomain.com/directory/findme.html
mydomain.com/directory/findme.html?someparam=3


The rewrite should send them all to

mydomain.com/_here/findme.html


So I wrote a rewrite rule that WordPress will accept, as follows:

Options +FollowSymlinks
RewriteEngine On
RewriteRule ^.*(?!_here)/*findme\.html$ /_here/findme.html [R=301,L]


The negative lookahead is there so it only matches URLs that don't contain "_here", to prevent extra rewriting or a loop.

The problem is IT DOES loop.

What did I miss?

lucy24

6:26 am on Oct 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sit tight. Nobody is ignoring you. Although when g1 comes around and sees that leading .* in your Rule, you will wish everyone was ignoring you.

Don't know about anyone else, but I got stunned into near-speechlessness at the idea of being brave enough, or foolhardy enough, to put a lookahead in mid-statement anywhere, let alone in htaccess.

WordPress won't allow RewriteCond? There's got to be a way around that.

g1smd

6:47 am on Oct 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The
^.*
part matches the whole input string right to the very end. The parser then sees there is more stuff in the pattern and it then has to perform hundreds of "back off and retry" trial matches to find out what you really meant.
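The backtracking described above can be made visible with a short Python sketch. It assumes Python's re module behaves like Apache's PCRE engine for this pattern, and that the path is matched without its leading slash, as in .htaccess. The capture group around the leading .* and the helper name consumed_by_dot_star are added here only for illustration.

```python
import re

# Pattern from the RewriteRule in the first post, with a capture group
# added around the leading .* purely to show what it consumed.
# Assumption: .htaccess context, so the path has no leading slash.
PATTERN = re.compile(r"^(.*)(?!_here)/*findme\.html$")

def consumed_by_dot_star(path):
    """Return what the leading .* swallowed, or None if there is no match."""
    m = PATTERN.match(path)
    return m.group(1) if m else None

for path in ("findme.html", "directory/findme.html", "_here/findme.html"):
    print(repr(path), "->", repr(consumed_by_dot_star(path)))
```

For _here/findme.html the leading .* ends up holding "_here/", so by the time the lookahead is tested the "_here" text is already behind it and the rule matches the redirect target as well.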

RewriteCond %{REQUEST_URI} !here
is the usual way to prevent a loop.

/*
means "any number of consecutive slashes, including none" - that is surely an error.

The redirect target should contain protocol and domain name.

sergiozambrano

10:54 am on Oct 6, 2011 (gmt 0)

10+ Year Member



There's probably a way to get WP to "print" the whole conditional and rule to the .htaccess file, but I want it to be aware of what it's adding. There must be a reason why they did it this way. I'm looking into that in the meantime.

I have the same .* working fine, without the neg lookaround, with a conditional before it, but I think the regex can do it by itself.

I tested the regex with the gskinner.com tool and it worked fine, but thanks for the helpful answer. I'll change that * and will keep you posted.

sergiozambrano

11:10 am on Oct 6, 2011 (gmt 0)

10+ Year Member



[UPDATE]
I changed the rule to:
RewriteRule ^.*?(?!_here/)findme\.html$ /_here/findme.html [R=301,L] 


I made the leading .* lazy, and added the slash inside the negative lookahead because if _here is found it will always be followed by one.

Still looping :(
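One possible reason the updated rule still loops is that the lookahead is only tested after .*? has already stepped past "_here/". The sketch below compares it with a hypothetical alternative that is not from the thread, in which the lookahead is tested once at the very start of the path. It assumes Python's re module behaves like Apache's PCRE for these patterns and that the path has no leading slash; the names LAZY, ANCHORED and matches are illustrative only.

```python
import re

# Updated rule from this post (lazy .*? and a slash inside the lookahead).
LAZY = re.compile(r"^.*?(?!_here/)findme\.html$")

# Hypothetical alternative, not from the thread: test the lookahead once,
# at the start of the path, before anything has been consumed.
ANCHORED = re.compile(r"^(?!_here/)(?:[^/]+/)*findme\.html$")

def matches(pattern, path):
    """True if the whole path satisfies the pattern."""
    return pattern.match(path) is not None

PATHS = ("findme.html", "directory/findme.html", "_here/findme.html")
for path in PATHS:
    print(path, matches(LAZY, path), matches(ANCHORED, path))
```

In this sketch the lazy pattern still matches _here/findme.html, while the anchored one rejects it and accepts the other two paths. Whether WordPress accepts such a pattern, and how it behaves on a live server, is untested here.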

lucy24

6:26 pm on Oct 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Lookaheads are lovely things, but frankly I wouldn't trust one here.

Here is the basic way to pick up groups of directories:

([^/]+/)*

Do any of your other directories contain _ ? If not, you have got it made, because all you need is

^([^/_]+/)*findme\.html$

If they might contain lowlines, but never begin with one, the pattern becomes

^([^/_][^/]+/)*findme\.html$

Replace the + with a * if you have directories with-- ugh!-- single-letter names.
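The lookahead-free approach can be checked quickly in Python before touching the server. This is a rough sketch, assuming Python's re module agrees with Apache's PCRE on these patterns and that paths arrive without a leading slash. The sample paths, including my_dir, are made up, and the second pattern is an illustrative variant that forbids only a leading underscore.

```python
import re

# First pattern above: no directory name may contain an underscore.
NO_UNDERSCORE = re.compile(r"^([^/_]+/)*findme\.html$")

# Illustrative variant (an assumption of this sketch): only a leading
# underscore is forbidden; underscores later in a name are allowed.
NO_LEADING_UNDERSCORE = re.compile(r"^([^/_][^/]+/)*findme\.html$")

def matches(pattern, path):
    """True if the whole path satisfies the pattern."""
    return pattern.match(path) is not None

SAMPLES = (
    "findme.html",
    "directory/findme.html",
    "my_dir/findme.html",
    "_here/findme.html",
)
for path in SAMPLES:
    print(path, matches(NO_UNDERSCORE, path), matches(NO_LEADING_UNDERSCORE, path))
```

Both patterns reject _here/findme.html, so the redirect target cannot trigger the rule again. They differ only on my_dir/findme.html, which the first pattern refuses and the variant accepts.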

If, on the other hand, you don't mean literal "_here" at all but are just using that as a made-up example, you are going to have to talk firmly to your WP installation and force it to go along with Conditions. Which creates a pleasant mental picture.

Oh, and: Where you said "lazy" I think the standard term is "stingy". (Without the ? it is "greedy".) This kind of ? is useful when you want to stop at the first occurrence of something that might occur many times, like the word "the" in a text file. But in htaccess there is almost always a cleaner alternative.

g1smd

7:59 pm on Oct 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You should never begin a pattern with .* as it's the root cause of so many problems we see in this forum.

The redirect target should contain protocol and domain name.