Which regex is more efficient?

2 examples

         

StaceyJ

8:19 pm on Jan 13, 2011 (gmt 0)

10+ Year Member



I have a few (7) very similar URLs that I need to redirect, such as:

www.example.com/this/that/theother/word1word2word3

where word1 will vary but word2word3 will stay the same. Which would be the more efficient regex, or is there a better way?

RewriteRule ^this/that/theother/(\w)word2word3$

or

RewriteRule ^this/that/theother/(word1|word11|word111)word2word3$

Thank you!

wilderness

8:42 pm on Jan 13, 2011 (gmt 0)

StaceyJ

9:21 pm on Jan 13, 2011 (gmt 0)

10+ Year Member



I left out the + in the first example; it should be:

RewriteRule ^this/that/theother/(\w+)word2word3$

g1smd

1:11 am on Jan 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The PCRE pattern (\w+) matches all the way to the end of the input and then backs off and retries many times until a match is found. I try to avoid this.
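A rough way to picture the back-off described above is to count it by hand. The sketch below is a simplified model written in Python, not PCRE itself: the function name `count_greedy_attempts` and the sample strings are made up for illustration, and real engines apply optimizations that this model ignores.

```python
def count_greedy_attempts(subject: str, literal_tail: str) -> int:
    r"""Model a greedy (\w+) followed by a literal tail and an end anchor.

    Returns how many positions are tried for the tail before it matches,
    or -1 if no split works. Simplified: assumes every character in
    subject is a word character, so \w+ first swallows the whole string.
    """
    attempts = 0
    # Greedy: start with the longest capture, then give back one character at a time.
    # The loop stops at 1 because \w+ must keep at least one character.
    for end in range(len(subject), 0, -1):
        attempts += 1
        if subject[end:] == literal_tail:
            return attempts
    return -1


attempts = count_greedy_attempts("word1word2word3", "word2word3")
print(attempts)
```

In this model the number of attempts grows with the length of the fixed tail, because the capture has to hand back every one of those characters before the tail can match.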

The other pattern using "local OR" logic, is OK when there is only a small number of options, but becomes unwieldy (and slow{er}) when the list is long.

If there were a separator between the words, your pattern could be ([^\-]+) which means "read until the next hyphen". Readable in a single left-to-right pass this is the fastest of all.
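To illustrate the negated-class idea, here is a small Python sketch. It assumes a hypothetical URL layout in which the three words are joined by hyphens, which is not the case for the URLs in this thread. Python's `re` module stands in for Apache's PCRE engine, and `HYPHENATED` is an illustrative name.

```python
import re

# Hypothetical layout: word1-word2-word3 (the real URLs have no separators).
# [^-]+ reads forward until the first hyphen and stops there.
HYPHENATED = re.compile(r"^this/that/theother/([^-]+)-word2-word3$")

match = HYPHENATED.match("this/that/theother/word11-word2-word3")
print(match.group(1) if match else None)
```

Because the class cannot cross a hyphen, the capture stops at the first separator instead of running to the end of the path and backing off.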

StaceyJ

3:47 am on Jan 14, 2011 (gmt 0)

10+ Year Member



> The PCRE pattern (\w+) matches all the way to the end of the input and then backs off and retries many times until a match is found. I try to avoid this.

Thanks. I've never used \w+ before, but it seemed at least better than (.+) here for a start.

> The other pattern using "local OR" logic, is OK when there is only a small number of options, but becomes unwieldy (and slow{er}) when the list is long.

There would be 7 OR's in this case.

> If there were a separator between the words, your pattern could be ([^\-]+) which means "read until the next hyphen". Readable in a single left-to-right pass this is the fastest of all.

I know, but there aren't any; all the words are run together. There are separators in the substitution URL, but that doesn't help. I was racking my brain trying to find a negative pattern match, but I can't find one. There is nothing that separates anything; they are 3 words run together.

Thank you for your input!

jdMorgan

2:43 pm on Jan 14, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The local-OR option would be much faster than "\w+" or ".*" because it will reduce the number of characters which must be repeatedly "backed off", as g1smd pointed out.

Jim
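The two candidate patterns from the question can also be compared side by side. The sketch below uses Python's `re` module as an approximation of mod_rewrite's PCRE matching; the word list is the three-item placeholder from the original question, and the names `OPEN_ENDED`, `ALTERNATION`, and `check` are illustrative.

```python
import re

OPEN_ENDED = re.compile(r"^this/that/theother/(\w+)word2word3$")
ALTERNATION = re.compile(r"^this/that/theother/(word1|word11|word111)word2word3$")


def check(path: str) -> tuple:
    """Return the captured first word for each pattern, or None if no match."""
    a = OPEN_ENDED.match(path)
    b = ALTERNATION.match(path)
    return (a.group(1) if a else None, b.group(1) if b else None)


# A listed word: both patterns capture the same text.
print(check("this/that/theother/word11word2word3"))
# An unlisted word: only the open-ended pattern matches.
print(check("this/that/theother/somethingelseword2word3"))
```

Apart from any speed difference, the alternation accepts only the listed words, so an unexpected path is left alone rather than redirected.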