Regular Expressions

Love? Hate? Or just another thing

         

Anyango

10:45 am on Feb 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I hate regular expressions. They remind me of college days, when we had to study "Theory of Automata" and those regular expressions just kept our heads spinning in loops all the time.
Maybe for the same reason I hate using the preg_* functions in PHP, and also writing any .htaccess entries.

What is your take on them? Do you love them, hate them, or think they are just another artifact like many others that you happily use?

I know one guy who loves them though: jdMorgan.

Cheers

chrisranjana

12:14 pm on Feb 19, 2010 (gmt 0)

10+ Year Member



Add me to the "Loves Regex" group.

jatar_k

2:02 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



They are exceptionally useful, which means I am required to use them; it doesn't much matter how I feel about 'em.

phranque

2:39 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Regular expressions are a fundamental part of Unix and have therefore been the basis for text pattern matching in all editors, languages, tools and operating systems related to *nix platforms for 40 years.

jatar_k

3:12 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



and phranque loves them

rocknbil

6:03 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Like a wife/husband/s.o., I love them when they are cooperative, hold them in contempt when they don't do what I want, and am forever in a power struggle to bend them to my will. :-)

At first it was complete contempt; I didn't understand their ways, then realized it was my shortcomings that were making regexes such a b***ch. Once I grew to understand them, they slowly imparted their power to me.

After 16 years, we still butt heads, still have times of struggle, but we need each other so it's a contract we can both live with.

jdMorgan

6:54 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The OP is correct. I'd say "Love."

After having written dozens of parsers (talk about "hate"!) over the years, I finally ran into Regular Expressions, and was impressed by the huge amount of power packed into such a compact descriptive "language." I thought, "Why didn't I think of that?" and "Wow, I sure could have saved a lot of time had I known about this earlier!" -- Coming from a hardware rather than a software background, I simply hadn't seen them before I'd been writing code for many years. (Oh and this was back in the days when computers took up whole rooms, and "personal computers" were just a dream in a few inventors' heads -- I suspect Bill Gates was still in High School...)

As a bit of hope for those frustrated by Regular Expressions, my observation is that the initial learning curve is very steep -- and for this reason: Reading regular expressions is very difficult, especially if every little nuance of the pattern isn't documented. But once you get "comfortable" with them, writing them is often very easy; it's simply easier to "construct" each bit of regex as you go, while thinking in terms of exactly what you want to match, than it is to try to divine the intent of an obscure pattern written by somebody else for some incompletely-defined purpose and left undocumented.

So, for those just getting into this, a bit of advice: Think in terms of *exactly what you want to match,* but before starting to code the regex, back away from the project for a moment, and think about it again in terms of *everything that you don't want to match.* And then thoroughly document your intent and your implementation so that when you re-visit the code next year or many years from now, you'll have a reminder.
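As a rough illustration of that advice (a sketch in Python, not from the original post): the target below, paths such as "widgets/123", is a hypothetical example, as are the names PRODUCT_PATH, SHOULD_MATCH, SHOULD_NOT_MATCH and check. The pattern is documented piece by piece, and the "must not match" list is written down alongside the "must match" list.

```python
import re

# Hypothetical target: paths like "widgets/123"
# (a lowercase category, one slash, a numeric id).
PRODUCT_PATH = re.compile(
    r"""
    ^            # start of path: nothing allowed before the category
    ([a-z]+)     # category: lowercase letters only, at least one
    /            # exactly one separator
    ([0-9]+)     # id: digits only, at least one
    $            # end of path: no trailing segments
    """,
    re.VERBOSE,
)

# Exactly what we want to match ...
SHOULD_MATCH = ["widgets/123", "a/0"]
# ... and everything we can think of that we do not want to match.
SHOULD_NOT_MATCH = [
    "", "/", "widgets/", "/123",
    "widgets/123/extra", "Widgets/123", "widgets/12a",
]


def check(pattern, good, bad):
    """Return the inputs that behave unexpectedly (empty list = intent met)."""
    wrong = [s for s in good if not pattern.match(s)]
    wrong += [s for s in bad if pattern.match(s)]
    return wrong


print(check(PRODUCT_PATH, SHOULD_MATCH, SHOULD_NOT_MATCH))  # []
```

The comments inside the pattern serve as the documentation, and the two lists record the intent so it can be re-checked whenever the pattern changes.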

Most flaws in regex, many frustrations, and many server upgrades and/or performance problems flow from the over-use of the "easy" but ambiguous, promiscuous, and greedy ".*" pattern, which matches anything, everything, or blank, when a much-more-specific pattern could have been written and used to avoid the trouble from the start. If you've got any patterns like "^(.*)/(.*)$" or "^(.*)-(.*)-(.*)$", you can easily optimize them -- and possibly see an immediate improvement in the responsiveness of your site. (Hint: "^([^/]+/.+)$" and "([^-]+-[^-]+-.+)$" are much better, respectively, if appropriate to the application.)
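A minimal sketch of the difference, in Python rather than in a rewrite rule (not from the original post). The "specific" pattern here is a two-group variant of the hinted one, an assumption made so the captures can be compared side by side; split_path is a hypothetical helper.

```python
import re

# The ".*" pattern quoted above, and a more specific two-group variant
# (the variant is an assumption, not the exact pattern from the post).
greedy = re.compile(r"^(.*)/(.*)$")
specific = re.compile(r"^([^/]+)/(.+)$")


def split_path(pattern, path):
    """Return the captured groups, or None when the pattern does not match."""
    m = pattern.match(path)
    return m.groups() if m else None


# The greedy version runs to the end of the string and backtracks to the
# LAST slash; the specific version stops at the FIRST slash.
print(split_path(greedy, "a/b/c"))    # ('a/b', 'c')
print(split_path(specific, "a/b/c"))  # ('a', 'b/c')

# ".*" also matches blank, so a lone slash is accepted with empty captures.
print(split_path(greedy, "/"))        # ('', '')
print(split_path(specific, "/"))      # None
```

The specific version says what each part may contain, so the engine has far less to try and then undo, and blank segments are rejected outright.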

Jim

mack

6:59 pm on Feb 19, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I love how it works, but hate the path to making it work :)

Regex is one of those things you either get it, or you don't. Sadly I fall into the 2nd group, although I have been reading up.

One day the penny will drop and it will all make sense.

Mack.

eelixduppy

4:26 pm on Feb 20, 2010 (gmt 0)



After taking a class titled "Computation and Automata" I went from love to hate...

but they are pretty useful.

blend27

1:49 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not sure why, but for some reason I don't seem to get the concept; well, not the concept, but the rules of what happens and when. I've looked at several sites out there that simply try to spell out what has already been said on other websites. I got some of the basics down. Maybe a good book with practical examples would do.

Jim, is there a really GOOD BOOK out there?

g1smd

2:47 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You don't need a book, you need to look at [webmasterworld.com...] more often.

:)


As well as the common problem of overusing .*, the other problems seen are using too-restrictive patterns for redirects (and therefore failing to redirect some requests that should be redirected) and too-permissive patterns in rewrites (promoting duplicate content).
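To make those two failure modes concrete, here is a small Python sketch (not from the original post). Every URL and pattern in it is hypothetical, and Python's re module stands in for the server's rewrite engine purely for illustration.

```python
import re

# Hypothetical redirect rule: send index pages to the bare folder URL.
too_restrictive = re.compile(r"^index\.html$")
broader = re.compile(r"^(?:.*/)?index\.(?:html?|php)$")

# Hypothetical rewrite rule: map product URLs onto a script.
too_permissive = re.compile(r"^product/(.*)$")
stricter = re.compile(r"^product/([0-9]+)$")


def matched(pattern, paths):
    """Return the subset of paths that the pattern accepts."""
    return [p for p in paths if pattern.match(p)]


index_requests = ["index.html", "index.php", "shop/index.html"]
# The restrictive redirect misses two requests that should be redirected.
print(matched(too_restrictive, index_requests))  # ['index.html']
print(matched(broader, index_requests))          # all three paths

product_requests = ["product/42", "product/42/", "product/42-junk", "product/"]
# The permissive rewrite serves content for every variant (duplicate URLs).
print(matched(too_permissive, product_requests))  # all four paths
print(matched(stricter, product_requests))        # ['product/42']
```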

[edited by: lawman at 5:37 pm (utc) on Feb 22, 2010]
[edit reason] Fixed Link [/edit]

rocknbil

7:04 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can't remember who, but a member here was raving about never having to write regexps because of a regexp builder/program that does it for you.

I've never looked it up, but then, I do my own wrenching when the truck needs it . . . but it shouldn't be hard to find.

wheel

7:37 pm on Feb 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think it's like having a multiwrench - a great tool if you know how to use it. It just takes some time to learn how to use it.

I don't know regexes; one of these days I'm going to learn them. In the meantime I just pester the folks in the Apache forum.

Anyango

6:43 am on Feb 23, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought I was alone ;) but it seems like phranque and jdMorgan are alone ;)

blend27

11:05 am on Feb 23, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



g1smd, that's the whole thing: I don't do Apache, and I've never written a line of PHP or Perl in my life.

I was recently doing a project that involved URL rewrites on IIS6/Helicon ISAPI_Rewrite. It felt like I was trying to swim on a concrete floor for the first time. Eventually I figured out what I needed, but boy, that was a hassle....

graeme_p

6:00 am on Feb 25, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I find simple regexps very useful, but avoid complex ones because they are difficult and bug-prone.