My book, Regular Expression Pocket Reference, has sold well over 30k copies, and I’m constantly surprised by how often I talk to someone who claims to have a copy on their desk. The thing about that book, though, is that I’m not nearly smart enough from a nuts-and-bolts or math angle to be qualified to write it. I muddled through, and with the help of amazing tech reviewers and a lot more work than it should have taken, the end result is a pretty good book.
However, by virtue of not starting out as a regex expert, I have a lot more empathy for the everyday coder who just wants to get these suckers to work. So, once the book was published, I started working on tips for everyday use.
Here’s one of my favorites, a presentation on Regular Expression Best Practices. I think I gave this at a Perl Mongers meeting a few years ago. Excuse the Perl code; all of the ideas are universal.
The basic premise of the presentation is that regular expressions are inherently difficult to write, maintain, and get right, but that we could do much better if we applied a few simple (best) practices.
Here are the inherent reasons:
A.) They have a crummy, terse syntax.
B.) We (normal programmers) don’t use them enough to become proficient.
C.) They are applied to dirty, hard-to-verify data (which is why we’re writing the regex in the first place).
Given that, we (normal programmers) then choose to ignore the normal practices of programming, practices that we use reliably with expressive, clear languages that we are experts in. The presentation identifies those normal practices and then calls them regex best practices: use white space, give the code structure, and verify and test it. Plus, the presentation has one of my favorite security gotchas, a favorite quote, and some common regex mistakes.
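To make those three practices concrete, here is a minimal sketch in Python rather than the Perl from the slides. It is not taken from the presentation: the date format, the names `YEAR`, `MONTH`, `DAY`, `DATE_RE`, and the helper `parse_date` are all made up for illustration. It shows white space and comments via `re.VERBOSE`, structure by building the pattern from named pieces, and verification with a handful of plain assertions.

```python
import re

# Hypothetical example: validate a simple date such as 2024-03-15.
# Structure: name each piece, then assemble the full pattern from them.
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>0[1-9]|1[0-2])"
DAY = r"(?P<day>0[1-9]|[12]\d|3[01])"

# White space: re.VERBOSE ignores layout and allows inline comments.
DATE_RE = re.compile(
    rf"""
    \A            # start of the string
    {YEAR}  -     # four-digit year
    {MONTH} -     # month, 01 through 12
    {DAY}         # day, 01 through 31 (no per-month check)
    \Z            # very end of the string; in Python, "$" would
                  # also match just before a trailing newline
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return (year, month, day) as ints, or None if text does not match."""
    match = DATE_RE.match(text)
    if match is None:
        return None
    return (int(match["year"]), int(match["month"]), int(match["day"]))


# Verification: test inputs that should match and inputs that should not.
assert parse_date("2024-03-15") == (2024, 3, 15)
assert parse_date("2024-13-01") is None    # month out of range
assert parse_date("2024-03-15\n") is None  # trailing newline rejected
assert parse_date("x2024-03-15") is None   # leading junk rejected
```

The negative cases matter as much as the positive ones: with dirty input, a regex that matches too much is usually the harder bug to notice.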