An Inconvenient Glark: Regular Expressions

Nothing can turn a programmer's mind into tapioca pudding faster than trying to understand other people's regular expressions (OPREs). Friends, there is nothing regular about regular expressions. The quiet secret -- seldom shared outside of computer nerd circles -- is that regular expressions are commonly used to validate values and provide protection from malicious user input. How did we get to the place where we trust code that few people other than the author can understand? Dunno. Most of the time we figure out OPREs based on code comments. Or in hacker speak, we glark the meaning from the comments.

Let me illustrate how ugly regular expressions are with a few examples. Here are some gems from the Regular Expression Library.

^(([0-9])|([0-1][0-9])|([2][0-3])):(([0-9])|([0-5][0-9]))$
What does this code do? It matches time in 24-hour format. For example, these values match: 1:59, 01:59 and 23:59. These values do not match: 12:63, 25:60 and 13.10. Obvious, right?

^(?:\([2-9]\d{2}\)\ ?|[2-9]\d{2}(?:\-?|\ ?))[2-9]\d{2}[- ]?\d{4}$
Doesn't this look like something you'd see scrawled across a blackboard in an episode of Numb3rs? This code matches ten-digit US phone numbers. These values match: 2225551212, 222 555 1212, 222-555-1212, (222) 555 1212 and (222) 555-1212.

/^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([-]+)?([a-z0-9]+))?)*((.[a-z]{2,3})?(.[a-z]{2,6}))$/i
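Feel like curling up in the fetal position yet? This expression is meant to match valid email addresses. Look closely, though: the dots in front of the domain suffixes are unescaped, so they match any character rather than only a literal period. That is exactly the kind of detail that hides in an OPRE.

My advice for developers is to document regular expressions with clear comments. This will help your teammates preserve their sanity.

Posted at 4:47 PM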
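To make the advice about commenting concrete, here is a minimal sketch of the 24-hour time pattern from this post, rewritten with inline comments. It assumes Python and its re.VERBOSE flag, which ignores whitespace inside the pattern and allows # comments. The names TIME_24H and is_24h_time are invented for this illustration and do not come from the Regular Expression Library.

```python
import re

# Commented rewrite of the post's 24-hour time pattern.
# TIME_24H and is_24h_time are hypothetical names used only in this sketch.
TIME_24H = re.compile(
    r"""
    ^
    (?: [0-9]        # single-digit hour: 0-9
      | [0-1][0-9]   # two-digit hour: 00-19
      | 2[0-3]       # two-digit hour: 20-23
    )
    :                # literal colon between hours and minutes
    (?: [0-9]        # single-digit minute: 0-9
      | [0-5][0-9]   # two-digit minute: 00-59
    )
    $
    """,
    re.VERBOSE,
)


def is_24h_time(text: str) -> bool:
    """Return True if text matches the commented 24-hour pattern."""
    return TIME_24H.match(text) is not None


if __name__ == "__main__":
    # Sample values taken from the post.
    for value in ["1:59", "01:59", "23:59", "12:63", "25:60", "13.10"]:
        print(value, is_24h_time(value))
```

The pattern is the same one shown earlier, with the capturing groups swapped for non-capturing ones. The only real change is that each alternative now says what it is for, so a teammate can glark it without running it.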