Regex unit test suite?

I’m reading Mastering Regular Expressions, by Jeffrey Friedl. Regular expressions come in a lot of different flavors and dialects. In reading the book, I realized that when I’ve used grep for text searching, sometimes my regex has failed because I was using the + metacharacter, which grep doesn’t treat as a metacharacter by default! (I’m using Cygwin’s GNU grep 2.5.3. In its default basic mode a bare + is a literal plus sign; one-or-more is spelled \+ there, or + under grep -E.)

Wouldn’t it be nice to have a regex unit test suite that you could run a utility against and see for certain what metacharacters it supports?  I’m envisioning something sort of like a “configure” script, except instead of storing configuration settings it would just print them to the screen.

Some settings that might be useful:

  • Does this tool support the + metacharacter?
  • For grouping, should I use ( ) or \( \) ?
  • Does this tool support the {min,max} (or \{min,max\}) syntax?
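A minimal sketch of what such a probe could look like, here pointed at Python's own re module rather than at grep. The names PROBES, probe, and report are made up for illustration. Each probe pairs a pattern with a sample text and the match expected if the feature is supported, so a pattern that compiles but is read literally counts as "no".

```python
import re

# Each probe: (description, pattern, sample text, match expected if supported).
# All names here are hypothetical; this is a sketch, not an existing tool.
PROBES = [
    ("+ means one-or-more", r"^a+$", "aaa", "aaa"),
    ("( ) groups", r"^(ab)*$", "abab", "abab"),
    (r"\( \) groups", r"^\(ab\)*$", "abab", "abab"),
    ("{min,max} interval", r"^a{2,3}$", "aaa", "aaa"),
    (r"\{min,max\} interval", r"^a\{2,3\}$", "aaa", "aaa"),
]


def probe(pattern, text, expected):
    """Return True if the pattern compiles and matches exactly `expected`."""
    try:
        m = re.search(pattern, text)
    except re.error:
        # The engine rejected the syntax outright.
        return False
    return m is not None and m.group(0) == expected


def report():
    """Run every probe and return {description: supported?}."""
    return {desc: probe(p, t, e) for desc, p, t, e in PROBES}


if __name__ == "__main__":
    # Print the findings to the screen, configure-script style.
    for desc, ok in report().items():
        print(f"{'yes' if ok else 'no':3}  {desc}")
```

Because the backslashed forms are plain literals in Python's flavor, those two probes come back as unsupported there, which is exactly the kind of answer the report is meant to give.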

[Update 1/15/2009: I’m now in Chapter 4 of Jeff Friedl’s Mastering Regular Expressions, and by now I know of other things I’d like to test:

  • Lazy quantifiers: ??, *?, +?, {min,max}?
  • Possessive quantifiers: *+, ++, ?+, {min,max}+
  • Atomic grouping: (?>…)
  • Which kind of regex engine does the tool use: Traditional NFA, DFA, or POSIX NFA?

]
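These later items can be probed the same way. The sketch below again targets Python's re module, and its function names are invented for illustration. The possessive and atomic checks rely on the fact that such constructs refuse to give back characters, so a pattern that would match with ordinary backtracking must fail. The engine check uses an alternation test along the lines of the one in Friedl's Chapter 4: a traditional NFA takes the first alternative that works, while a DFA or POSIX NFA reports the longest match. It cannot tell the latter two apart.

```python
import re


def lazy_quantifiers():
    """A lazy +? should stop at the first '>' instead of the last."""
    try:
        m = re.search(r"<.+?>", "<a><b>")
    except re.error:
        return False
    return m is not None and m.group(0) == "<a>"


def possessive_quantifiers():
    """a++ keeps all three a's, so the trailing 'a' can never match."""
    try:
        return re.search(r"^a++a$", "aaa") is None
    except re.error:
        # Syntax rejected: the flavor has no possessive quantifiers.
        return False


def atomic_grouping():
    """(?>a+) likewise gives nothing back once it has matched."""
    try:
        return re.search(r"^(?>a+)a$", "aaa") is None
    except re.error:
        return False


def engine_guess():
    """First-alternative-wins suggests a traditional NFA; longest match
    suggests a DFA or POSIX NFA (this test cannot separate those two)."""
    m = re.search(r"nfa|nfa not", "nfa not")
    return "Traditional NFA" if m.group(0) == "nfa" else "DFA or POSIX NFA"


if __name__ == "__main__":
    print("lazy quantifiers:      ", lazy_quantifiers())
    print("possessive quantifiers:", possessive_quantifiers())
    print("atomic grouping:       ", atomic_grouping())
    print("engine:                ", engine_guess())
```

The possessive and atomic answers depend on the Python version in use (the re module gained both in 3.11), which is a nice illustration of why printing the result beats assuming it.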

Though I’m calling it a suite, probably a fairly monolithic single file of tests would be sufficient. It seems that a separate version of the suite would need to be made for each language, but all the same tests would be there in each version…
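One way to avoid copying the tests into every version is to keep a single table of cases and write a small adapter per tool. The sketch below is only an illustration: CASES, python_re, grep_default, and run_suite are hypothetical names, and the grep adapter assumes a grep on the PATH that follows the usual exit-status convention (0 when a line matched, 1 when none did, 2 on error).

```python
import re
import subprocess

# One shared table: (feature name, pattern, text that matches if supported).
# Each text fails to match when the metacharacter is read literally.
CASES = [
    ("plus", "^ab+c$", "abbc"),
    ("paren grouping", "^(ab)+$", "abab"),
    ("interval", "^a{2}$", "aa"),
]


def python_re(pattern, text):
    """Adapter for Python's re module."""
    try:
        return re.search(pattern, text) is not None
    except re.error:
        return False


def grep_default(pattern, text):
    """Adapter for a command-line grep in its default mode.

    Assumption: `grep` is on the PATH and exits 0 only when a line matched.
    """
    result = subprocess.run(
        ["grep", "-q", "-e", pattern],
        input=text + "\n",
        text=True,
        capture_output=True,
    )
    return result.returncode == 0


def run_suite(matcher):
    """Run every shared case through one adapter."""
    return {name: matcher(pattern, text) for name, pattern, text in CASES}


if __name__ == "__main__":
    for name, ok in run_suite(python_re).items():
        print(f"{'yes' if ok else 'no':3}  {name}")
```

Calling run_suite(grep_default) on a machine with grep installed would produce the same kind of report for that tool, and adding another flavor only means writing another two-argument adapter.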

Has anyone done something like this already, I wonder?


  1. #1 by Mat Kramer on April 20, 2009 - 10:44 am

    Did you ever find anything like this? I’m curious too.

  2. #2 by danielmeyer on April 20, 2009 - 10:56 am

    The closest thing I’ve seen so far is the asserts in Visibone’s JavaScript Regular Expressions quick reference.

  3. #3 by Chip Camden on September 9, 2009 - 4:01 pm

    Thanks for that link. I’m planning to use it to build tests for a new regex parser I’m working on.

  4. #4 by danielmeyer on September 11, 2009 - 5:09 pm

    Cool, glad it helped! Thanks for saying so.
