Regex - match a string that doesn't contain another string
at: 2:44 PM | Filed under: regular expressions
User Greg Morphis asked a question today over on the House of Fusion mailing list.
I just ran into a problem with some old code one of my teammates wrote. He used <cfqueryparam> but did not specify a cfsqltype. We upgraded our DB from Oracle 9i to 10g and all of a sudden we're getting "Error Executing Database Query" errors. The logs show "A nonnumeric character was found when expecting a numeric character." One of the queries had a date column and the cfqueryparam looked like <cfqueryparam value="foo" /> with no cfsqltype, and according to the docs its default is CF_SQL_CHAR. So I need to go through the 1321 <cfqueryparam> tags and look for ones with no cfsqltype. Is there a regex I can throw into Eclipse to help find these queries?
I tried a few things but couldn't get a good match, so I asked about it on Twitter. A few minutes later Jason Dean, of 12 Robots fame, came back to me with a great answer that worked perfectly. Here are the test strings and the regex he came up with. After that we'll dissect the regular expression.
<cfqueryparam value="foo" />
<cfqueryparam value="foo">
<cfqueryparam value="#foo#">
<cfqueryparam value="foo" cfsqltype="cf_sql_integer" />
<cfqueryparam cfsqltype="cf_sql_integer" value="foo" />
<cfqueryparam(.?[^cf_sql_type])+?>
- <cfqueryparam is a static string, matched literally
- ( begins a capturing group
- . matches any single character
- ? matches the preceding token (here, the .) 0 or 1 times
- [^cf_sql_type] square brackets make a character set, but the caret ^ inverts that. This section now matches everything BUT the string cf_sql_type
- ) ends a capturing group
- + matches the preceding token or set 1 or more times. Greedy, matching as many characters as possible
- ? when immediately following a +, it converts it to a lazy match, matching as few characters as possible
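To check the pattern against the five test strings outside of Eclipse, here is a minimal sketch in Python. Python is my assumption here (the original search was run in Eclipse), and the names PATTERN, TEST_STRINGS and flagged are made up for illustration.

```python
import re

# Pattern exactly as given in the post.
PATTERN = re.compile(r"<cfqueryparam(.?[^cf_sql_type])+?>")

# The five test strings from the post.
TEST_STRINGS = [
    '<cfqueryparam value="foo" />',
    '<cfqueryparam value="foo">',
    '<cfqueryparam value="#foo#">',
    '<cfqueryparam value="foo" cfsqltype="cf_sql_integer" />',
    '<cfqueryparam cfsqltype="cf_sql_integer" value="foo" />',
]


def flagged(lines):
    """Return the lines the pattern matches (hypothetical helper)."""
    return [line for line in lines if PATTERN.search(line)]


for line in flagged(TEST_STRINGS):
    print(line)
```

On these five strings it prints the first three, which are the ones without a cfsqltype attribute.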
By the way...if you see any errors in this regex, or have a suggestion that's more elegant or would work better, feel free to post it.
Update: On a side note, Grant Skinner has an excellent regex tester on his website.

Hi Andy. Sorry, you're mistaken as to how [^cf_sql_type] works. That matches any single character that is not any of "c,f,_,s,q,l,t,y,p,e". It does not match an absence of "cf_sql_type". The square brackets indicate a match of *any one* of the characters within them, not the sequence of characters within them. It's just coincidence that this regex works for you. It'll happily match "", for example.
However your requirement should be doable with a negative look-ahead of "cfsqltype" between a "".
HTH.
--
Adam
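Adam's point about the bracket expression can be checked with a small sketch. Python is assumed here, and the tag in HYPOTHETICAL_TAG is invented for illustration; it is not one of the test strings from the post.

```python
import re

# The bracket expression from the post, on its own.
NEGATED_CLASS = re.compile(r"[^cf_sql_type]")

# It consumes exactly one character, and only if that character is
# not one of c, f, _, s, q, l, t, y, p, e.
print(bool(NEGATED_CLASS.fullmatch("x")))   # a character outside the set
print(bool(NEGATED_CLASS.fullmatch("q")))   # a character inside the set
print(bool(NEGATED_CLASS.fullmatch("xx")))  # two characters are too many

# The full pattern from the post.
ORIGINAL = re.compile(r"<cfqueryparam(.?[^cf_sql_type])+?>")

# Hypothetical tag (not from the post): no cfsqltype attribute, but the
# variable name "type" is made entirely of characters from the set.
HYPOTHETICAL_TAG = '<cfqueryparam value="#type#">'
print(bool(ORIGINAL.search(HYPOTHETICAL_TAG)))
```

This prints True, False, False, False. The last result is the practical problem: each pass through the group must end on a character outside the set, so two set characters in a row (like the "ty" in "type") can never be consumed, and a tag that really is missing its cfsqltype goes unreported.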
Adam... Thanks for that...the problem is that I want to be able to match a string. I've used parens for grouping a string literal in a regex but it seems like it's inconsistent.
Yeah, Adam is right. In my defense, when I originally sent the regex to Andy, I did have [^(cfsqltype)] with parentheses wrapping the group, hoping that that would no longer treat them as individual characters. Unfortunately, I see now that those are ignored. I spent a little more time on it this morning, playing with negative look-ahead, which I admit I do not fully understand. But I came up with this:
This seems to work in REMatch and in Eclipse's file search. What do you think?
Oops. Let's try again. <cfqueryparam(((?!cfsqltype).))+?>
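For anyone who wants to try the negative look-ahead version outside of REMatch or Eclipse, here is a minimal sketch in Python (my assumption; the thread itself only mentions REMatch and Eclipse's file search). The names LOOKAHEAD, SAMPLE_TAGS and missing_type are hypothetical.

```python
import re

# Pattern from the comment above. (?!cfsqltype) is a zero-width check:
# before each character is consumed, it fails if the text at that
# position starts with "cfsqltype".
LOOKAHEAD = re.compile(r"<cfqueryparam(((?!cfsqltype).))+?>")

# The five test strings from the post.
SAMPLE_TAGS = [
    '<cfqueryparam value="foo" />',
    '<cfqueryparam value="foo">',
    '<cfqueryparam value="#foo#">',
    '<cfqueryparam value="foo" cfsqltype="cf_sql_integer" />',
    '<cfqueryparam cfsqltype="cf_sql_integer" value="foo" />',
]


def missing_type(lines):
    """Return the lines whose tag has no cfsqltype (hypothetical helper)."""
    return [line for line in lines if LOOKAHEAD.search(line)]


for line in missing_type(SAMPLE_TAGS):
    print(line)
```

It prints the first three test strings and skips the two that carry a cfsqltype. Two caveats, both assumptions about real code rather than anything tested in the thread: the match is case-sensitive as written, so a tag using cfSqlType would be reported unless the search is made case-insensitive, and a > inside an attribute value would end the match early.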
Argh, I used to have a regex lying around ages ago which did this very thing (well, for me it was finding <cffunction> tags without returntype attributes, but same same). However, I a) cannot find it; b) cannot, for the life of me, reconstruct it; c) cannot find anyone who's done a similar thing on Google. Maybe drop Ben Nadel a line... he's pretty good with regexes and enjoys a challenge... -- Adam