7.6 Use Parentheses to Extract Subpatterns

While parentheses let you write more complicated regular expressions, their main purpose may be to let you extract substrings from a matching string. Suppose you have a document in which negative numbers are represented by being placed inside parentheses, for example, (45.32) (2.94). All numbers have two digits to the right of the decimal point. There may or may not be digits to the left of the decimal point. Here is a pattern to match those numbers:
$LParen_$Number_$RParen_
The pattern relies on these preassigned subpatterns:
set LParen_ {\(}
set RParen_ {\)}
set Digit_ {[0-9]}
set Dot_ {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_
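As a quick check of how these pieces combine, the following minimal sketch builds the full pattern and tries it on a sample string. The variable names Pattern, Text, Found, and Match, and the sample text itself, are assumptions made for illustration and do not come from the original.

```tcl
# Subpatterns as defined above.
set LParen_ {\(}
set RParen_ {\)}
set Digit_  {[0-9]}
set Dot_    {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_

# Assemble the complete pattern (Pattern is a hypothetical name).
set Pattern $LParen_$Number_$RParen_
puts $Pattern    ;# prints \([0-9]*\.[0-9][0-9]\)

# Made-up sample text; the second number has no digits left of the point.
set Text {Losses of (45.32) and (.94) were reported}

# regexp returns 1 on a match and stores the matching substring in Match.
set Found [regexp $Pattern $Text Match]
puts "$Found $Match"    ;# prints 1 (45.32)
```

Note that regexp stops at the first match, so only (45.32) is reported here.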
Now, suppose you want to search for a parenthesized negative number, extract the nonnegative number in the parentheses, and make it negative. A variation of regexp will help: it accepts extra variable names after the string being searched and fills them with the matching substrings.
To extract the number part of the previous pattern, we need to put parentheses around it, something like this:
$LParen_($Number_)$RParen_
The variable LParen_ has been defined so that it will match a left parenthesis and not be seen as a special symbol by regexp. Unfortunately, the string shown above is a case where the left parenthesis is also a special symbol for Tcl. When interpreting the string, Tcl thinks LParen_ is being used as an array, with the parenthesized text as its index! Tcl has ways of handling this problem. The left parenthesis could be protected with a backslash, or the variable name could be delimited with curly braces. Using the second trick, the regexp command looks like this:
regexp ${LParen_}($Number_)$RParen_ $Text Junk Number
If a match is found, Junk will contain the entire matching substring,
which we do not care about, and Number will contain the desired number.
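The whole procedure can be sketched as follows. This is only an illustration under stated assumptions: the sample Text and the names Failed, Message, Negative, Count, and Result are invented here, and the final regsub step is an alternative added for comparison rather than something described above.

```tcl
# Subpatterns as defined earlier in this section.
set LParen_ {\(}
set RParen_ {\)}
set Digit_  {[0-9]}
set Dot_    {\.}
set Number_ $Digit_*$Dot_$Digit_$Digit_

# Without braces, Tcl reads $LParen_(...) as an array element.
# LParen_ is a plain variable, so the substitution raises an error.
set Failed [catch {set Bad $LParen_($Number_)$RParen_} Message]
puts "unbraced form fails: $Failed"    ;# prints unbraced form fails: 1

# Made-up sample text for this sketch.
set Text {Adjustments: (45.32) (2.94)}

# Junk receives the entire match, Number the parenthesized subpattern.
if {[regexp ${LParen_}($Number_)$RParen_ $Text Junk Number]} {
    set Negative -$Number
    puts "$Junk becomes $Negative"     ;# prints (45.32) becomes -45.32
}

# Alternative (not from the text above): regsub -all rewrites every
# occurrence, with \1 standing for the first parenthesized subpattern.
set Count [regsub -all ${LParen_}($Number_)$RParen_ $Text {-\1} Result]
puts $Result                           ;# prints Adjustments: -45.32 -2.94
```

The regexp form handles one match at a time, which suits cases where each number needs further processing; regsub -all is more convenient when the only goal is to rewrite the text.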