7.7 Search and Replace

Tcl has one more command that deals with regular expressions. It is regsub, which finds the part of STRING that matches PATTERN, substitutes REPLACE_PATTERN for it, and stores the result in a variable you name.

REPLACE_PATTERN may be a simple string you want substituted someplace inside STRING. For example,
regsub -all dog $Script cat Script
This command replaces all instances of "dog" in Script with "cat" and puts the resulting string back into Script.

As with glob and regular-expression patterns, REPLACE_PATTERN may contain special characters that alter its meaning. This is a third kind of pattern you must learn. Happily, it is much easier than the other two. There are two special characters, & and \. If REPLACE_PATTERN contains the special character &, then the & stands for the entire substring that was matched. So,
regsub -all cat $Str (&) Str
will put parentheses around all occurrences of "cat."
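Here is a small sketch of the two substitutions just described, run on a made-up string. The variable names First, All, Marked, and Count are illustrative only. It also shows two details of regsub that are easy to miss: without -all only the first match is replaced, and the command returns the number of replacements it made.

```tcl
set Script "The dog chased another dog."

# Without -all, only the first match is replaced.
regsub dog $Script cat First

# With -all, every match is replaced; regsub returns the count.
set Count [regsub -all dog $Script cat All]

# & in the replacement stands for the whole matched substring.
regsub -all dog $Script (&) Marked

puts $First
puts "$Count replacements made, giving $All"
puts $Marked
```

Writing each result into its own variable, rather than back into Script, keeps the original string around so the three results can be compared side by side.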
More complicated substitutions are possible by identifying substrings that match subpatterns of PATTERN and using those substrings to build the replacement pattern. Substrings are identified with parentheses in PATTERN the same way they are for the regexp command.

Suppose you have a line with a date in the form MONTH/DAY/YEAR and you want to put it into the form YEAR-MONTH-DAY. For example, you want "06/23/96" to become "96-06-23." I am simplifying this problem a little by assuming that the month, the day, and the year all contain exactly two digits. Using the preassigned pattern, Digit_, a date has this pattern:
($Digit_+)/($Digit_+)/($Digit_+)
and the three sets of parentheses identify the numbers you need. This use of regexp would extract the month, day, and year.
regexp "($Digit_+)/($Digit_+)/($Digit_+)" $Line \
Junk Month Day Year
The replacement string you want is
$Year-$Month-$Day
However, you cannot write it that way when using regsub. You do not get to name the variables that match the parentheses. Instead, the substrings that match subpatterns are represented in REPLACE_PATTERN with \1, \2, and so on. The substring matched by the leftmost parentheses is represented with \1, the next with \2, and so on to the right. Here is the complete regsub command for the date-transforming example.
regsub "($Digit_+)/($Digit_+)/($Digit_+)" $Line \
\\3-\\1-\\2 Line
Note: REPLACE_PATTERN is not a regular expression, and I do not have
any style rules for writing it. As shown here, REPLACE_PATTERN is
interpreted first by the Tcl interpreter and then by regsub. The first
interpretation replaces each \\ with \. The second replaces each
\i with the corresponding matched substring.
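The date example can be tried out with the sketch below. Two things in it are assumptions: Digit_ is defined elsewhere, so here it is simply set to a pattern for one decimal digit, and the sample Line is made up. The sketch also shows a braced replacement, which Tcl passes to regsub untouched, so single backslashes suffice there.

```tcl
# Assumption: stand-in definition for the preassigned Digit_ pattern.
set Digit_ {[0-9]}
set Line "Shipped on 06/23/96 by rail."

# regexp lets you name the variables that receive each piece.
regexp "($Digit_+)/($Digit_+)/($Digit_+)" $Line Junk Month Day Year
puts "$Year-$Month-$Day"

# Unbraced replacement: Tcl first turns each \\ into \,
# then regsub expands \3, \1, and \2.
regsub "($Digit_+)/($Digit_+)/($Digit_+)" $Line \\3-\\1-\\2 Unbraced

# Braced replacement: Tcl leaves the backslashes alone.
regsub "($Digit_+)/($Digit_+)/($Digit_+)" $Line {\3-\1-\2} Braced

puts $Unbraced
puts $Braced
```

Both regsub calls should produce the same string, because regsub sees \3-\1-\2 either way.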
Finally, you should know that regsub does its own backslash quoting so that you can have a literal & or \ in your replacement patterns: as regsub sees them, \& stands for & and \\ stands for \. Also, regsub will treat \0 just like &.
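A brief sketch of these last points follows, using a made-up string and illustrative variable names A, B, and C. The replacement patterns are written in braces so that the Tcl interpreter leaves the backslashes for regsub to handle.

```tcl
set Str "cat & mouse"

# \0 behaves like &: it stands for the whole match.
regsub -all cat $Str {[\0]} A

# \& is a literal ampersand, not the matched text.
regsub -all cat $Str {\&} B

# \\ is a literal backslash.
regsub -all cat $Str {\\} C

puts $A
puts $B
puts $C
```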