7.1  Theory and Practice

 

 

Regular expressions come from a field called formal language theory. In this theory, a language is a set of strings and one's efforts are divided between defining languages, on one hand, and developing algorithms to recognize which strings belong to a defined language, on the other hand.

The word "language" is used differently in this section than in other sections. Here it simply means a set of strings – nothing more.

Although the presentation in the following sections is self-explanatory, many readers will find it helpful to give a little thought to the differences between the theory they have seen and the practice that is explained in this chapter. Here are some differences between theory and practice as they apply to regular expressions.

  1. The theory is concerned with recognizing whether a given string belongs to a language that has been defined with a regular expression. The practice is not concerned with whether the given string belongs to the language, but with determining whether a substring does. If such a substring exists, it is said to match the regular expression. In any case, the regular expression is said to be a pattern which a substring may match.

  2. When multiple substrings can be found in the given string, it is often important to know which substring the finite automaton will find. This problem simply does not arise in the theoretical setting.

  3. The theory is presented with typographical tricks to distinguish between symbols of the underlying language and symbols of the regular expressions that define the language. The practice (so far) lives in a world where we are restricted to what we see on a typewriter keyboard.

  4. The theory assumes the strings of the language are made from a set of symbols about which nothing further need be said. In practice, it is necessary to say something about the symbols. Partly this is because of point 3, partly this is because we need a way of representing characters that do not appear on the keyboard or the screen, and partly this is because we need a shorthand for representing important subsets of the symbols, for example, the alphabetic letters.
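The differences listed above can be made concrete with a small sketch. It uses Python's standard re module purely for illustration; the choice of language, the patterns, and the sample strings are assumptions made for this example and are not taken from the chapter. Other regular expression engines may choose a different substring in the situation described in point 2.

```python
import re

# Illustrative pattern: an "a" followed by zero or more "b"s.
pattern = re.compile(r"ab*")

# Point 1: theory asks whether the WHOLE string belongs to the language;
# practice asks whether SOME substring does.
whole = pattern.fullmatch("xxabbyy")   # None: the whole string is not in the language
part = pattern.search("xxabbyy")       # a match object covering the substring "abb"

# Point 2: both "a" (at index 0) and "abbb" (at index 2) are candidate
# substrings. Python's engine reports the leftmost one, not the longest.
first = pattern.search("a-abbb")

# Points 3 and 4: with only keyboard characters available, a backslash
# marks a symbol as literal, escapes stand for characters that cannot be
# typed or seen, and bracket expressions abbreviate subsets of symbols.
literal_star = re.search(r"a\*", "2a*b")        # "*" taken as an ordinary symbol
tab = re.search(r"\t", "a\tb")                  # a tab character
letters = re.search(r"[A-Za-z]+", "42 apples")  # the alphabetic letters

print(whole, part.group(), first.group())
print(literal_star.group(), tab.span(), letters.group())
```

Here fullmatch plays the role of the theoretical membership question, while search plays the role of the practical question about substrings.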

Remark

Often, when we see theory in class and a related-but-different practice in the real world, we decide that the theory is not very useful. Sometimes that is true. This is not one of those times. Without formal language theory, this chapter would have nothing to say.

What makes the theory seem irrelevant is that it must be presented in the classroom as simply as possible. Imagine what studying the theory would have been like if you had needed to deal with the complications listed above. Be glad that others have dealt with them for you.

 

 
