Learning Resources

Regular Expressions

A regular expression provides a concise and flexible means to "match" (specify and recognize) strings of text, such as particular characters, words, or patterns of characters. Common abbreviations for "regular expression" include regex and regexp.

The following are examples of specifications which can be expressed as a regular expression:

  •     the sequence of characters "car" appearing consecutively, such as in "car", "cartoon", or "bicarbonate"
  •     the word "car" when it appears as an isolated word (delimited from other words, typically by whitespace characters)
  •     the word "car" when preceded by the word "motor" (and separated from it by a specified delimiter, or by several)
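
The three specifications above can be sketched with Python's standard re module. This is a minimal illustration, not the only way to write these patterns: the sample strings and the helper name matching are made up for this example, \b is used as an approximation of "isolated word" (it marks a boundary between word and non-word characters, not only whitespace), and whitespace is assumed to be the delimiter in the third case.

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
samples = [
    "car",
    "cartoon",
    "bicarbonate",
    "my car is red",
    "motor car",
    "motor  car",
    "motorcar",
]

# 1. The characters "car" appearing consecutively, anywhere in the string.
anywhere = re.compile(r"car")

# 2. "car" as an isolated word. \b is a word boundary, so "cartoon" and
#    "motorcar" do not qualify.
isolated = re.compile(r"\bcar\b")

# 3. "car" preceded by "motor", separated by one or more whitespace
#    characters (assumption: whitespace is the delimiter).
after_motor = re.compile(r"\bmotor\s+car\b")


def matching(pattern, strings):
    """Return the strings in which the pattern occurs at least once."""
    return [s for s in strings if pattern.search(s)]


for name, pattern in [("anywhere", anywhere),
                      ("isolated", isolated),
                      ("after_motor", after_motor)]:
    print(name, matching(pattern, samples))
```

Each pattern is progressively stricter: every sample contains the characters "car", fewer contain it as a separate word, and only the two "motor car" variants satisfy the third pattern.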

A regular expression, often called a pattern, is an expression that specifies a set of strings. Rules are often a more concise way to specify such a set than a list of its members. For example, the set containing the three strings "Handel", "Händel", and "Haendel" can be specified by the pattern H(ä|ae?)ndel (equivalently, the pattern is said to match each of the three strings). In most formalisms, if at least one regex matches a particular set, then infinitely many such expressions exist. Most formalisms provide the following operations to construct regular expressions.
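
Before turning to those operations, the Handel example can be checked with a short sketch using Python's re module. It assumes Python's regex syntax, in which the pattern from the text is valid as written. The alternative spellings of the pattern, the list of candidate strings, and the helper name accepted are invented here purely to illustrate that different expressions can specify the same set.

```python
import re

# The pattern from the text, plus two hypothetical rewrites that are
# intended to specify the same three-string set.
patterns = [
    r"H(ä|ae?)ndel",
    r"H(a|ä|ae)ndel",
    r"H(?:ä|a(?:e)?)ndel",
]

# Candidate strings: the three members, then some near misses.
candidates = ["Handel", "Händel", "Haendel",
              "Hendel", "Hndel", "Haeendel", "handel"]


def accepted(pattern, strings):
    """Return the strings that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [s for s in strings if compiled.fullmatch(s)]


for p in patterns:
    print(p, accepted(p, candidates))
```

fullmatch is used rather than search so that a string counts as a member of the set only when the whole string fits the pattern, which is the sense of "matches" used in the paragraph above.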