Regular Expressions

 

 

What's covered?

See Useful Links to find sites that cover regular expressions in full.

What Are Regular Expressions?

Regular expressions are patterns that describe a set of text strings, and are used where an ordinary search will not do. For example, you know a hyperlink starts with <a HREF and ends with .htm"> but what's in the middle can vary. A regular expression can be written to find all such hyperlinks.
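The hyperlink case above can be sketched in a few lines. This is a minimal illustration that assumes Python's re module as the engine (this page does not name one); the sample HTML and file names are made up.

```python
import re

# Hypothetical sample text; the file names are invented for illustration.
html = '<p><a HREF="intro.htm">Intro</a> and <a HREF="setup/install.htm">Install</a></p>'

# <a HREF    the literal start of the link
# [^>]*?     any characters other than ">", as few as possible
# \.htm">    the literal end of the link (the full stop is escaped)
link_pattern = re.compile(r'<a HREF[^>]*?\.htm">')

links = link_pattern.findall(html)
for link in links:
    print(link)
```

The part in the middle is written as "anything except >" so that a match cannot run on past the end of one tag into the next.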

This topic simply sets out some of the characters used, in table form.

Anchors

Character

Example

Matches

^

^Word

The match must occur at the beginning of the line.
In the example, the pattern matches strings where the line starts with "Word".
Do not confuse with [^] below.

$

Word$

The match must occur at the end of the line.
In the example, the pattern matches strings where the line ends with "Word".

\<

\<Word

The match must occur at the beginning of a word.
In the example, the pattern matches words that start with "Word".
Not every regular expression engine supports this anchor.

\>

Word\>

The match must occur at the end of a word.
In the example, the pattern matches words that end with "Word".
Not every regular expression engine supports this anchor.

\b

\bWo

rd\b

Defines a word boundary.
The first example matches "Wo" at the start of any word in the text.
The second example matches "rd" at the end of any word in the text.

\B

\BWo

rd\B

Ensures the match is not on a word boundary.
The first example matches "Wo" inside a word, but not at the start of a word.
The second example matches "rd" inside a word, but not at the end of a word.
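The anchors in the table above can be tried out as follows. This is a minimal sketch that assumes Python's re module; the sample text and variable names are made up. Python's re has no \< or \> anchors, so \b is used for both word starts and word ends.

```python
import re

# By default ^ and $ anchor to the whole string in Python;
# pass re.MULTILINE to make them anchor to each line instead.
starts_with_word = re.search(r"^Word", "Word count") is not None
not_at_start = re.search(r"^Word", "The Word") is None
ends_with_word = re.search(r"Word$", "The Word") is not None

# Hypothetical sample text, all lowercase to keep the patterns simple.
sample = "word sword wonder two lord order"

# \b : "wo" at the start of a word, "rd" at the end of a word.
start_wo = re.findall(r"\bwo\w*", sample)
end_rd = re.findall(r"\w*rd\b", sample)

# \B : "wo" not at the start of a word, "rd" not at the end of a word.
inner_wo = re.findall(r"\b\w*\Bwo\w*", sample)
inner_rd = re.findall(r"\b\w*rd\B\w*", sample)

print(start_wo, end_rd, inner_wo, inner_rd)
```

The extra \w* on each side is only there so that findall returns the whole word rather than just the two letters that were anchored.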

Pattern Matching

Character

Example

Matches

?

L?

Matches the preceding character zero or one time. In the example, the pattern matches a single "L" if present, otherwise an empty string.

.

..

Matches any single character except a new line. In the example, the pattern matches any two consecutive characters, beginning with the first two on the line.

\

\d

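Used to escape the character that follows. Before a special character it gives the literal character (for example, \. matches a full stop); before certain letters it forms a special sequence.
In the example, the pattern matches a digit.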

\d

 

Used to match any single digit between 0 and 9.

\D

 

Used to match any non-digit.

\s

 

Used to match any single white-space character.

\S

 

Used to match any single non-white-space character.

\w

 

Used to match any single letter, digit or underscore.

\W

 

Used to match any single character other than a letter, digit or underscore.

\xnn

\x41

Used to match the ASCII character represented by the hexadecimal number nn. The ASCII code for 'A' is 65 in decimal, which is 41 in hexadecimal.
The example matches the first occurrence of an A.

[]

[aeiou]

Used to match any one of the enclosed characters. The example matches the first lowercase vowel.

[^]

[^aeiou]

Used to match any single character that is not one of the enclosed characters. The example matches the first character that is not a lowercase vowel.

Do not confuse with ^ above.

[c-c]

[a-z]

Used to match any single character in a range. The example matches the first lowercase letter.

()

(em)

Round brackets (parentheses) are used for grouping. The text matched by the group is stored and may be referred to later, either in the pattern or in the replacement text. This technique is called backreferencing.

|

a|e

Matches the expression on either side of the operator. In the example, the first match of either an a or an e is found.
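The character classes, grouping and alternation from the table above can be tried out as follows. This is a minimal sketch that assumes Python's re module; the sample strings and variable names are made up.

```python
import re

# Hypothetical sample text.
sample = "Room 42, Block A"

digits = re.findall(r"\d", sample)            # every single digit
spaces = re.findall(r"\s", sample)            # every white-space character
words = re.findall(r"\w+", sample)            # runs of letters, digits or underscores
hex_a = re.search(r"\x41", sample).group()    # hexadecimal 41 is the letter A

vowels = re.findall(r"[aeiou]", sample)                   # any one enclosed character
first_non_vowel = re.search(r"[^aeiou]", sample).group()  # anything not enclosed
first_lower = re.search(r"[a-z]", sample).group()         # a range of characters

either = re.search(r"a|e", "regular").group()  # first a or e, whichever comes first

# (em) stores "em"; \1 then requires the same text again (a backreference).
repeated = re.search(r"(em)\1", "remember")

print(digits, spaces, words, hex_a)
print(vowels, first_non_vowel, first_lower, either)
print(repeated.group(), repeated.group(1))
```

Note that [aeiou] is case-sensitive here, so the capital A at the end of the sample is not counted as a vowel.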

Quantifiers

Character

Example

Matches

*

L*

Matches zero or more times.
In the example, the pattern matches a string of L's if they are present, otherwise an empty string.

+

L+

Matches one or more times.
In the example, the pattern matches a string of L's. If no L's are present, a match will not be found.

{n}

w{3}

Used to match exactly n occurrences of the previous character.
The example matches "www" wherever three w's are found together in the string.

{n,}

w{3,}

Used to match n or more occurrences of the previous character.
The example matches a run of three or more w's, such as "www" or "wwww".

{m,n}

w{3,5}

Used to match between m and n occurrences of the previous character. The example tries to match between three and five w's.
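The quantifiers above, together with ? from the Pattern Matching table, can be compared side by side. This is a minimal sketch that assumes Python's re module; the sample strings and variable names are made up.

```python
import re

# ? : zero or one "L" at the start of the string.
optional_hit = re.match(r"L?", "LLama").group()
optional_empty = re.match(r"L?", "lama").group()

# * : zero or more, so an empty match is still a match.
star_hit = re.match(r"L*", "LLLama").group()
star_empty = re.match(r"L*", "ama").group()

# + : one or more, so no "L" means no match at all.
plus_hit = re.search(r"L+", "LLLama").group()
plus_miss = re.search(r"L+", "ama")

# {n}, {n,} and {m,n} applied to a run of seven w's.
run = "wwwwwww"
exactly_three = re.search(r"w{3}", run).group()
three_or_more = re.search(r"w{3,}", run).group()
three_to_five = re.search(r"w{3,5}", run).group()
too_short = re.search(r"w{3}", "ww")

print(repr(optional_hit), repr(optional_empty))
print(repr(star_hit), repr(star_empty))
print(repr(plus_hit), plus_miss)
print(exactly_three, three_or_more, three_to_five, too_short)
```

The quantifiers here are greedy: each takes as many characters as its upper limit allows, which is why {3,5} returns five w's rather than three.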


Topic Revisions

Date

Changes to this page

20 Feb 2017

Topic reviewed. No changes made.

08 Aug 2004

Brief introduction added.