
Module 2: Regex Syntax in Detail (Part 2)


What happens if we want to match a special character from within a character class, for example, the caret, the dollar sign, the period, the opening square bracket, or the backslash? These characters are called metacharacters. The simplest way to match them literally is to escape them by adding a backslash before the character.
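As a minimal sketch of the escaping described above, the following uses Python's built-in re module. The variable names and the sample string are made up for illustration. Some of these characters do not strictly need escaping inside a class in every engine, but escaping them is the simple, safe habit the text recommends.

```python
import re

# A character class that matches any one of: ^ $ . [ \
# Each metacharacter is escaped with a backslash inside the class.
special = re.compile(r"[\^\$\.\[\\]")

# Hypothetical sample text containing each of the special characters once.
sample = r"a^b$c.d[e\f"

found = special.findall(sample)
print(found)
```

Only the five escaped characters are picked out of the sample; the letters around them are ignored.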

Character ranges generally follow the character order of the locale. To see what this means, let's look at the ASCII character table. If you use capital A to capital P as the range, all is fine: the correct group of characters is selected. However, if I change the range to go from capital A to lowercase d, the characters that sit between capital Z and lowercase a in the table (the opening square bracket, backslash, closing square bracket, caret, underscore, and backtick) are also selected. This is just a simple example, but be careful when you define character ranges in a character class.
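The range pitfall can be checked with a short sketch in Python. One assumption to note: Python's re module orders ranges by code point rather than by locale, which coincides with the ASCII table for the characters used here. The sample string is invented for the demonstration.

```python
import re

upper_to_p = re.compile(r"[A-P]")        # capital A through capital P
upper_to_lower_d = re.compile(r"[A-d]")  # capital A through lowercase d

# Hypothetical sample: two capitals, the six ASCII characters that sit
# between 'Z' and 'a', and three lowercase letters.
sample = "AZ[\\]^_`ade"

print(upper_to_p.findall(sample))        # only 'A' falls inside A-P
print(upper_to_lower_d.findall(sample))  # punctuation between Z and a is included
```

The second pattern matches everything in the sample except the final e, including the six punctuation characters that were probably not intended.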


Another thing to be wary of is the order in which you give the characters that define a range. For example, if I define my range as zero to nine, there is no problem, as it is a valid range. However, if I define the range as nine to zero, the behavior is not well defined: depending on the regex engine, the pattern may be rejected as invalid or an error may be thrown. So be careful about the character order when you define a range.
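Here is a small sketch of how one engine handles a reversed range. In Python's re module, compiling such a pattern raises re.error. Other engines may behave differently. The helper name is_valid_pattern is hypothetical and exists only for this example.

```python
import re

def is_valid_pattern(pattern):
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for malformed patterns, including reversed ranges.
        return False

print(is_valid_pattern(r"[0-9]"))  # ascending range
print(is_valid_pattern(r"[9-0]"))  # reversed range
```

Wrapping the compile step like this is one way to catch a bad range early instead of letting the error surface at match time.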

In some card games the Joker is a wild card that can stand in for any other playing card, with certain restrictions. In the same way, regular expressions have a wildcard, represented by the dot metacharacter. The dot generally matches any character, whether alphabetic, numeric, whitespace, punctuation, or any other symbol, except the newline character. When used within a character class, the dot loses its special meaning and matches just a literal dot. Here, [a.c] matches only a, a dot, or c, but if I remove the square brackets, a.c matches abc, adc, and so on.
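The difference between the dot as a wildcard and the dot inside a character class can be sketched in Python as follows. The variable names and test strings are illustrative only.

```python
import re

wildcard = re.compile(r"a.c")         # dot is a wildcard here
literal_class = re.compile(r"[a.c]")  # dot is a literal dot here

# Outside a class, the dot stands in for any single character except newline.
print(bool(wildcard.fullmatch("abc")))
print(bool(wildcard.fullmatch("adc")))
print(bool(wildcard.fullmatch("a\nc")))

# Inside a class, only 'a', '.', or 'c' match, one character at a time.
print(literal_class.findall("abc."))
```

The letter b in the last sample is skipped because, inside the brackets, the dot no longer stands in for other characters.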

Using a dot as a wildcard within a character class would be pointless anyway, since the class would then match everything. This is the end of part one of module two. In the next part, we will learn about quantifiers, the alternation metacharacter, sub-patterns, and grouping.


© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy