
Module 2: Regex Syntax in Detail (Part 1)


Let's start with literal characters. A regular expression matches a broad or a specific text pattern and is read from left to right. The first thing to recognize when using a regular expression is that everything is essentially a character, and we are writing a regex to match a specific sequence of characters.

Regular expressions can contain both special and literal characters. Literal characters simply match themselves: the uppercase letters A to Z (ABCDEFGHIJKLMN...XYZ), the lowercase letters a to z (abcdefghijklmnopq...xyz), and the digits 0 to 9 (0123456789). They are the simplest regular expressions. In practice, when all you need is to find a fixed piece of text, an ordinary string search function is usually more appropriate than a regex.
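To make the idea of literals concrete, here is a minimal sketch using Python's re module. The sample text and variable names are made up for illustration and do not come from the tutorial.

```python
import re

# Hypothetical sample text, chosen only for this illustration.
text = "Order ABC123 shipped; order abc999 pending."

# Literal characters match themselves, in order, from left to right.
upper_hits = re.findall(r"ABC", text)   # uppercase literals
digit_hits = re.findall(r"123", text)   # digit literals
lower_hits = re.findall(r"abc", text)   # matching is case-sensitive by default

print(upper_hits, digit_hits, lower_hits)

# For fixed text like this, a plain substring test does the same job.
contains_abc = "ABC" in text
print(contains_abc)
```

Each pattern finds exactly one occurrence here, because uppercase ABC and lowercase abc are different literal sequences.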

You may now wonder about matching control characters such as newline, carriage return, or tab. You can add those to your regular expression in the same way as literals, using escape sequences: \n matches a newline if one exists in the input string, \r matches a carriage return, \t matches a tab, \a matches a bell character, and \f matches a form feed. Be careful with \b: in most regex flavors it matches a backspace only inside a character class, as in [\b], while outside a class it means a word boundary. The exact set of supported escapes varies slightly between regex flavors. Now let's see one very simple demo for a newline character.

RegEx Syntax

This is another good online regular expression testing tool. First, define the input text, for example: test one. Now I press the Enter key, which inserts a newline, type another word on the next line, and then another word on the line after that. What happens if I write \n in the regular expression? As you can see, \n matches each newline in the input text.
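The same demo can be reproduced outside an online tester. The sketch below uses Python's re module; the three-line input is an assumed stand-in for the text typed in the demo, and the backspace lines illustrate Python's behavior specifically.

```python
import re

# Assumed stand-in for the demo input: three lines separated by newlines.
text = "test one\ntest two\ntest three"

# \n in the pattern matches each newline character in the input.
newlines = re.findall(r"\n", text)
print(len(newlines))

# Splitting on the same pattern recovers the individual lines.
lines = re.split(r"\n", text)
print(lines)

# In Python, \b outside a character class is a word boundary,
# while [\b] inside a class matches a literal backspace character.
words = re.findall(r"\btest\b", text)
backspaces = re.findall(r"[\b]", "ab\bc")
print(words, backspaces)
```

Other regex flavors may treat some of these escapes differently, so it is worth checking the documentation of the engine you use.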

Let's move to the next concept. A character class is a group of one or more characters. With a character class, you can tell the regex engine to match only one out of several characters. Simply place the characters you want to match between square brackets. For example, the pattern [ABCDEFGH] will match any single character from A to H. Some characters also have a special meaning within a regular expression; these are called metacharacters.

In this module, we will go through each of the metacharacters to see exactly what they do and how you can use them to build effective regular expressions. You can add individual characters to a character class, or you can use the dash character to indicate a range of characters. Using a character class is a way to say that any of these characters is considered a match. For example, the pattern [A-H] says: match any single character from A to H, the same as the previous pattern. You can also combine several ranges and literals in one character class. Another great feature is that we can invert the behavior of a character class by negating it. We do this by adding a caret (^) as the first character in the class, like this: [^abcdefgh]. The class now means: match any character except those contained in the class.
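Here is a small sketch of listed characters, ranges, combined ranges, and negation, again using Python's re module. The input strings and the combined class [A-Ca-c0-2] are illustrative assumptions, not taken from the tutorial.

```python
import re

# Hypothetical input, chosen only for this illustration.
text = "ABCDEFGHIJKL abc 2024"

# Listing the characters one by one and using a range are equivalent.
explicit = re.findall(r"[ABCDEFGH]", text)
ranged = re.findall(r"[A-H]", text)
print(explicit == ranged)

# Several ranges can be combined in one class.
combined = re.findall(r"[A-Ca-c0-2]", text)
print(combined)

# A caret as the first character negates the class.
negated = re.findall(r"[^A-H]", "ABCXYZ")
print(negated)
```

Note that each match is a single character: a character class always consumes exactly one character of the input.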

Let's perform one demo to understand this concept. I have defined the input text, and I want to match the words that end with the characters at, such as bat, cat, and dat. Now let's write the regular expression. First, define the character class with a range, for example a to f, followed by at, which gives [a-f]at. As you can see, this regex now matches the words bat, cat, and dat. What happens if I write the word fat? This word also matches, because its first letter falls within the range and the rest satisfies the at condition. What happens if I write the word mat and, in the same way, the word nat?

RegEx Syntax in Details

So why are these words not highlighted? Because m and n do not fall within the [a-f] range. What happens if I write only b in the class, giving [b]at? As you can see, this time the regex matches only the word bat.
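The demo so far can be sketched in Python as follows. The input string is an assumed reconstruction of the words used in the demo.

```python
import re

# Assumed reconstruction of the demo input.
text = "bat cat dat fat mat nat"

# One letter from a to f, followed by the literal characters "at".
range_hits = re.findall(r"[a-f]at", text)
print(range_hits)

# A class containing only b matches just the word bat.
single_hits = re.findall(r"[b]at", text)
print(single_hits)
```

The words mat and nat are skipped by the first pattern because m and n lie outside the a to f range.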

What happens if I add the caret here, giving [^b]at? Now the result is the opposite: this time the regex matches all the words ending with at except bat.

Let's take another example. This time we have a price list from a real estate website, and we want to match the prices in the list, which means matching the digits and the dollar sign. I write the regex by opening a character class, adding the dollar sign, and then defining the digit range 0 to 9: [$0-9].

As you can see, this regex now matches only the digits and the dollar sign. What happens if I negate the class by adding the caret? The result is completely inverted: [^$0-9] matches any character except those in the character class.
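The price example can be sketched the same way. The listing text below is invented for illustration; the real price list from the demo is not available here.

```python
import re

# Invented listing text standing in for the real estate price list.
prices = "Villa: $450000, Condo: $129900"

# Inside a character class, $ is a literal dollar sign.
in_class = "".join(re.findall(r"[$0-9]", prices))
print(in_class)

# The negated class keeps everything except dollar signs and digits.
out_class = "".join(re.findall(r"[^$0-9]", prices))
print(out_class)
```

Joining the single-character matches makes it easy to see that the two classes split the input into complementary sets of characters.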


© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy