
Module 1: Regular Expressions for Beginners


A regular expression is a miniature language for matching complex text patterns. Regular expressions can look mysterious at first, but they are a powerful tool that requires only a small time investment to learn, and they are worth having in your toolbox when used correctly and appropriately. In this course, the major topics we will cover include what exactly regular expressions are and why we can’t ignore them.

Then we will learn the difference between a simple string search and a regular expression. We will look at PDF Multitool, a tool provided by ByteScout. Please note that we will run and test all our regular expressions with this utility throughout the course. Next, we will see how a regular expression engine works internally, and then we will learn about the important metacharacters, character classes, quantifiers, and anchors.

Then, in the last module, we will see some real-world use cases of regular expressions. By the end of this course, you will have a sound knowledge of regular expression fundamentals.

Introduction to Regular Expressions

According to Wikipedia, a regular expression is a sequence of characters that defines a search pattern. You can say it is a special text string for describing a search pattern. It is a pattern that the regular expression engine attempts to match in the input text. Regular expressions are also known in short form as Regex or Regexp. I prefer Regex because it is easy to pronounce. You are probably familiar with wildcard notation, such as *.txt to find all text files in a file manager. But you can do much more with regular expressions.

A basic understanding of regular expressions will make your life a whole lot easier. Of course, we will go into the details of Regex later. But as a brief introduction, I have defined two Regex examples. In the first example, I want to match both the words Byte and Bite. That is, it finds both spellings of the word in one operation instead of two.

This is how it differs from a normal string search. In the second Regex, which is an example of a complex pattern, I want to match a password against some rules: the password must be 6 to 12 characters long, it must contain at least one uppercase letter, it must contain at least one lowercase letter, etc.
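To make the two examples concrete, here is a minimal sketch in Python. Both patterns are my own assumptions written to fit the descriptions above, not necessarily the exact patterns used in this course, and the helper name `is_valid_password` is hypothetical. The password pattern covers only the three rules listed; the "etc." rules are left out.

```python
import re

# Assumed pattern: "B", then either "y" or "i", then "te".
byte_pattern = re.compile(r"B[yi]te")

text = "Byte and Bite are two spellings; Bute is not one of them."
# One search operation finds both spellings.
spellings = byte_pattern.findall(text)

# Assumed password pattern:
#   (?=.*[A-Z])  lookahead: at least one uppercase letter somewhere
#   (?=.*[a-z])  lookahead: at least one lowercase letter somewhere
#   .{6,12}      between 6 and 12 characters in total
password_pattern = re.compile(r"(?=.*[A-Z])(?=.*[a-z]).{6,12}")


def is_valid_password(candidate: str) -> bool:
    """Return True if the whole candidate satisfies the rules above."""
    return password_pattern.fullmatch(candidate) is not None
```

Here `fullmatch` is used so that the length rule applies to the entire password rather than to a part of it.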


Any non-trivial regex looks ugly to anybody who is not familiar with regular expressions. But with this course, I’m sure you will soon be able to craft your own regular expressions as if you had never done anything else. Now, you might think that you can do the same thing without Regex, using find string, replace string, or any other kind of substring search.

How does a normal string search differ from Regex? Generally, string searching involves a source string that we want to search in and another string that we want to find in the source string. Let’s call the second one the find word. The goal is to find one or more occurrences of the find word within the source string. Here, one might request the first occurrence of “to”, which is at the 11th word; all occurrences, of which there are four; or the last occurrence of “to”, which is at the 21st position.

Though there are many algorithms for string search, a simple one works like this: check for an occurrence of the find word at each position, one by one, in nested loops.

First, we check whether a copy of the find word starts at the first character of the source string. If not, we check whether a copy starts at the second character. If not, we look at the third character, and so on, until we find a match.
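The nested-loop search described above can be sketched in Python as follows. This is an illustration of the naive approach only, not how any particular library implements searching; the function name `find_all` and the sample sentence are hypothetical.

```python
def find_all(source: str, find_word: str) -> list[int]:
    """Return every index in source where a copy of find_word starts."""
    positions = []
    if not find_word:
        return positions
    # Outer loop: try each possible starting character of the source string.
    for start in range(len(source) - len(find_word) + 1):
        # Inner loop: compare the find word character by character.
        matched = True
        for offset, char in enumerate(find_word):
            if source[start + offset] != char:
                matched = False
                break
        if matched:
            positions.append(start)
    return positions


# Hypothetical source string, chosen only for illustration.
source = "I want to go to the shop to buy tomatoes"
positions = find_all(source, "to")
first, last = positions[0], positions[-1]
```

The list gives all occurrences, and its first and last entries answer the "first occurrence" and "last occurrence" requests. Note that this search counts the find word inside longer words too, such as "tomatoes".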

Consider this situation: assume that I have a mailing list containing names that sometimes include a title such as Mr., Mrs., or Miss, along with the first name and last name. Now, I want to remove the title, that is, replace it with the empty string. With a normal string search, I need to call the replace function three times, once for each title.


With Regex, a single regular expression pattern matches any occurrence of Mr. (with the dot), Mrs., Miss, etc. A call to the Regex.Replace method then replaces the matched string with an empty string. In other words, it removes the title from the original string.
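Here is a rough side-by-side sketch of the two approaches in Python. It assumes Python's `re.sub` as a stand-in for the Regex.Replace method mentioned above, and the pattern is my own guess at one that fits the description; the function names are hypothetical.

```python
import re


def remove_title_plain(name: str) -> str:
    """Plain string approach: one replace call per title."""
    for title in ("Mr. ", "Mrs. ", "Miss "):
        name = name.replace(title, "")
    return name


# Assumed pattern: a title at the start of the name.
#   ^                anchor at the start of the string
#   (?:Mrs?\.|Miss)  "Mr." or "Mrs." (with a literal dot), or "Miss"
#   \s+              the whitespace that follows the title
TITLE_PATTERN = re.compile(r"^(?:Mrs?\.|Miss)\s+")


def remove_title_regex(name: str) -> str:
    """Regex approach: a single substitution covers every title."""
    return TITLE_PATTERN.sub("", name)


names = ["Mr. John Smith", "Mrs. Jane Doe", "Miss Ann Lee", "Alan Turing"]
cleaned = [remove_title_regex(name) for name in names]
```

Adding another title means extending one pattern rather than adding another replace call.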
