Module 1: Regular Expressions for Beginners
A regular expression is a miniature language for matching complex text patterns. Regular expressions look mysterious at first, but they are a powerful tool that takes only a small time investment to learn. Used correctly and appropriately, they are well worth having in your toolbox. In this course, the major topics we will cover include what exactly regular expressions are and why we can’t ignore them.
Then we will learn the difference between a simple string search and a regular expression. We will look at a tool provided by ByteScout, PDF Multitool. Please note that we will write and test all our regular expressions with this utility throughout the course. Then we will see how a regular expression engine works internally, and after that we will learn regular expression syntax: the important metacharacters, character classes, quantifiers, and anchors.
Then, in the last module, we will see some real-world use cases of regular expressions. By the end of this course, you will have a sound knowledge of regular expression fundamentals.
Introduction to Regular Expressions
According to Wikipedia, a regular expression is a sequence of characters that defines a search pattern. You can say it is a special text string for describing a search pattern. It is a pattern that the regular expression engine attempts to match in the input text. Regular expressions are also known by the short forms Regex or Regexp. I prefer Regex because it is easy to pronounce. You are probably familiar with wildcard notation, such as *.txt to find all text files in a file manager. But you can do much more with regular expressions.
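As a rough illustration of the wildcard comparison, the sketch below filters a made-up list of file names twice: once with the *.txt wildcard and once with a regular expression that expresses the same rule. Python is used here only for illustration, and the file names are invented for the example.

```python
import fnmatch
import re

# Hypothetical file names, invented for this example.
filenames = ["notes.txt", "report.pdf", "todo.txt", "txt_backup.zip"]

# Wildcard notation: * stands for any run of characters.
wildcard_hits = [name for name in filenames if fnmatch.fnmatchcase(name, "*.txt")]

# The same rule as a regex: any characters, a literal dot, then "txt".
txt_pattern = re.compile(r".*\.txt")
regex_hits = [name for name in filenames if txt_pattern.fullmatch(name)]

print(wildcard_hits)
print(regex_hits)
```

Both lists contain the same two .txt names. The difference is that the regex can be tightened or extended in ways the wildcard cannot.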
A basic understanding of regular expressions will make your life a whole lot easier. Of course, we will go into the details of Regex later. But just as a short introduction, I have defined two Regex examples. In the first example, I want to match both the word Byte and the word Bite. The regex finds both spellings in one operation instead of two.
This is how it differs from a normal string search. In the second regex, which is an example of a complex pattern, I want to match a password against some rules: the password must be six to 12 characters long, it must have at least one uppercase letter, it must have at least one lowercase letter, and so on.
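The exact pattern used for the first example is not reproduced in this text, so the following is only a minimal sketch of one way to write it in Python. The character class `[yi]` accepts either letter in the second position, and the sample sentence is invented.

```python
import re

# Sample text invented for this example.
text = "One Byte is eight bits. Some people misspell it as Bite."

# [yi] matches exactly one character, either "y" or "i".
spelling_pattern = re.compile(r"B[yi]te")
matches = spelling_pattern.findall(text)
print(matches)  # ['Byte', 'Bite']

# A plain string search needs one test per spelling.
found_plain = ("Byte" in text, "Bite" in text)
```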
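The password pattern itself is not shown in this text, and the rule list ends with "etc.", so the sketch below covers only the three rules that are named: length six to 12, at least one uppercase letter, and at least one lowercase letter. The pattern and the helper name `is_valid_password` are assumptions made for illustration, written in Python.

```python
import re

# (?=.*[A-Z]) looks ahead for at least one uppercase letter.
# (?=.*[a-z]) looks ahead for at least one lowercase letter.
# .{6,12}     then consumes between 6 and 12 characters.
PASSWORD_PATTERN = re.compile(r"(?=.*[A-Z])(?=.*[a-z]).{6,12}")

def is_valid_password(candidate: str) -> bool:
    """Return True if the whole candidate satisfies the three rules."""
    return PASSWORD_PATTERN.fullmatch(candidate) is not None

print(is_valid_password("Secret1"))        # True
print(is_valid_password("secret1"))        # False: no uppercase letter
print(is_valid_password("Ab1"))            # False: too short
print(is_valid_password("ABCDEFGHijklm"))  # False: 13 characters
```

Each lookahead checks one rule without consuming any text, which is why several independent rules can be stacked in front of the length check.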
Any non-trivial regex looks ugly to anybody who is not familiar with regular expressions. But with this course, I’m sure you will soon be able to craft your own regular expressions as if you had never done anything else. Now, you may be thinking that you can do the same thing without Regex, using find string, replace string, or any other kind of substring search.
How does a normal string search differ from Regex? Generally, string searching involves a source string that we want to search in and another string that we want to search for. Let’s call the second one the find word. The goal is to find one or more occurrences of the find word within the source string. In this example, one might request the first occurrence of “to”, which is the 11th word; all occurrences, of which there are four; or the last occurrence of “to”, which is at the 21st position.
Though there are many algorithms for string search, a simple one works like this: check for occurrences of the find word one position at a time, using nested loops.
First, we check whether a copy of the find word starts at the first character of the source string. If not, we check whether a copy starts at the second character. If not, we look at the third character, and so on, until we find a match.
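The steps above can be sketched as two nested loops. This is a minimal Python illustration of the naive approach, not the algorithm any particular library uses. The function name `find_all` and the sample sentence are made up for the example, and the positions it reports are character offsets counted from zero.

```python
def find_all(source: str, find_word: str) -> list:
    """Return the start offset of every occurrence of find_word in source."""
    positions = []
    if not find_word:
        return positions
    # Outer loop: try every possible starting position in the source string.
    for start in range(len(source) - len(find_word) + 1):
        offset = 0
        # Inner loop: compare the find word character by character.
        while offset < len(find_word) and source[start + offset] == find_word[offset]:
            offset += 1
        if offset == len(find_word):
            positions.append(start)
    return positions

hits = find_all("to be or not to be", "to")
print(hits)                # [0, 13]
print(hits[0], hits[-1])   # first and last occurrence: 0 13
```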
Consider this situation: I have a mailing list that contains names that sometimes include a title such as Mr., Mrs., or Miss, along with the first name and last name. Now, I want to remove the title by replacing it with an empty string. With a normal string search, I need to call the replace function three times, once for each title.
With Regex, a single regular expression pattern matches any occurrence of Mr, Mr with a dot, Mrs, Miss, etc. Then a call to the Regex.Replace method replaces the matched string with an empty string. In other words, it removes the title from the original string.
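The pattern from the demo is not reproduced in this text, so the following is only a sketch of the idea. It uses Python's `re.sub` as a stand-in for the Regex.Replace call mentioned above, and both the pattern and the sample names are assumptions made for illustration.

```python
import re

# Hypothetical mailing-list entries, invented for this example.
names = ["Mr. John Smith", "Mrs. Jane Doe", "Miss Ann Lee", "Peter Parker"]

# Plain string approach: one replace call per title.
plain_cleaned = [
    name.replace("Mr. ", "").replace("Mrs. ", "").replace("Miss ", "")
    for name in names
]

# Regex approach: one pattern covers every title, with or without a dot.
# \b keeps the pattern from matching inside longer words.
TITLE_PATTERN = re.compile(r"\b(?:Mr|Mrs|Miss)\b\.?\s*")
regex_cleaned = [TITLE_PATTERN.sub("", name) for name in names]

print(regex_cleaned)  # ['John Smith', 'Jane Doe', 'Ann Lee', 'Peter Parker']
```

Supporting another title means adding one alternative to the pattern instead of adding another replace call.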