Module 2: Grouping and Subpattern in Detail
Sometimes you will want to group certain parts of your regular expression, either to capture them or to repeat them. To accomplish this, you need to use something called a grouping operator. Grouping operators have multiple functions. The first is grouping itself: by placing part of a regular expression inside round brackets, or parentheses, you group that part of the regex together. This allows you to apply a quantifier to the entire group or to restrict alternation to that part of the regex.
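To make the two uses of grouping concrete, here is a minimal sketch in Python. The patterns and sample strings are illustrative assumptions, not taken from the tutorial's demo.

```python
import re

# Without a group, the quantifier applies only to the preceding character.
ungrouped = re.compile(r"ab+")     # 'a' followed by one or more 'b'
# With a group, the quantifier applies to the whole parenthesized unit.
grouped = re.compile(r"(ab)+")     # one or more repetitions of 'ab'

print(ungrouped.fullmatch("abbb") is not None)   # True
print(grouped.fullmatch("abbb") is not None)     # False
print(grouped.fullmatch("ababab") is not None)   # True

# Parentheses also limit how far an alternation reaches.
unrestricted = re.compile(r"gra|ey")   # 'gra' OR 'ey'
restricted = re.compile(r"gr(a|e)y")   # 'gray' OR 'grey'

print(restricted.fullmatch("grey") is not None)     # True
print(unrestricted.fullmatch("grey") is not None)   # False
```

Here fullmatch is used so that each pattern has to account for the whole sample string, which makes the effect of the parentheses easy to see.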
Sub-patterns, on the other hand, create what is called a capturing group, which stores the text matched by that subexpression in a special variable. A regex treats a grouped sequence as a unit, just as a programming language treats a parenthesized expression as a unit. The second function of the grouping operator is to remember, or capture, the sub-matches. Match information is normally returned as an array.
Array index zero contains the complete match, and the subsequent indexes contain the sub-matches. One thing to note here is that when the sub-match belongs to a repeating group, only the last successful match is remembered. I will show this in the demo part.
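As a rough illustration of how captured groups are indexed, here is a small Python sketch. In Python the match is exposed through a match object rather than a plain array, but the numbering is the same: group 0 is the complete match and groups 1, 2, and so on are the sub-matches. The sample text and patterns are assumptions made for this example.

```python
import re

# Two capturing groups: a word, then a number.
m = re.search(r"(\w+)-(\d+)", "order item-42 today")
full_match = m.group(0)   # the complete match
first_sub = m.group(1)    # first sub-match
second_sub = m.group(2)   # second sub-match
print(full_match, first_sub, second_sub)   # item-42 item 42

# A repeated group remembers only its last successful iteration.
r = re.search(r"(\d\.)+", "1.2.3.")
whole = r.group(0)
last_iteration = r.group(1)
print(whole)            # 1.2.3.
print(last_iteration)   # 3.
```

Other regex engines may expose the captures differently, so treat the group() calls here as the Python way of reading the same information.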
Now let's look at a regular expression to match an IP address as an example. I think there is no need to tell you that an IPv4 address always consists of four groups of numbers separated by dots, and that each number is always between 0 and 255. Let's build a regular expression to match an IPv4 address.
I have added some valid and invalid IP addresses to the input text. As you know, backslash d (\d) is used to match a digit from 0 to 9. In the regex part, add \d. Now here I want to match a digit at least one time, but not more than three times, so I need to use a repetition quantifier.
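In the repetition quantifier, add 1, comma, 3 in curly braces ({1,3}), followed by a dot. The dot has to be escaped with a backslash (\.), because an unescaped dot matches any character. Next, I can either repeat this pattern two more times to match the next two numbers, or I can group it in parentheses.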
Then I want to repeat the preceding pattern, that group, exactly three times, so add the repetition quantifier {3}. Lastly, we want to repeat the numeric pattern one last time, but now without a dot. As you can see, our regex is ready now: (\d{1,3}\.){3}\d{1,3}. Matching an IP address is a good example of the trade-off between regex complexity and exactness. This regex does, of course, match IP addresses.
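The steps so far can be sketched in Python as follows. The sample addresses are assumptions for illustration, not the input text used in the demo.

```python
import re

# One to three digits followed by a literal dot, repeated three times,
# then one to three digits without a trailing dot.
simple_ip = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

samples = ["192.168.1.1", "10.0.0.255", "1.2.3", "abc.def.ghi.jkl"]
results = {s: simple_ip.fullmatch(s) is not None for s in samples}
for sample, matched in results.items():
    print(sample, matched)

# The repeated group keeps only its last iteration: the third "number plus dot".
last_group = simple_ip.fullmatch("192.168.1.1").group(1)
print(last_group)   # 1.
```

Note that this pattern checks only the shape of the address (four short numbers separated by dots), not the value of each number.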
But what if I add this IP address? Did you see that this is an invalid IP address, but it still matches? To restrict all four numbers in the IP address to the range 0 to 255, we need to use another regex. Let's write it. Here, the first branch will match the numbers from 250 to 255, and the second will match 200 to 249. The third branch will match anything from 0 to 199. This is followed by a dot, and I want this entire pattern to be matched three times.
We have to use grouping to apply a repetition quantifier to a complete group. Now we need to repeat the same numeric pattern one last time over here, but this time without a dot. Now it matches only the valid IP addresses from the given input string.
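The transcript does not spell out the exact pattern typed in the demo, so the following Python sketch is one common way to write the three branches described above, and should be read as an assumption. The name octet and the sample addresses are likewise hypothetical.

```python
import re

# Assumed branches for a single number:
#   25[0-5]      -> 250 to 255
#   2[0-4]\d     -> 200 to 249
#   [01]?\d\d?   -> 0 to 199 (this form also accepts leading zeros such as 007)
octet = r"(25[0-5]|2[0-4]\d|[01]?\d\d?)"

# "Number plus dot" grouped and repeated three times, then one last number.
strict_ip = re.compile(r"(" + octet + r"\.){3}" + octet)

# The shape-only pattern, for comparison.
simple_ip = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

samples = ["192.168.1.1", "255.255.255.255", "0.0.0.0",
           "256.1.1.1", "999.999.999.999"]
for s in samples:
    print(s, simple_ip.fullmatch(s) is not None,
          strict_ip.fullmatch(s) is not None)
```

Both patterns accept the first three samples, while only the shape-only pattern accepts the last two. That is the trade-off between complexity and exactness in practice: the stricter regex is longer and harder to read, but it rejects out-of-range numbers.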
We now have a working regex and a basic understanding of how to use grouping in regex. That's the end of module two. In this module, we learned how to use quantifiers in regex and how they can be useful. Then we learned how to use shortcodes to shorten regular expressions. We learned how to use anchors and boundaries. In the last slide, we learned how to do grouping and match sub-patterns in regular expressions.