
Module 2: Grouping and Subpatterns in Detail


Sometimes you will want to group part of a regular expression, either to capture it or to repeat it. To do this, you use the grouping operator: a pair of round brackets (parentheses) placed around that part of the regex. The grouping operator has more than one function. Its first function is to make the enclosed part behave as a single unit, which lets you apply a quantifier to the entire group or restrict alternation to just that part of the regex.
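As a minimal sketch of both uses, here is how grouping changes what a quantifier and an alternation apply to. It assumes Python's built-in `re` module; the sample strings and the variable names are made up for illustration and do not come from the tutorial's demo.

```python
import re

text = "ababab"

# Without a group, + applies only to the last character, "b".
ungrouped_match = re.match(r"ab+", text).group(0)

# With a group, + applies to the whole sequence "ab".
grouped_match = re.match(r"(ab)+", text).group(0)

print(ungrouped_match)  # ab
print(grouped_match)    # ababab

# Without a group, | splits the whole pattern: "gray" OR "grey cat".
loose = re.compile(r"gray|grey cat")

# With a group, | is restricted to the single letter inside the parentheses.
tight = re.compile(r"gr(a|e)y cat")

print(loose.fullmatch("gray cat") is not None)  # False
print(tight.fullmatch("gray cat") is not None)  # True
```

In the ungrouped alternation, neither branch covers the whole string "gray cat", which is why only the grouped version matches it.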

The second function of the grouping operator is to remember and capture sub-matches. A parenthesized sub-pattern creates what is called a capturing group, which stores the text matched by that subexpression so that you can refer to it later. The regex engine treats a grouped sequence as a unit, just as a programming language treats a parenthesized expression as a unit. Match information is normally returned as an array.

Index zero of the array contains the complete match, and the subsequent indexes contain the sub-matches. One thing to note: when a capturing group is repeated, only its last successful match is remembered. I will show this in the demo part.
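The indexing described above can be sketched as follows. This assumes Python's `re` module, where a match object exposes the complete match as `group(0)` and the sub-matches as `group(1)`, `group(2)`, and so on; the input strings are hypothetical.

```python
import re

# Two capturing groups: a four-digit year and a two-digit month.
m = re.search(r"(\d{4})-(\d{2})", "Released 2021-07")

print(m.group(0))  # 2021-07  (index 0: the complete match)
print(m.group(1))  # 2021     (index 1: first sub-match)
print(m.group(2))  # 07       (index 2: second sub-match)

# A repeated group keeps only the text from its last repetition.
repeated = re.search(r"(\d{1,3}\.){3}", "192.168.1.")

print(repeated.group(0))  # 192.168.1.
print(repeated.group(1))  # 1.
```

The group in the second pattern matched "192.", then "168.", then "1.", and each repetition overwrote the previous capture.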

Now let's look at a regular expression that matches an IP address as an example. As you probably know, an IPv4 address always consists of four groups of numbers separated by dots, and each number is always between 0 and 255. Let's build a regular expression to match an IPv4 address.

RegEx Grouping

I have added some valid and invalid IP addresses to the input text. As you know, backslash d (\d) matches a digit from 0 to 9. In the regex part, add \d. Here I want to match a digit at least once but no more than three times, so I need a repetition quantifier.

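For the repetition quantifier, add 1, comma, 3 in curly braces ({1,3}), followed by a dot. The dot must be escaped with a backslash (\.) so that it matches a literal dot rather than any character. Next, I can either write this pattern out two more times to match the next two numbers, or I can group it in parentheses.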

Then I want to repeat the preceding pattern, that group, exactly three times, so add the repetition quantifier {3} after it. Lastly, we repeat the numeric pattern one more time, but now without a dot. Our regex is now ready. Matching an IP address is a good example of the trade-off between regex complexity and exactness. This regex does, of course, match IP addresses.
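Put together, the steps above give a pattern along these lines. This is a sketch in Python's `re` module; the sample addresses are hypothetical stand-ins for the demo's input text, and `fullmatch` is used here (an assumption, since the demo does not show anchors) so that each sample must match from start to end.

```python
import re

# Loose IPv4 pattern assembled from the steps above:
#   \d{1,3}\.   one to three digits followed by a literal dot
#   ( ... ){3}  that group repeated exactly three times
#   \d{1,3}     a final run of one to three digits, without a dot
loose_ip = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

# Hypothetical sample inputs, not the ones from the tutorial's demo.
samples = ["192.168.1.1", "10.0.0.255", "1.2.3", "999.999.999.999"]

loose_results = {s: loose_ip.fullmatch(s) is not None for s in samples}

for address, matched in loose_results.items():
    print(address, "match" if matched else "no match")
```

The pattern checks only the shape of the address, so a string such as "999.999.999.999" passes even though its numbers are out of range.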

RegEx Subpatterns

But what if I add this IP address? Notice that it is an invalid IP address, yet it still matches. To restrict all four numbers in the IP address to the range 0 to 255, we need a different regex. Let's write it. Here, the first branch matches numbers from 250 to 255, the second matches 200 to 249, and the third matches anything from 0 to 199. This is followed by a dot, and I want this entire pattern to be matched three times.

We have to use grouping to apply a repetition quantifier to a complete group. Then we need to repeat the same numeric pattern one last time, but this time without a dot. Now the regex matches only the valid IP addresses in the given input string.
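One way to write the three branches described above is sketched below in Python. The exact sub-patterns (`25[0-5]`, `2[0-4]\d`, `[01]?\d\d?`) are a common formulation and an assumption here, since the tutorial shows its regex only in the demo; `OCTET`, `strict_ip`, and `is_valid_ipv4` are hypothetical names chosen for illustration.

```python
import re

# One number in the range 0 to 255, written as three alternatives:
#   25[0-5]      250 to 255
#   2[0-4]\d     200 to 249
#   [01]?\d\d?   0 to 199
OCTET = r"(25[0-5]|2[0-4]\d|[01]?\d\d?)"

# Number plus dot, grouped and repeated three times, then a final number.
strict_ip = re.compile("(" + OCTET + r"\.){3}" + OCTET)


def is_valid_ipv4(text: str) -> bool:
    """Return True if the whole string is a dotted quad with numbers 0 to 255."""
    return strict_ip.fullmatch(text) is not None


print(is_valid_ipv4("192.168.1.1"))      # True
print(is_valid_ipv4("255.255.255.255"))  # True
print(is_valid_ipv4("256.1.1.1"))        # False
print(is_valid_ipv4("999.999.999.999"))  # False
```

This sketch still accepts leading zeros such as "01.02.03.04", which is another instance of the complexity versus exactness trade-off.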

We now have a working regex and a basic understanding of how to use grouping in regex. This is the end of module two. In this module, we learned how to use quantifiers in regex and how they can be useful. Then we learned how to use shorthand codes (such as \d) to shorten a regular expression. We learned how to use anchors and boundaries. In the last slide, we learned how to do grouping and match sub-patterns in regular expressions.



© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy