
Module 3: Realtime Use Case of Regular Expressions for Beginners - Part 1


A US zip code (US postal code) can be written in a five-digit format or a nine-digit format. In the nine-digit format, the last four digits and the hyphen before them are optional. Let's write the regex for this format. I have added some sample zip codes in the input text. We want to match digits, and to match a digit we use \d (backslash d). Before that, assert the position at the beginning of the string by using the caret (^) metacharacter.

In a US zip code, the first set consists of five digits, so we want to match a digit exactly five times. For that, we use a repeating quantifier: \d{5}. The second set consists of a hyphen and four digits: -\d{4}. As mentioned, this second set is optional, so put it in one group by using parentheses and add a question mark after the group to make it optional: (-\d{4})?. Finally, assert the end of the string by using the dollar sign ($). The working regex to find the zip code is ^\d{5}(-\d{4})?$.

To recap, the first set matches five digits with \d{5}. It can be followed by a hyphen and a four-digit number, but that part is optional: we placed it in one group and made the group optional with (-\d{4})?.
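The zip code regex described above can be tried out in a few lines of Python. This is a minimal sketch, not part of the original demo: the function name is_zip_code and the sample values are assumptions chosen for illustration.

```python
import re

# Five digits, then an optional group: a hyphen followed by four digits.
ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")


def is_zip_code(text: str) -> bool:
    """Return True if text has the five-digit or nine-digit zip code format."""
    return ZIP_PATTERN.match(text) is not None


# Hypothetical sample inputs: two valid formats and three invalid ones.
for sample in ["12345", "12345-6789", "1234", "12345-678", "123456"]:
    print(sample, is_zip_code(sample))
```

Note that in Python the dollar sign also matches just before a trailing newline, so it is worth stripping whitespace from user input before checking it.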


Now let's find a zip code with this regex in PDF multi-tool. Open PDF multi-tool and open a sample PDF file that contains a zip code. To find it using regex, click the link that opens the search pop-up. Check the checkbox that enables regular expressions, paste the regex we created, and click the Find Next button. As you can see, the tool now matches the zip code in the document.

Now let's move to the next demo. Say you are given a requirement to check whether a user has entered a valid social security number (SSN) in your application. An SSN has nine digits in the format AAA-GG-SSSS. The first three digits are called the area number, the next two digits are called the group number, and the last four digits are the serial number, which ranges from 0001 to 9999. Now let's write the regex to match the SSN. I have added some sample SSNs in the input text.

Here again we want to match digits, because an SSN consists of digits. Use \d, and add the caret (^) metacharacter in front of it to define the starting position. In the SSN, the first set consists of three digits, so use the repeating quantifier \d{3}, followed by a hyphen. The second set matches a two-digit number with \d{2}, followed by another hyphen. The last set matches four digits: again \d with the repeating quantifier, this time {4}. Then assert the end of the string with $. This way we have a working regex to match the SSN: ^\d{3}-\d{2}-\d{4}$.
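As a quick check outside the demo tool, the SSN regex can be exercised in Python. This is a sketch under assumptions: the function name is_ssn_format and the sample values are made up for illustration and are not real SSNs.

```python
import re

# Three digits, hyphen, two digits, hyphen, four digits, anchored at both ends.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def is_ssn_format(text: str) -> bool:
    """Return True if text has the AAA-GG-SSSS shape of an SSN."""
    return SSN_PATTERN.match(text) is not None


# Hypothetical sample inputs: one well-formed value and three malformed ones.
for sample in ["123-45-6789", "123456789", "12-345-6789", "123-45-678"]:
    print(sample, is_ssn_format(sample))
```

This pattern checks only the shape of the input. It would still accept a serial number of 0000, so a range check on the last four digits would need to be added separately if the requirement calls for it.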


Now let's try to understand this expression. Apart from the caret (^) and dollar sign ($) metacharacters, which assert the position at the beginning and the end of the string, this regex can be broken into three sets separated by hyphens.

The first set matches three digits, followed by a hyphen. The second set matches a two-digit number, followed by a hyphen, and the last set matches a four-digit number. Now let's find SSNs using this same regex in PDF multi-tool.

In PDF multi-tool, I have already loaded a sample PDF file that contains several SSNs. Click the link that opens the search pop-up and paste our SSN regex.

The first SSN is matched, and when I click the link again, the next SSN is highlighted. The third and fourth SSNs are highlighted as well. This way we can find SSNs in a PDF file using this tool.
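The same kind of search can be sketched in Python for text that has already been extracted from a document. This example does not reproduce how PDF multi-tool matches internally. It assumes the page text is available as a plain string, and it swaps the ^ and $ anchors for word boundaries (\b) so that SSNs can be found in the middle of a line. The function name find_ssns and the sample text are hypothetical.

```python
import re

# Word boundaries instead of ^ and $ so the pattern can match inside longer text.
SSN_IN_TEXT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every SSN-shaped substring found in text, in order."""
    return [match.group(0) for match in SSN_IN_TEXT.finditer(text)]


# Hypothetical extracted page text; the last value is malformed on purpose.
page_text = "Employee A: 123-45-6789\nEmployee B: 987-65-4321, ref 12-345-6789"
print(find_ssns(page_text))  # ['123-45-6789', '987-65-4321']
```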
