
Module 2: Short Codes in Reg Ex for Beginners


Shortcodes (shorthand codes) are metacharacters that shorten a regular expression by letting a single metacharacter stand for a whole set of characters. Many shortcodes are available, covering a wide variety of things, such as whitespace, which includes tabs, spaces, newlines, and so forth. Other shortcodes represent digits or word characters. In short, a shortcode metacharacter matches any one character from a set or range of characters.

The most common shortcode, and the most straightforward, is backslash d (\d). This metacharacter matches only the digits 0 to 9. The next shortcode is backslash capital D (\D), which does the opposite: it matches any character that is not a digit. You can achieve the same thing by negating the digit range with a caret, like this: [^0-9].
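As a quick check of the digit shortcodes, here is a minimal Python sketch. The sample string is made up for illustration, and the re.ASCII flag is an assumption added so that \d means exactly 0 to 9 (without it, Python's \d also accepts digits from other scripts).

```python
import re

# Hypothetical sample text, not taken from the tutorial's demo.
sample = "Room 42, floor 7"

# re.ASCII limits \d to the characters 0-9.
digits = re.findall(r"\d", sample, re.ASCII)
non_digits = re.findall(r"\D", sample, re.ASCII)

# The equivalent explicit character classes.
digits_by_class = re.findall(r"[0-9]", sample)
non_digits_by_class = re.findall(r"[^0-9]", sample)

print(digits)                # the digit characters only
print("".join(non_digits))   # everything except the digits
```

Both spellings return the same lists, which is the equivalence described above.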

The next shortcode metacharacter is backslash w (\w), which represents a word character. You might think at first that this means only alphabetical characters, but that is not quite right. In a regular expression, a word character is anything from capital A to capital Z, lowercase a to z, the digits 0 to 9, and the underscore. You can achieve the same thing using this regex: [a-zA-Z0-9_].
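The word-character shortcode can be compared with its explicit class in a short Python sketch. The sample string is hypothetical, and re.ASCII is assumed so that \w covers only A to Z, a to z, 0 to 9, and the underscore. Note that the uppercase range has to be written A-Z: a range ending in lowercase z (A-z) would also admit punctuation such as [ and ^.

```python
import re

# Hypothetical sample text for illustration.
sample = "user_01 [ok]"

# re.ASCII keeps \w to letters, digits, and underscore.
word_chars = re.findall(r"\w", sample, re.ASCII)
word_chars_by_class = re.findall(r"[a-zA-Z0-9_]", sample)

# A mistyped range A-z spans the ASCII characters between Z and a as well.
bracket_in_typo_range = re.fullmatch(r"[A-z]", "[") is not None
bracket_is_word_char = re.fullmatch(r"\w", "[", re.ASCII) is not None

print("".join(word_chars))
print(bracket_in_typo_range, bracket_is_word_char)
```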

Then there is backslash capital W (\W), which does the opposite of backslash w: it matches anything that is not a word character. The equivalent character class is [^a-zA-Z0-9_].
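A small Python sketch of the non-word shortcode next to a negated character class. The sample string is invented, and re.ASCII is an assumption that keeps the shortcode limited to ASCII word characters.

```python
import re

# Hypothetical sample text for illustration.
sample = "a-b c_d!"

non_word = re.findall(r"\W", sample, re.ASCII)
non_word_by_class = re.findall(r"[^a-zA-Z0-9_]", sample)

print(non_word)  # the hyphen, the space, and the exclamation mark
```

The underscore is not in the result because it counts as a word character.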

The next shortcode metacharacter is backslash s (\s), which represents any whitespace character, including spaces, tabs, newlines, and so on. Backslash capital S (\S) does the opposite: it matches any non-whitespace character. Now you have an idea of what shortcode metacharacters do, and how much typing they save.
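The whitespace shortcodes can be tried out with a few lines of Python. This is only a sketch: the sample string is made up, and the + after \S is added here to collect runs of non-whitespace characters rather than single ones.

```python
import re

# Hypothetical sample containing a space, a tab, and a newline.
sample = "one two\tthree\nfour"

whitespace = re.findall(r"\s", sample)   # each whitespace character
tokens = re.findall(r"\S+", sample)      # runs of non-whitespace characters

print(whitespace)
print(tokens)
```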

Now let's move on to the demo so you can learn how to use shortcode metacharacters in your expressions. In this demo, we will use the same tool that we used in the previous demo.

In the input text, I have defined some statements that follow different patterns. The first statement contains only letters, the second contains letters and numbers, and the third contains only digits. The fourth statement contains only spaces, and the fifth contains letters and spaces.

Regular Expression Shortcodes

First, write the regex using a character class, and then we will switch to the shortcode to see the effect. Add the character range. Here I am anchoring the regex to the beginning and the end of the string so we can see the effect properly. This regex matches the first three test strings. Now change the character range to a shortcode and see if we get the same result.

If I replace the range with backslash w, we get the same result. Now try the digit shortcode, backslash d: it works as expected and matches only the digit statement. If I use backslash s instead, it matches only the statement made of spaces. What about combining shortcodes in a character class? For example, if I put backslash w and backslash s together in a character class, the regex matches all the statements.
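The demo steps can be approximated in Python. This is a sketch under assumptions: the five strings below are hypothetical stand-ins for the tool's input text, the helper name matching is invented, and the exact patterns (with a + quantifier and ^ and $ anchors) are a guess at what the demo types in.

```python
import re

# Hypothetical stand-ins for the five demo statements.
statements = ["Hello", "Hello123", "12345", "   ", "Hello World"]

def matching(pattern, candidates):
    """Return the candidates that the anchored pattern matches."""
    regex = re.compile(pattern, re.ASCII)
    return [text for text in candidates if regex.search(text)]

with_class = matching(r"^[a-zA-Z0-9_]+$", statements)  # explicit range
with_word = matching(r"^\w+$", statements)              # same, as a shortcode
only_digits = matching(r"^\d+$", statements)
only_spaces = matching(r"^\s+$", statements)
word_or_space = matching(r"^[\w\s]+$", statements)      # shortcodes combined

print(with_class, with_word, only_digits, only_spaces, word_or_space, sep="\n")
```

With these inputs, the range and the shortcode pick the same three strings, and the combined class accepts all five.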

Reg Ex Shortcodes

Now I want to match anything other than a digit. Add backslash capital D, and the regex starts matching every character that is not a digit. Next, I want to exclude spaces as well as digits. For that, let's first match digits and spaces with a character class containing backslash d and backslash s. Since the requirement is to exclude them, add the negation character (the caret) at the start of the class. Now it matches only the characters that are neither digits nor spaces.
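The three steps of this demo can be sketched in Python as follows. The sample string is hypothetical, and the patterns are an interpretation of what the demo enters in the tool.

```python
import re

# Hypothetical sample text for illustration.
sample = "ab 12 cd"

non_digits = re.findall(r"\D", sample, re.ASCII)                    # step 1: anything but digits
digits_and_spaces = re.findall(r"[\d\s]", sample, re.ASCII)         # step 2: digits or whitespace
neither = re.findall(r"[^\d\s]", sample, re.ASCII)                  # step 3: negate the class

print(non_digits)         # still includes the spaces
print(digits_and_spaces)
print(neither)            # letters only, for this sample
```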

Let's jump into the next demo. Here in the input text, I have defined a list of HL7 files, and from them I want to find the files whose names start with three word characters followed by an underscore and a six-digit date.

You can write a simple regular expression using three backslash w shortcodes, followed by an underscore and six backslash d shortcodes. Add the anchor at the start of the regex so that it does not match the last filename, which does not begin with three word characters followed by an underscore. Now this regular expression matches the filenames that start with three word characters, followed by an underscore, followed by a six-digit date.
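To see why the start anchor matters, here is a Python sketch. The filenames are invented for illustration (the tutorial's actual list is not shown), and the last one is deliberately given a prefix so that the three-letter code is not at the start.

```python
import re

# Hypothetical HL7 filenames; only the first three start with CODE_YYMMDD.
filenames = [
    "ADT_200115.hl7",
    "ORU_200116.hl7",
    "LAB_200117.hl7",
    "X_ADT_200118.hl7",
]

anchored = re.compile(r"^\w\w\w_\d\d\d\d\d\d", re.ASCII)
unanchored = re.compile(r"\w\w\w_\d\d\d\d\d\d", re.ASCII)

anchored_hits = [name for name in filenames if anchored.search(name)]
unanchored_hits = [name for name in filenames if unanchored.search(name)]

print(anchored_hits)    # the prefixed filename is rejected
print(unanchored_hits)  # without ^ the pattern is found mid-name too
```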

Regular Expression Codes

This regular expression is a little lengthy, so I can shorten it with a range quantifier. Add the quantifier {3} after backslash w and {6} after backslash d. Again we get the same result, and the regex looks much better than the previous one. This is how shortcodes save a lot of typing.
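A brief Python sketch comparing the long form with the quantifier form. The filenames are hypothetical examples, and the two patterns are assumed to be what the demo arrives at.

```python
import re

# Hypothetical HL7 filenames for illustration.
filenames = [
    "ADT_200115.hl7",
    "ORU_200116.hl7",
    "LAB_200117.hl7",
    "X_ADT_200118.hl7",
]

long_form = re.compile(r"^\w\w\w_\d\d\d\d\d\d", re.ASCII)
short_form = re.compile(r"^\w{3}_\d{6}", re.ASCII)  # {n} repeats the shortcode n times

long_hits = [name for name in filenames if long_form.search(name)]
short_hits = [name for name in filenames if short_form.search(name)]

print(long_hits == short_hits)
print(short_hits)
```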


© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy