How to Find Phone Numbers without a Specific Format
Manually extracting information from text documents can be time-consuming and error-prone. For instance, if you want to extract phone numbers from various text documents manually, there is a chance that you can make a mistake while saving the numbers. In addition, if there are thousands of text documents, it can take days or even weeks to compile a record of all the phone numbers.
A more practical approach is to write a program that reads a text document and extracts the desired information, such as phone numbers. In this article, you will see how C# regular expressions can be used to find phone numbers in a text document.
Finding Phone Numbers without a Specific Format
Using regex, you can find phone numbers that have no specific format. To do so, you specify the text after which a phone number is most likely to occur. In the sample text invoice used here, the phone number is mentioned after the text “Phone # :”.
To extract such a phone number, you can use the regex pattern Phone # :\s*\d*. The pattern tells the regex engine to match the string “Phone # :” followed by any amount of whitespace and then a run of digits. Because \d* matches only digits, the match ends at the first non-digit character, so this pattern suits numbers written as one unbroken run of digits. The pattern is passed to the constructor of the Regex class, and the Match() method of the resulting object is then called on the text, as shown in the following script. At the beginning of the script, the File.ReadAllText() method reads a text document and returns its contents as a C# string.
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Match the label, optional whitespace, and the digits that follow it.
            var myRegex = new Regex(@"(Phone # :\s*\d*)", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);
        }
    }
}
In the output below, you can see that the phone number is returned.
Output:
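To try the label-based approach without a file on disk, the following is a minimal sketch that runs the pattern against an inline string. The class name LabelPhoneDemo, the method ExtractPhone, and the sample invoice text are made up for illustration and are not part of the original script. As a small variation on the pattern above, it uses \d+ inside a capture group so that at least one digit is required and only the digits, not the label, are returned.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LabelPhoneDemo
{
    // Returns the digits that follow the "Phone # :" label,
    // or an empty string when the label is not found.
    public static string ExtractPhone(string text)
    {
        // \s* allows any whitespace after the label; (\d+) captures one or more digits.
        Match match = Regex.Match(text, @"Phone # :\s*(\d+)", RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Hypothetical invoice text, invented for this example.
        string invoice = "Invoice # : 1001\nPhone # :   5551234567\nTotal : 250";

        Console.WriteLine(ExtractPhone(invoice));                 // prints 5551234567
        Console.WriteLine(ExtractPhone("no phone here").Length);  // prints 0
    }
}
```

Checking match.Success before reading the group avoids treating a failed match as a phone number, which matters when many documents are processed in a loop.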
Finding Phone Numbers with a Specific Format
In the previous section, you saw how to find a phone number that has no specific format. Often, however, you need to extract phone numbers written in a specific format. For instance, US phone numbers are commonly written in the format XXX-XXX-XXXX. In the sample document used here, the phone number is 333-888-4444.
To read a phone number in the XXX-XXX-XXXX format, you can use the regex pattern “\(?\d{3}\)?-? *\d{3}-? *-?\d{4}”. This pattern matches digits written in the format XXX-XXX-XXXX. Because the parentheses, hyphens, and spaces are all optional, it also accepts variants such as (333) 888-4444. If you want, you can prefix the pattern with the text “Phone : ”. This way, only a phone number that appears after the text “Phone : ” will be returned. Look at the following script for reference.
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Match the "Phone : " label followed by a number in the XXX-XXX-XXXX format.
            var myRegex = new Regex(@"Phone : \(?\d{3}\)?-? *\d{3}-? *-?\d{4}", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);
        }
    }
}
In the output below, you can see the phone number in the format XXX-XXX-XXXX.
Output:
If you remove the leading digit 3 from the phone number in the text invoice, nothing is returned, because the first group no longer contains the three digits that the pattern requires.
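The behavior described above can be checked without editing the invoice file. The following is a minimal sketch that applies the same pattern to a few inline strings; the class name FormatPhoneDemo, the method FindPhone, and the sample strings are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatPhoneDemo
{
    // Same pattern as in the script above: optional parentheses around the
    // first three digits, optional hyphens and spaces between the groups.
    private static readonly Regex PhonePattern =
        new Regex(@"Phone : \(?\d{3}\)?-? *\d{3}-? *-?\d{4}", RegexOptions.IgnoreCase);

    // Returns the matched text, or an empty string when nothing matches.
    public static string FindPhone(string text)
    {
        Match match = PhonePattern.Match(text);
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(FindPhone("Phone : 333-888-4444"));    // prints Phone : 333-888-4444
        Console.WriteLine(FindPhone("Phone : (333) 888-4444"));  // prints Phone : (333) 888-4444

        // Only two digits in the first group, so the pattern does not match.
        Console.WriteLine(FindPhone("Phone : 33-888-4444") == string.Empty);  // prints True
    }
}
```

Note that calling ToString() on a failed match, as the original script does, also yields an empty string; testing match.Success simply makes the no-match case explicit.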