
How to Find Phone Numbers without a Specific Format

Manually extracting information from text documents can be time-consuming and error-prone. For instance, if you copy phone numbers out of various text documents by hand, you can easily make a mistake while saving the numbers. In addition, if there are thousands of text documents, it can take days or even weeks to compile a record of all the phone numbers.

A more practical approach is to write a program that reads a text document and extracts the desired information, such as phone numbers. In this article, you will see how C# regular expressions can be used to find phone numbers in a text document.

Table of Contents

  1. Finding Phone Numbers without a Specific Format
  2. Finding Phone Numbers with a Specific Format

Finding Phone Numbers without a Specific Format

Using regex you can find phone numbers without a specific format. To do so, you simply need to specify the text after which a phone number is most likely to occur. If you look at the following text invoice, you can see that the phone number is mentioned after the text “Phone # :”.

[Image: text invoice in which the phone number appears after the text “Phone # :”]

To extract such a phone number, you can use the regex pattern Phone # :\s*\d*. The pattern matches the literal string “Phone # :” followed by zero or more whitespace characters (\s*) and then zero or more digits (\d*). The whole match, label included, is returned. Because \d* matches digits only, the match stops at the first non-digit character, such as a hyphen or a space inside the number. The pattern is passed to the Regex constructor, and the Match() method of the resulting Regex object returns the first match, as shown in the following script. At the beginning of the script, the File.ReadAllText() method reads a text document and returns its contents as a C# string.

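using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);

            Console.WriteLine("====================");

            // Literal "Phone # :", then optional whitespace, then a run of digits.
            var myRegex = new Regex(@"(Phone # :\s*\d*)", RegexOptions.IgnoreCase);

            // Match() returns the first match; ToString() gives the matched text
            // (an empty string if nothing matched).
            string result = myRegex.Match(textFile).ToString();

            Console.WriteLine(result);
        }
    }
}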

In the output below, you can see that the phone number is returned.

Output:

[Image: console output showing the returned phone number]
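
The match above includes the “Phone # :” label. If only the digits are needed, a capture group can isolate them. The following is a minimal sketch rather than part of the original script: the sample text and the helper name ExtractDigits are assumptions made for illustration, and it uses \d+ instead of \d* so that a label with no digits after it is not reported as a match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneLabelExample
{
    // Hypothetical helper: returns the digits that follow "Phone # :",
    // or null when the label is not followed by any digits.
    public static string ExtractDigits(string text)
    {
        // The parentheses capture only the digits, not the label.
        Match match = Regex.Match(text, @"Phone # :\s*(\d+)", RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Assumed sample text standing in for the invoice file.
        string sample = "Invoice 17\nPhone # : 5551234567\nTotal: 40";
        Console.WriteLine(ExtractDigits(sample) ?? "No phone number found");
    }
}
```

Checking match.Success before reading the group avoids silently printing an empty string when the label is missing from the document.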

Finding Phone Numbers with a Specific Format

In the previous section, you saw how to find a phone number that does not follow a specific format. Often, however, you need to extract phone numbers written in a specific format. For instance, US phone numbers are commonly written in the format XXX-XXX-XXXX. Look at the following document. Here the phone number mentioned is 333-888-4444.

[Image: text document containing the phone number 333-888-4444]

To read a phone number in the XXX-XXX-XXXX format, you can use the regex pattern \(?\d{3}\)?-? *\d{3}-? *-?\d{4}. This pattern matches three digits, optionally enclosed in parentheses, then three digits, then four digits, with optional hyphens and spaces between the groups, so it also accepts variants such as (XXX) XXX-XXXX. If you want, you can prefix the pattern with the text “Phone : ”. This way, only a phone number that is mentioned after the text “Phone : ” will be returned. Look at the following script for reference.

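using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);

            Console.WriteLine("====================");

            // "Phone : " followed by 3 digits (optionally in parentheses),
            // 3 digits and 4 digits, with optional hyphens and spaces between groups.
            var myRegex = new Regex(@"Phone : \(?\d{3}\)?-? *\d{3}-? *-?\d{4}", RegexOptions.IgnoreCase);

            // Match() returns the first match; ToString() gives the matched text
            // (an empty string if nothing matched).
            string result = myRegex.Match(textFile).ToString();

            Console.WriteLine(result);
        }
    }
}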

In the output below, you can see the phone number in the format XXX-XXX-XXXX.

Output:

[Image: console output showing the phone number in the format XXX-XXX-XXXX]
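
Match() returns only the first occurrence. When a document may contain several phone numbers, Regex.Matches() returns all of them. The following is a minimal sketch under stated assumptions: the sample string and its numbers are made up, the helper name FindAll is hypothetical, and the pattern is the one from the script above without the “Phone : ” prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhoneListExample
{
    // Same pattern as the script above, minus the "Phone : " prefix.
    private static readonly Regex PhonePattern =
        new Regex(@"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}");

    // Hypothetical helper: collects every match in the text, in order.
    public static List<string> FindAll(string text)
    {
        var numbers = new List<string>();
        foreach (Match match in PhonePattern.Matches(text))
        {
            numbers.Add(match.Value);
        }
        return numbers;
    }

    public static void Main()
    {
        // Assumed sample text; the numbers are made up.
        string sample = "Office: 333-888-4444, Fax: (212) 555-0100";
        foreach (string number in FindAll(sample))
        {
            Console.WriteLine(number);
        }
    }
}
```

For the sample string, the sketch prints 333-888-4444 and (212) 555-0100 on separate lines, since the optional parentheses and the optional space are both part of the pattern.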

If you remove the leading digit 3 from the phone number in the text invoice, nothing is returned: the pattern requires exactly three digits (\d{3}) in the first group, so the regex no longer matches and the script prints an empty line.
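
A failed match can be detected explicitly through the Success property of the Match object instead of relying on an empty string. The following is a minimal sketch: the two input strings are assumed stand-ins for the invoice text, and the helper name HasPhone is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NoMatchExample
{
    // Pattern from the script above, including the "Phone : " prefix.
    private static readonly Regex PhonePattern =
        new Regex(@"Phone : \(?\d{3}\)?-? *\d{3}-? *-?\d{4}", RegexOptions.IgnoreCase);

    // Hypothetical helper: reports whether the text contains a matching number.
    public static bool HasPhone(string text)
    {
        return PhonePattern.Match(text).Success;
    }

    public static void Main()
    {
        Console.WriteLine(HasPhone("Phone : 333-888-4444")); // three-digit first group
        Console.WriteLine(HasPhone("Phone : 33-888-4444"));  // leading digit removed
    }
}
```

The first call prints True and the second prints False, because the two-digit group 33 cannot satisfy \d{3}.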


© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy