
How to Find a Due Date using Regular Expressions in C#

Extracting due dates from documents can be an important task. For instance, a company might want to identify invoices that have not been paid by the due date. Finding due dates manually in thousands of documents is cumbersome, and automatic extraction can save time and human resources. In this article, you will see how to extract the due date from a text document using regular expressions in C#.

As an example, you will see how to find the due date in the following text invoice using regular expressions in C#. For this experiment, the text file is named “invoice.txt”.

[Image: contents of the sample invoice.txt file]

How to Find a Due Date without a Specific Format

In this section, you will see how to find a due date that is not in any specific format. The trick is to simply extract whatever text follows the label “Due Date:” on the same line. The regular expression that does this is @"(Due Date: .*)". Because the dot (.) does not match a newline character, the match stops at the end of the line.

Look at the following script. Here, you first read all the text from the “invoice.txt” file using the File.ReadAllText() function.

Next, you create an object of the Regex class, passing your regular expression to its constructor. To find the due date, you call the Match() function on the Regex object and pass it the file text. Match() returns a Match object, and calling ToString() on it gives the matched string, which corresponds to the due date.

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
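            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"D:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Match "Due Date: " and everything after it up to the end of the line.
            var myRegex = new Regex(@"(Due Date: .*)", RegexOptions.IgnoreCase);

            // Match() returns a Match object; ToString() gives the matched text
            // (an empty string if nothing matched).
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);

            // Keep the console window open until Enter is pressed.
            Console.ReadLine();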
        }
    }
}

Running the above script prints the line that starts with “Due Date:”, so the due date has been successfully extracted.

The problem with the above script is that it will match any text that follows “Due Date:”, even if it is some random text. To make sure that you only extract text in a specific date format, you can use a regular expression that specifies the format. You will see that in the next section.
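Two details of the script above are easy to miss: the match includes the “Due Date: ” label itself, and Match() does not fail loudly when nothing is found. The following is a minimal sketch of one way to handle both, not the only way. The helper name TryExtractAfterLabel and the in-memory sample text are made up for illustration and stand in for the contents of invoice.txt.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    public static class DueDateCapture
    {
        // The parentheses now wrap only the text after the label,
        // so capture group 1 holds the value without "Due Date: ".
        private static readonly Regex LabelRegex =
            new Regex(@"Due Date: (.*)", RegexOptions.IgnoreCase);

        // Hypothetical helper: returns true and the text after the label,
        // or false and an empty string when the label is not present.
        public static bool TryExtractAfterLabel(string text, out string value)
        {
            Match match = LabelRegex.Match(text);

            // Match() never returns null; Success tells you whether anything matched.
            // Trim() drops the trailing '\r' that "." picks up with Windows line endings.
            value = match.Success ? match.Groups[1].Value.Trim() : string.Empty;
            return match.Success;
        }

        public static void Main(string[] args)
        {
            // Made-up sample text standing in for invoice.txt.
            string sample = "Invoice No: 1001\r\nDue Date: 12-05-2021\r\nTotal: 500";

            string value;
            Console.WriteLine(TryExtractAfterLabel(sample, out value)
                ? value
                : "No due date found");        // prints 12-05-2021

            Console.WriteLine(TryExtractAfterLabel("Total: 500", out value)
                ? value
                : "No due date found");        // prints No due date found
        }
    }
}
```

Checking Success keeps a missing label from silently producing an empty line in the output.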

How to Find a Due Date with a Specific Format

In this section, you will extract a string that follows the label “Due Date:” in the format dd-dd-dddd (one or two digits followed by a dash, then two digits followed by a dash, and then four digits). The regular expression used for that is: (Due Date: [0-9]?[0-9]-[0-9]{2}-[0-9]{4}). The leading [0-9]? makes the first digit optional, so a date such as 5-05-2021 matches as well as 05-05-2021.

Execute the following script to see this in action:

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
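            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"D:\Datasets\invoice.txt", Encoding.UTF8);

            Console.WriteLine("====================");

            // Match "Due Date: " followed by digits in the dd-dd-dddd layout.
            var myRegex = new Regex(@"(Due Date: [0-9]?[0-9]-[0-9]{2}-[0-9]{4})", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);

            // Keep the console window open until Enter is pressed.
            Console.ReadLine();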
        }
    }
}
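The pattern above checks only the shape of the text, so a string such as 99-99-9999 also matches. If you need a real calendar date, one option is to pass the matched digits to DateTime.TryParseExact. The sketch below assumes the invoice uses day-month-year order, which the article does not state, and the helper name TryGetDueDate and the sample strings are made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    public static class DueDateValidation
    {
        // Same digit layout as the article's pattern, with the capture group
        // around the date only so the label is left out of group 1.
        private static readonly Regex DueDateRegex = new Regex(
            @"Due Date: ([0-9]?[0-9]-[0-9]{2}-[0-9]{4})",
            RegexOptions.IgnoreCase);

        // Assumption: day-month-year order. Swap to "M-dd-yyyy" and "MM-dd-yyyy"
        // if your invoices put the month first.
        private static readonly string[] Formats = { "d-MM-yyyy", "dd-MM-yyyy" };

        // Hypothetical helper: true only if the text contains a due date
        // that is also a valid calendar date.
        public static bool TryGetDueDate(string text, out DateTime dueDate)
        {
            dueDate = default(DateTime);

            Match match = DueDateRegex.Match(text);
            if (!match.Success)
            {
                return false;
            }

            return DateTime.TryParseExact(
                match.Groups[1].Value,
                Formats,
                CultureInfo.InvariantCulture,
                DateTimeStyles.None,
                out dueDate);
        }

        public static void Main(string[] args)
        {
            // Made-up sample lines standing in for invoice.txt.
            string[] samples = { "Due Date: 12-05-2021", "Due Date: 99-99-9999" };

            foreach (string sample in samples)
            {
                DateTime dueDate;
                if (TryGetDueDate(sample, out dueDate))
                {
                    // prints 2021-05-12 for the first sample
                    Console.WriteLine(dueDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
                }
                else
                {
                    // prints this for the second sample
                    Console.WriteLine("Not a valid date: " + sample);
                }
            }
        }
    }
}
```

Keeping the regex for locating the text and DateTime for validating it means neither part has to encode calendar rules as a pattern.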


You can also specify multiple date formats with regex. For example, the regular expression (Due Date: [0-9]?[0-9](-|/)[0-9]{2}(-|/)[0-9]{4}) returns dates in both dd-dd-dddd and dd/dd/dddd formats. Here, you put both the dash (-) and the forward slash (/) in parentheses, separated by the alternation symbol (|). You can also add other delimiters such as a period, a backslash, or any other special character. Characters that have a special meaning in regex must be escaped, for example \. for a period and \\ for a backslash.
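As a rough sketch of adding more delimiters, the example below swaps the (-|/) alternation for a character class that lists four delimiters. This is a variation on the article's pattern rather than the pattern itself, and the helper name FindDueDate and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    public static class DueDateDelimiters
    {
        // A character class lists the allowed delimiters. Inside [...] the dash
        // comes first so it is read literally, the period needs no escape,
        // and the backslash is written as \\.
        private static readonly Regex DueDateRegex = new Regex(
            @"Due Date: [0-9]?[0-9][-/.\\][0-9]{2}[-/.\\][0-9]{4}",
            RegexOptions.IgnoreCase);

        // Hypothetical helper: returns the matched text, or an empty string
        // when no due date in one of the supported layouts is found.
        public static string FindDueDate(string text)
        {
            return DueDateRegex.Match(text).Value;
        }

        public static void Main(string[] args)
        {
            // Made-up sample lines, one per delimiter.
            string[] samples =
            {
                "Due Date: 12-05-2021",
                "Due Date: 12/05/2021",
                "Due Date: 12.05.2021",
                @"Due Date: 12\05\2021"
            };

            // Each sample is matched in full, so each line is echoed unchanged.
            foreach (string sample in samples)
            {
                Console.WriteLine(FindDueDate(sample));
            }
        }
    }
}
```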

Update your text invoice as follows. Here you can see the date in the dd/dd/dddd format.

[Image: updated invoice.txt with the due date in dd/dd/dddd format]

Run the following script:

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
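            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"D:\Datasets\invoice.txt", Encoding.UTF8);

            Console.WriteLine("====================");

            // (-|/) accepts either a dash or a forward slash between the date parts.
            var myRegex = new Regex(@"(Due Date: [0-9]?[0-9](-|/)[0-9]{2}(-|/)[0-9]{4})", RegexOptions.IgnoreCase);

            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);

            // Keep the console window open until Enter is pressed.
            Console.ReadLine();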
        }
    }
}

You will see that the due date has been extracted in dd/dd/dddd format.


If you change the date back to the dd-dd-dddd format, the above script will extract that too.
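One side effect of the (-|/) pattern is that the two delimiters are matched independently, so a mixed date such as 12-05/2021 is accepted too. If that is not what you want, a backreference can require the second delimiter to repeat the first. The sketch below compares the two patterns. The class name, the helper IsConsistentDueDate, and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    public static class DueDateSameDelimiter
    {
        // The article's pattern: each delimiter may be a dash or a slash.
        private static readonly Regex LooseRegex = new Regex(
            @"(Due Date: [0-9]?[0-9](-|/)[0-9]{2}(-|/)[0-9]{4})",
            RegexOptions.IgnoreCase);

        // Stricter variant: group 1 captures the first delimiter and \1
        // requires the second delimiter to be the same character.
        private static readonly Regex StrictRegex = new Regex(
            @"Due Date: [0-9]?[0-9]([-/])[0-9]{2}\1[0-9]{4}",
            RegexOptions.IgnoreCase);

        // Hypothetical helper: true when the text contains a due date
        // whose two delimiters are identical.
        public static bool IsConsistentDueDate(string text)
        {
            return StrictRegex.IsMatch(text);
        }

        public static void Main(string[] args)
        {
            // Made-up sample with mixed delimiters.
            string mixed = "Due Date: 12-05/2021";

            Console.WriteLine(LooseRegex.IsMatch(mixed));                       // prints True
            Console.WriteLine(IsConsistentDueDate(mixed));                      // prints False
            Console.WriteLine(IsConsistentDueDate("Due Date: 12/05/2021"));     // prints True
            Console.WriteLine(IsConsistentDueDate("Due Date: 12-05-2021"));     // prints True
        }
    }
}
```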

© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy