How to Find Total Tax Using a Regular Expression in C#
This article explains how to find the total tax in a text document using regular expressions in C#. It shows how to extract the tax amount whether or not it is written with a currency symbol and decimal places.
Finding Total Tax Containing Numbers Only
Consider a scenario where you have the following receipt. Here the tax is a plain number, 15, with no currency symbol or decimal point.
To find the tax in the above document, you can use the following regular expression.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole receipt into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Match the label "Total Tax:" followed by whitespace and digits.
            var myRegex = new Regex(@"(Total Tax: \s*\d*)", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);
        }
    }
}
The above script reads a text file that contains tax information and then retrieves the run of digits that follows the label “Total Tax:”. The part of the regular expression that picks up the number is “\s*\d*”. Here “\s*” matches any whitespace characters after “Total Tax:” and “\d*” matches the consecutive digits that follow. Here is the output of the above script.
====================
Total Tax: 15
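Because “\d*” also matches zero digits, a line such as “Total Tax:” with no number would still produce a match. The following is a minimal sketch of one way to guard against that, assuming an in-memory sample string instead of the invoice file. The class name TaxDigitsDemo and the helper FindTaxDigits are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TaxDigitsDemo
{
    // Hypothetical helper: returns the digits after "Total Tax:", or null when absent.
    public static string FindTaxDigits(string text)
    {
        // \d+ (instead of \d*) requires at least one digit after the label.
        Match m = Regex.Match(text, @"Total Tax:\s*(\d+)", RegexOptions.IgnoreCase);

        // Match.Success tells us whether the pattern was found at all.
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Assumed sample receipt text; the real file content may differ.
        string sample = "Subtotal: 100\nTotal Tax: 15\nTotal: 115";

        Console.WriteLine(FindTaxDigits(sample));                        // prints 15
        Console.WriteLine(FindTaxDigits("no tax line") ?? "(not found)"); // prints (not found)
    }
}
```

Checking Match.Success before reading the result avoids silently printing an empty string when the receipt has no tax line.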
Finding Total Tax Containing Currency Symbols and Decimals
Tax information often contains currency symbols and decimals, for instance in the following invoice:
To read tax information in such a format, you need to modify the regular expression as follows:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Match the label, an optional "$", the digits, and a decimal part.
            var myRegex = new Regex(@"(Total Tax:\s*\$?\s*\d*(\.\d{1,3}))", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();
            Console.WriteLine(result);
        }
    }
}
The regular expression used in the above script is “(Total Tax:\s*\$?\s*\d*(\.\d{1,3}))”. Here “\$?” specifies that the amount may or may not start with a “$” sign; the backslash is needed because an unescaped “$” is an end-of-line anchor. The expression “\.” matches the decimal point, whereas “\d{1,3}” matches one to three digits after it. The output shows the total tax including the dollar sign and decimal places.
====================
Total Tax: $15.25
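Note that the pattern above requires a decimal part, so it would not match a whole-number amount such as “Total Tax: 15”. As a hedged sketch, one way to accept both forms is to make the decimal group optional. The class name TaxPatternDemo, the helper ExtractTax, and the sample strings below are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TaxPatternDemo
{
    // (?:\.\d{1,3})? makes the decimal part optional; \d+ requires at least one digit.
    private static readonly Regex TaxRegex = new Regex(
        @"Total Tax:\s*(\$?\s*\d+(?:\.\d{1,3})?)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the amount text, or null if there is no match.
    public static string ExtractTax(string text)
    {
        Match m = TaxRegex.Match(text);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Assumed sample lines covering the formats discussed in this article.
        string[] samples = { "Total Tax: 15", "Total Tax: $15.25", "total tax: $ 7.5", "Tax: n/a" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> {1}", s, ExtractTax(s) ?? "(no match)");
        }
    }
}
```

With this variant a single pattern covers amounts with or without the “$” sign and with or without decimals, at the cost of being slightly harder to read.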
In the script above, the label “Total Tax” is displayed along with the tax amount. If you wish to retrieve only the tax amount, you can use the following script.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read the whole invoice into a single string.
            string textFile = File.ReadAllText(@"E:\Datasets\invoice.txt", Encoding.UTF8);
            Console.WriteLine("====================");

            // Same label as in the previous script, so the same invoice line matches.
            var myRegex = new Regex(@"(Total Tax:\s*\$?\s*\d*(\.\d{1,3}))", RegexOptions.IgnoreCase);
            string result = myRegex.Match(textFile).ToString();

            // Keep only the part after the colon, i.e. the amount.
            string total_tax = result.Split(':')[1];
            Console.WriteLine(total_tax.Trim());
        }
    }
}
In the script above, the string returned by the regular expression is split on the colon, and the second part of the string, i.e. the one containing the amount, is trimmed and printed. In the output below, you can see the tax amount only.
====================
$15.25
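Splitting on the colon fails with an exception when the pattern does not match, because the empty result has no second part. As an alternative sketch, a capture group can isolate the digits directly, and the value can then be converted to a decimal for arithmetic. The class name TaxAmountParser, the helper ParseTax, and the sample text are hypothetical and used only for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class TaxAmountParser
{
    // Hypothetical helper: returns the tax as a decimal, or null when no tax line is found.
    public static decimal? ParseTax(string text)
    {
        // The capture group holds only the digits, so the label and "$" are excluded.
        Match m = Regex.Match(
            text,
            @"Total Tax:\s*\$?\s*(\d+(?:\.\d{1,3})?)",
            RegexOptions.IgnoreCase);

        if (!m.Success)
        {
            return null;
        }

        // InvariantCulture treats "." as the decimal separator regardless of machine locale.
        return decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // Assumed sample invoice text; the real file content may differ.
        decimal? tax = ParseTax("Subtotal: $100.00\nTotal Tax: $15.25\nTotal: $115.25");

        Console.WriteLine(tax.HasValue
            ? tax.Value.ToString(CultureInfo.InvariantCulture)
            : "(not found)"); // prints 15.25
    }
}
```

Returning a nullable decimal lets the caller distinguish a missing tax line from a tax of zero.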