Find a Word Using Regular Expression in C#
This article shows how to find a word inside a text string using C# regular expressions. Finding a word within a string is a common natural language processing task. Words that occur at the beginning or end of a sentence, or strings that contain a particular word, can convey important information about the text.
In this article, you will see how to search for a word anywhere in a text string, how to find the word at the beginning or end of a text string, and how to find words containing specific letters.
Searching for Existence of a Word
To check whether a word exists in a string, you can use the Matches() method of the Regex class. First, create a Regex object and pass the word you want to search for to its constructor. Next, call Matches() on that object and pass it the string you want to search. The method returns a MatchCollection whose Count property tells you how many matches were found. The following script uses C# regex to show how many times the word “alligator” appears inside a string. The pattern is simply the word itself, written as the verbatim string literal @"alligator". Note that this pattern is case-sensitive and also matches the word inside longer words such as “alligators”.
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            string textFile = "Here we have a big alligator, with large eyes. The alligator looks menacing.";

            // The pattern is the literal word to search for.
            Regex myRegex = new Regex(@"alligator");

            // Matches() returns every non-overlapping occurrence of the pattern.
            var results = myRegex.Matches(textFile);

            if (results.Count > 0)
                Console.WriteLine("The word is found " + results.Count.ToString() + " times.");
            else
                Console.WriteLine("Word not found.");
        }
    }
}
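Because a bare pattern such as “alligator” also matches inside longer words, it can help to wrap the word in \b word-boundary anchors. The following is a minimal sketch of that idea, not part of the original script. The class name WholeWordSearch and the helper CountWholeWord are hypothetical names chosen for illustration, and the sketch assumes the search word starts and ends with a letter, digit, or underscore so that \b behaves as expected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordSearch
{
    // Hypothetical helper: counts occurrences of `word` as a whole word.
    public static int CountWholeWord(string text, string word)
    {
        // Regex.Escape protects against metacharacters in the search word;
        // \b requires a word boundary on both sides of the match.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(text, pattern).Count;
    }

    public static void Main()
    {
        string text = "The alligators rest while one alligator looks menacing.";

        // Plain pattern: also counts the match inside "alligators".
        Console.WriteLine(Regex.Matches(text, "alligator").Count);   // 2

        // Word-boundary pattern: counts only the standalone word.
        Console.WriteLine(CountWholeWord(text, "alligator"));        // 1
    }
}
```

If case should not matter, RegexOptions.IgnoreCase can be passed as a third argument to Regex.Matches.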
Finding the First Word
To find the first word of a text string, you can split the text on whitespace and take the first piece. Pass the regex “\s+” to the Split() method of the Regex class, which returns an array of all the words. The regular expression “\s+” matches one or more whitespace characters, so Split() uses each run of whitespace as a delimiter. The first item of the array is the first word of the text, provided the string does not begin with whitespace. Here is an example script.
The following script first splits the string “Eiffel-tower is located in France” into an array of five words. Next, the first word, which is located at index 0 of the array, is printed on the console.
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            string textFile = "Eiffel-tower is located in France";

            // \s+ matches one or more whitespace characters.
            Regex myRegex = new Regex(@"\s+");

            // Split() returns the pieces of text between the whitespace runs.
            var results = myRegex.Split(textFile);

            Console.WriteLine("The first word is: " + results[0]);
        }
    }
}
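When the input starts with whitespace, Regex.Split() puts an empty string at index 0, so the split approach returns an empty first word. As an alternative, the sketch below matches the first word directly with the anchored pattern “^\s*(\S+)”. The class name FirstWordFinder and the helper FirstWord are hypothetical, and returning an empty string when no word is found is an assumption made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstWordFinder
{
    // Hypothetical helper: returns the first whitespace-delimited word,
    // or an empty string if the text contains no word at all.
    public static string FirstWord(string text)
    {
        // ^ anchors at the start of the string, \s* skips leading whitespace,
        // and (\S+) captures the first run of non-whitespace characters.
        Match m = Regex.Match(text, @"^\s*(\S+)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string text = "  Eiffel-tower is located in France";

        // Split-based approach: index 0 is empty because the text starts with spaces.
        Console.WriteLine("[" + Regex.Split(text, @"\s+")[0] + "]");   // []

        // Anchored match: leading whitespace is skipped.
        Console.WriteLine("[" + FirstWord(text) + "]");                // [Eiffel-tower]
    }
}
```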
Finding the Last Word
Similarly, to get the last word of a text string, you can pass the regex “\s+” to the Split() method of the Regex class. Split() divides the input string using whitespace as a delimiter and returns an array of words. From the returned array, you can extract the last word. Look at the following script for reference.
In the script below, Split() divides the input string into five words. Because arrays are zero-indexed, the last word is at index 4. To fetch it, the script uses the index results.Length - 1.
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            string textFile = "Eiffel-tower is located in France";

            // \s+ matches one or more whitespace characters.
            Regex myRegex = new Regex(@"\s+");
            var results = myRegex.Split(textFile);

            // The last element sits at index Length - 1.
            Console.WriteLine("The last word is: " + results[results.Length - 1]);
        }
    }
}
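Trailing whitespace causes the same problem at the other end: Regex.Split() then returns an empty string as the last element. One possible workaround, sketched here under the same assumptions as before, is to match the last word directly with the pattern “(\S+)\s*$”. The names LastWordFinder and LastWord are hypothetical and not part of the original scripts.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastWordFinder
{
    // Hypothetical helper: returns the last whitespace-delimited word,
    // or an empty string if the text contains no word at all.
    public static string LastWord(string text)
    {
        // (\S+) captures a run of non-whitespace characters, \s* allows
        // trailing whitespace, and $ anchors the match at the end of the string.
        Match m = Regex.Match(text, @"(\S+)\s*$");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string text = "Eiffel-tower is located in France  ";
        Console.WriteLine("[" + LastWord(text) + "]");   // [France]
    }
}
```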
Finding Words with Specific Characters
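To find words that contain a specific character, you can use the Matches() method of the Regex class. For instance, the regex “[^\s]*[t][^\s]*” returns all the words that contain the lowercase letter “t”. The pattern matches zero or more non-whitespace characters, followed by the letter t, followed by zero or more non-whitespace characters, so each match is a whitespace-delimited word that contains a t. For the string “Eiffel-tower is located in France”, the matches are “Eiffel-tower” and “located”. Look at the following script for reference.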
using System;
using System.Text.RegularExpressions;

namespace RegexCodes
{
    class Program
    {
        static void Main(string[] args)
        {
            string textFile = "Eiffel-tower is located in France";

            // Non-whitespace characters, then the letter t, then more non-whitespace characters.
            Regex myRegex = new Regex(@"[^\s]*[t][^\s]*");
            var results = myRegex.Matches(textFile);

            if (results.Count > 0)
            {
                // Print every word that contains the letter t.
                foreach (Match word in results)
                    Console.WriteLine(word.Value);
            }
            else
            {
                Console.WriteLine("Word not found.");
            }
        }
    }
}
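The same pattern can be built for any letter and made case-insensitive. The sketch below is one way to do that, assuming a hypothetical helper WordsContaining in a hypothetical class LetterSearch. It uses \S, which is equivalent to [^\s], and RegexOptions.IgnoreCase to match both upper and lower case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LetterSearch
{
    // Hypothetical helper: returns every whitespace-delimited word that
    // contains `letter`, optionally ignoring case.
    public static List<string> WordsContaining(string text, char letter, bool ignoreCase = false)
    {
        // \S* on both sides extends the match to the whole word;
        // Regex.Escape keeps characters such as '.' or '+' literal.
        string pattern = @"\S*" + Regex.Escape(letter.ToString()) + @"\S*";
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;

        return Regex.Matches(text, pattern, options)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        string text = "The Eiffel-tower is located in France";

        // Case-sensitive: "The" is skipped because its T is uppercase.
        Console.WriteLine(string.Join(", ", WordsContaining(text, 't')));         // Eiffel-tower, located

        // Case-insensitive: "The" is included as well.
        Console.WriteLine(string.Join(", ", WordsContaining(text, 't', true)));   // The, Eiffel-tower, located
    }
}
```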