
Find a Word Using Regular Expression in C#

This article shows how to find a word inside a text string using C# regular expressions. Finding a word within a string is a common natural language processing task. Words that occur at the beginning or end of a string, or words that contain a particular letter, can convey important information about the text.

In this article, you will see how to search for a word anywhere in a text, how to find the word at the beginning or end of a text string, and how to find words containing specific letters.

Searching for Existence of a Word

To search for the existence of a word, you can use the Matches() method of the Regex class. First, create a Regex object and pass the word that you want to search for to its constructor. Next, call Matches() on that object and pass it the string that you want to search. The method returns a MatchCollection whose Count property tells you how many matches were found. The following script shows how many times the word “alligator” appears inside a string. The pattern is simply the word itself, alligator, written in the code as the verbatim string literal @"alligator". Note that this pattern matches the letters wherever they occur, so it would also count “alligator” inside a longer word such as “alligators”, and the search is case-sensitive.

    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string textFile = "Here we have a big alligator, with large eyes. The alligator looks menacing.";

                Regex myRegex = new Regex(@"alligator");
                var results = myRegex.Matches(textFile);

                if (results.Count > 0)
                    Console.WriteLine("The word is found " + results.Count.ToString() + " times.");
                else
                    Console.WriteLine("Word not found.");
            }
        }
    }
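If you only want to count whole words, one option is to wrap the pattern in \b word-boundary anchors. The sketch below is an illustration rather than part of the original script: the class name WholeWordCounter and the method CountWord are hypothetical, and it assumes the search word starts and ends with a letter or digit so that \b behaves as expected. It also passes RegexOptions.IgnoreCase so that “Alligator” and “alligator” are treated alike.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original article's code.
static class WholeWordCounter
{
    // Counts whole-word, case-insensitive occurrences of `word` in `text`.
    public static int CountWord(string text, string word)
    {
        // Regex.Escape protects against regex metacharacters in the word;
        // \b requires a word boundary on both sides of it.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        string text = "The alligator saw two alligators. Alligator!";
        // "alligators" is not counted because of the trailing \b.
        Console.WriteLine(CountWord(text, "alligator")); // prints 2
    }
}
```

Without the \b anchors the same text would produce three matches, because the plain pattern also matches inside “alligators”.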


Finding the First Word 

To find the first word of a text string, you can split the text on whitespace and take the first piece. Pass the regex “\s+” to the Regex constructor and call Split(), which returns an array of strings. The pattern “\s+” matches one or more whitespace characters, so each run of whitespace acts as a delimiter. The first element of the array is the first word of the text. Be aware that if the text starts with whitespace, the first element is an empty string.

The following script splits the string “Eiffel-tower is located in France” into an array of 5 words (“Eiffel-tower” counts as one word because a hyphen is not whitespace). It then prints the first word, which is stored at index 0 of the array.

    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string textFile = "Eiffel-tower is located in France";

                Regex myRegex = new Regex(@"\s+");
                var results = myRegex.Split(textFile);

                Console.WriteLine("The first word is: " + results[0]);
            }
        }
    }
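As an alternative to splitting the whole string, the first word can also be matched directly with an anchored pattern. The following is a minimal sketch, not the article's original approach: FirstWordFinder and FirstWord are hypothetical names, and the pattern ^\s*(\S+) assumes that a "word" is any run of non-whitespace characters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original article's code.
static class FirstWordFinder
{
    // Returns the first whitespace-delimited word, or "" if there is none.
    public static string FirstWord(string text)
    {
        // ^ anchors at the start of the input, \s* skips leading whitespace,
        // and (\S+) captures the first run of non-whitespace characters.
        Match m = Regex.Match(text, @"^\s*(\S+)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        // Leading spaces do not produce an empty first word here.
        Console.WriteLine(FirstWord("  Eiffel-tower is located in France")); // prints Eiffel-tower
    }
}
```

This version stops after the first word instead of splitting the entire text, and it tolerates leading whitespace.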


Finding the Last Word

Similarly, to get the last word of a text string, you can pass the regex “\s+” to the Regex constructor and call Split(). The Split() method splits the input string on whitespace and returns an array of words, from which you can take the last element. If the text ends with whitespace, the last element is an empty string.

In the script below, Split() breaks the input string into 5 words. Because arrays are zero-based, the last word is at index 4. To fetch it without hard-coding the index, the script uses results[results.Length - 1].

    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string textFile = "Eiffel-tower is located in France";

                Regex myRegex = new Regex(@"\s+");
                var results = myRegex.Split(textFile);

                Console.WriteLine("The last word is: " + results[results.Length - 1]);
            }
        }
    }
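The last word can likewise be matched with a pattern anchored at the end of the input. This is a sketch under the same assumption as before (a word is a run of non-whitespace characters), and the names LastWordFinder and LastWord are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original article's code.
static class LastWordFinder
{
    // Returns the last whitespace-delimited word, or "" if there is none.
    public static string LastWord(string text)
    {
        // (\S+) captures a run of non-whitespace characters, \s* allows
        // trailing whitespace, and $ anchors at the end of the input.
        Match m = Regex.Match(text, @"(\S+)\s*$");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        // Trailing spaces do not produce an empty last word here.
        Console.WriteLine(LastWord("Eiffel-tower is located in France  ")); // prints France
    }
}
```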


Finding Words with Specific Characters

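To find words that contain a specific character, you can use the Matches() method of the Regex class. For instance, the regex “[^\s]*[t][^\s]*” returns all the words that contain the letter “t”. The pattern matches a run of non-whitespace characters that has a lowercase “t” somewhere in it: “[^\s]*” matches zero or more non-whitespace characters before the “t”, and the second “[^\s]*” matches zero or more after it. The match is case-sensitive, so a word containing only an uppercase “T” is not returned. For the sample string below, the script prints “Eiffel-tower” and “located”.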

    using System;
    using System.Text.RegularExpressions;

    namespace RegexCodes
    {
        class Program
        {
            static void Main(string[] args)
            {
                string textFile = "Eiffel-tower is located in France";

                Regex myRegex = new Regex(@"[^\s]*[t][^\s]*");
                var results = myRegex.Matches(textFile);

                if (results.Count > 0)
                    // Declare the loop variable as Match so that Value is available
                    // on every .NET version.
                    foreach (Match word in results)
                        Console.WriteLine(word.Value);
                else
                    Console.WriteLine("Word not found.");
            }
        }
    }
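The same idea can be generalized to any letter. The sketch below is illustrative only: LetterWordFinder and WordsContaining are hypothetical names, \S is used as shorthand for [^\s], and it assumes the character passed in is not whitespace. It also passes RegexOptions.IgnoreCase, which differs from the script above in that uppercase and lowercase letters are treated alike.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original article's code.
static class LetterWordFinder
{
    // Returns every whitespace-delimited word that contains `letter`,
    // ignoring case. Assumes `letter` is not a whitespace character.
    public static List<string> WordsContaining(string text, char letter)
    {
        // Regex.Escape handles characters such as '.' or '+' safely.
        string pattern = @"\S*" + Regex.Escape(letter.ToString()) + @"\S*";

        var words = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
            words.Add(m.Value);
        return words;
    }

    static void Main()
    {
        // Prints The, Eiffel-tower and located, one per line.
        foreach (string w in WordsContaining("The Eiffel-tower is located in France", 't'))
            Console.WriteLine(w);
    }
}
```

With case-insensitive matching, “The” is included because of its uppercase “T”. Drop RegexOptions.IgnoreCase to get the case-sensitive behavior of the original pattern.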



© , Regexsonline.com — All Rights Reserved - Terms of Use - Privacy Policy