Regular expressions are a way to describe patterns of text. They are useful for processing text documents, or anywhere one might want to find a pattern and possibly replace it with another.
Introduction
Imagine you have a rather long document with a single misspelling: a Mr. Verma is displeased that his surname has been misspelled as "Varma". This is simple enough; even a basic text editor such as Notepad can perform a search and replace for something like this.
However, once things get even a little more complicated, and we are searching for a pattern instead of a fixed string, such simple measures start to fail. Imagine you have to replace every URL that matches one pattern with a URL that follows another. For example, suppose that after a website restructuring, URLs that used to have the pattern:
http://www.website.com/[day]/[month]/[year]/articlename.html
now become:
http://www.website.com/[year]/[month]/[day]/articlename.html
What can one do? A plain search and replace will not do the trick here, and unless you are dealing with very few URLs, this is too much to do manually. Or imagine you are looking for a sequence of two words and would like to count how many times it occurs in some text, the catch being that the two words could be separated by any kind of whitespace: a tab, a space, a line break, and so on.
These situations are easily handled by Regular Expressions.
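As a rough illustration of both scenarios, here is a minimal sketch using Python's built-in re module. The choice of Python, the function names, and the assumption that day and month are one or two digits and the year is four digits are ours; the patterns above do not spell these details out.

```python
import re

# Assumed URL shape: /day/month/year/article.html with numeric date parts.
# The digit counts are an assumption made for this sketch.
OLD_URL = re.compile(
    r"http://www\.website\.com/"
    r"(?P<day>\d{1,2})/(?P<month>\d{1,2})/(?P<year>\d{4})/"
    r"(?P<article>[^/\s]+\.html)"
)


def restructure_urls(text):
    """Rewrite day/month/year URLs into year/month/day order."""
    new_url = r"http://www.website.com/\g<year>/\g<month>/\g<day>/\g<article>"
    return OLD_URL.sub(new_url, text)


def count_word_pair(text, first, second):
    """Count occurrences of `first` followed by `second`,
    separated by any run of whitespace (spaces, tabs, line breaks)."""
    pattern = re.compile(rf"\b{re.escape(first)}\s+{re.escape(second)}\b")
    return len(pattern.findall(text))
```

Calling restructure_urls on a document rewrites every matching URL in one pass, and count_word_pair treats a space, a tab, or a line break between the two words as equivalent.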
What are Regular Expressions?
Regular expressions (regex) are a way to define a pattern to be extracted, replaced, or otherwise processed in a body of text. Many programming languages support regular expressions, either as part of the language itself or through a library.
The exact syntax and usage of regex differ from program to program; however, the basic principles remain the same. Many applications, such as Notepad++, TextPad, and even LibreOffice / OpenOffice, include support for regular expressions. Linux users might be aware of the grep command, a regex search tool for text files or other input. grep is powerful enough to search through the files on your system, looking for the pattern you provide.
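To give a feel for what a grep-style search does, here is a small sketch in Python. It is only an approximation of the idea, not how grep itself is implemented, and the helper name grep_lines is hypothetical.

```python
import re


def grep_lines(pattern, lines):
    """Return (line_number, line) pairs for lines that contain a match.

    A simplified, hypothetical stand-in for a grep-like search; real grep
    has its own regex syntax and many more options.
    """
    regex = re.compile(pattern)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]
```

For example, the pattern V[ae]rma would pick out lines containing either "Verma" or "Varma", which a fixed-string search cannot do in a single pass.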


