Regular Expressions
 
Posted by
Kshitij Sobti
Posted on: Oct 21, 2011 11:15:35 IST

 
 
 

Regular expressions are a way to describe patterns of text. They are useful for processing text documents, or wherever one might want to look for a pattern and possibly replace it with another.

Introduction

Imagine you have a rather long document with a single misspelling: a Mr. Verma is displeased that his surname has been misspelled as "Varma". This is simple enough; even a basic text editor such as Notepad can perform a search and replace operation for something like this.

However, if we make the task even a little more complicated and search for a pattern instead of a fixed string, such simple measures start to fail. Imagine you have to replace every URL matching one pattern with a URL following another. For example, suppose that after a website restructuring, URLs that used to have the pattern:

http://www.website.com/[day]/[month]/[year]/articlename.html

now have the pattern:

http://www.website.com/[year]/[month]/[day]/articlename.html

What can one do? A plain search and replace will not do the trick here, and unless you are dealing with very few URLs, this is too much to do manually. Or imagine you are looking for a sequence of two words and would like to count how many times it occurs in some text. The catch is that the two words could be separated by any kind of whitespace: a space, a tab, a line break, and so on.

These situations are easily handled by Regular Expressions.
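
As a rough illustration of both situations, here is a minimal Python sketch using the standard re module. It assumes the day, month and year in the URL are plain digits and that the article name contains only letters, digits, underscores and hyphens; the function names reorder_date_urls and count_word_pair are made up for this example.

```python
import re

# Assumption: day and month are 1-2 digits, year is 4 digits, and the
# article name is made of word characters and hyphens.
DATE_URL = re.compile(
    r"http://www\.website\.com/(\d{1,2})/(\d{1,2})/(\d{4})/([\w-]+\.html)"
)


def reorder_date_urls(text):
    """Rewrite day/month/year URLs as year/month/day URLs."""
    # \1, \2, \3 and \4 refer to the four parenthesised groups above.
    return DATE_URL.sub(r"http://www.website.com/\3/\2/\1/\4", text)


def count_word_pair(text, first, second):
    """Count occurrences of two words separated by any run of whitespace."""
    # \s+ matches spaces, tabs and line breaks; \b marks a word boundary.
    pattern = re.compile(rf"\b{re.escape(first)}\s+{re.escape(second)}\b")
    return len(pattern.findall(text))


sample = "See http://www.website.com/21/10/2011/articlename.html for details."
print(reorder_date_urls(sample))

words = "regular expressions,\nregular\texpressions and regular  expressions"
print(count_word_pair(words, "regular", "expressions"))
```

The first function swaps the captured day and year while leaving the rest of the URL alone; the second treats any mix of whitespace between the two words as equivalent.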

What are Regular Expressions?

Regular expressions (regex) are a way to define a pattern to be extracted, replaced, or otherwise processed in a body of text. Many programming languages support regular expressions, either as part of the language itself or through a library.

The exact syntax and usage of regex differ from program to program, but the basic principles remain the same. Many applications, such as Notepad++, TextPad, and even LibreOffice / OpenOffice, include support for regular expressions. Linux users might be aware of the grep command, a regex search tool for text files or piped input. It is powerful enough to search through the files on your system for lines matching the pattern you provide.
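
To give a feel for what a grep-style search does, here is a small Python sketch that returns the lines matching a pattern along with their line numbers. It is only an illustration of the idea, not how grep itself is implemented, and Python's regex syntax differs in places from grep's default syntax. The name grep_lines is hypothetical.

```python
import re


def grep_lines(pattern, lines):
    """Return (line_number, line) pairs for lines that contain a match."""
    regex = re.compile(pattern)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]


text = ["Mr. Verma called.\n", "No match here.\n", "Mr. Varma replied.\n"]
# V[ae]rma matches both spellings of the surname.
for number, line in grep_lines(r"V[ae]rma", text):
    print(f"{number}:{line}")
```

Passing an open file object instead of a list works the same way, since iterating over a file yields its lines.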

    






 
 
 
 
 
 
 
Comments (2)
 
Posted by Kshitij Sobti on Oct 22,2011
 
It is a misspelling if your actual name is 'Verma'. ;-) I used that example because it is an actual mistake I made. People are often very particular about their surnames.
 
Posted by Varma on Oct 21,2011
 
Nice article, by the way 'Varma' is not a wrong spelling, it does exist in some places :)
 
 



 
 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 