Learn to use Simple Patterns in Perl Programming

If you remember, I told you at the beginning of this series that Perl stands for Practical Extraction and Report Language. So what is being extracted and reported? Data, of course! The ability to extract specific portions of data based on certain criteria (patterns), and to present that data to the user in well-formatted reports, is what makes Perl such a powerful text-processing language. In this article and its successors, we are going to investigate in detail how to exploit the power of Perl in string handling. It is an important yet exciting topic, so enjoy.

Regular Expressions

A pattern, or regular expression, is a sequence of characters to match strings of text against. A string either matches the pattern or it does not. If you are a UNIX/Linux user, you have probably used the grep command to look for certain patterns in one or more files. What grep actually does is pattern matching.

Matching against Simple Patterns

A pattern that consists only of literal (fixed) characters is called a simple pattern. A search pattern is written between two forward slashes:

/PATTERN/

When a string is matched against a regular expression, the result will be either:

  • true, if a match occurred.
  • false, if there is no match.

For this reason, pattern matching is usually used for decision making with if, unless, and while.

Example
The following script asks the user to enter their full name. If the name contains the pattern “Mohamed”, a message is printed to the user that a match occurred. If not, the opposite message is printed.
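A minimal sketch of such a check is given here, under a few assumptions: check_name is a hypothetical helper, the two messages are placeholders, and fixed sample names stand in for what the user would type.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a message depending on whether the
# literal pattern "Mohamed" occurs anywhere in the given name.
sub check_name {
    local $_ = shift;        # put the name into the default variable $_
    if (/Mohamed/) {         # match $_ against the literal pattern
        return "Match occurred";
    }
    return "No match";
}

# Sample names stand in for user input (an assumption of this sketch).
for my $name ("Mohamed Ali", "Sara Ahmed") {
    print "$name: ", check_name($name), "\n";
}
```

An interactive version could read one line with <STDIN>, remove the trailing newline with chomp, and pass the result to the same helper.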

Let’s see how this will behave:
[Screenshot: sample run of the script]

The following expression is an example of the simplest form of pattern matching:

/Mohamed/

It matches the default variable $_ against the literal pattern “Mohamed”.

Note
An important note to remember is that the pattern /Mohamed/ is not the same as /mohamed/ or /MOHAMED/.
Regular expressions are case-sensitive by default. To ignore case, specify the i option after the closing / of the pattern:
/mohamed/i
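To make the effect of the i option concrete, here is a small sketch. The helper names matches_exact and matches_any_case are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns 1 on a match and 0 otherwise.
sub matches_exact    { local $_ = shift; return /Mohamed/  ? 1 : 0; }
sub matches_any_case { local $_ = shift; return /mohamed/i ? 1 : 0; }

# Compare the two patterns on three spellings of the same name.
for my $name ("Mohamed", "mohamed", "MOHAMED") {
    printf "%-8s exact=%d ignore-case=%d\n",
        $name, matches_exact($name), matches_any_case($name);
}
```

Only the first spelling passes the case-sensitive test, while all three pass once the i option is added.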

Matching the Start and End of Strings

To match the beginning of a string, use the caret character ^ before the pattern.

Conversely, to match the end of a string, add the dollar sign $ to the end of your pattern.

To match the whole string (from start to end), use both ^ and $, as in /^PATTERN$/.
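The sketch below tries the three anchored forms against the literal pattern perl. The helper describe and the sample strings are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how a string relates to the literal "perl".
sub describe {
    local $_ = shift;
    return "whole string" if /^perl$/;   # nothing before or after
    return "starts with"  if /^perl/;    # anchored at the beginning
    return "ends with"    if /perl$/;    # anchored at the end
    return "contains"     if /perl/;     # unanchored match
    return "no match";
}

for my $s ("perl", "perl rocks", "I like perl", "a perl script", "python") {
    print "'$s': ", describe($s), "\n";
}
```

One detail worth knowing: $ also matches just before a newline at the very end of the string, so a line read from a file still matches /perl$/ before it is chomped.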

Example
The following is a script that searches a given configuration file for commented lines, and prints them.
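A minimal sketch of this kind of script follows. It is written under stated assumptions: commented_lines is a hypothetical helper, the configuration text is made-up sample data, and an in-memory filehandle stands in for a real file so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of $text that start with #.
sub commented_lines {
    my ($text) = @_;
    my @found;
    local $_;                                  # protect the caller's $_
    # Open a read-only filehandle on the string itself.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (<$fh>) {                            # each line lands in $_
        /^#/ and push @found, $_;              # keep it only if it starts with #
    }
    close $fh;
    return @found;
}

# Made-up sample configuration text.
my $config = <<'END';
# This line is a comment
SETTING=enabled
# Another comment
MODE=strict
END

print commented_lines($config);
```

To process real files named on the command line, the loop would read from <> instead of the in-memory filehandle.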

Let’s see how this script is going to behave when executed for the /etc/sysconfig/selinux file.
[Screenshot: output of the script for /etc/sysconfig/selinux]

Now, to the explanation:

  • The diamond operator <> reads the file or files given as command-line arguments, line by line. The while loop iterates over the lines read. In each iteration, it checks whether the current line starts with #, by matching the line against the pattern /^#/. Remember that most UNIX/Linux configuration files use the # character to comment out a line (this is also the case for Perl script files). Any line starting with # is treated as a comment and ignored.

  • If the line matches the pattern (i.e. the result of matching is true), the logical and operator will cause the right-side part to be executed, so the line will be printed.
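  • The statement that joins the match and the print with the logical and operator is equivalent to an if statement that tests the match and prints the line only when the test is true.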
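As a hedged illustration of this equivalence, the sketch below assumes the short-circuit statement is written with the low-precedence and operator. The names via_and and via_if are hypothetical, and the subs return the line rather than printing it so that the two forms can be compared.

```perl
use strict;
use warnings;

# Both subs return the line when it starts with #, and an empty string otherwise.
sub via_and {
    local $_ = shift;
    my $result = '';
    /^#/ and $result = $_;    # right side runs only when the match is true
    return $result;
}

sub via_if {
    local $_ = shift;
    my $result = '';
    if (/^#/) {               # the same test written as an if block
        $result = $_;
    }
    return $result;
}

for my $line ("# a comment", "KEY=value") {
    print "same result for '$line'\n" if via_and($line) eq via_if($line);
}
```

The short-circuit form is compact; the if block says the same thing more explicitly and is easier to extend with further statements.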

The Pattern Matching and Non-Matching Operators

Other than the special case of matching the default variable, to match a string (either literal or stored in a variable), Perl has two operators: one for matching and one for non-matching.

Operator | Function | Syntax
=~ | Matches the string on its left side against the pattern on its right side. Returns true if a match occurred, and false if not. | $VAR =~ /PATTERN/;
!~ | The opposite of the =~ operator. Returns true if the string on its left doesn’t match the pattern on its right. | $VAR !~ /PATTERN/;
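A short sketch of both operators applied to a string stored in a variable follows. The helper names mentions_perl and lacks_perl are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two binding operators.
sub mentions_perl { my ($text) = @_; return $text =~ /Perl/ ? 1 : 0; }
sub lacks_perl    { my ($text) = @_; return $text !~ /Perl/ ? 1 : 0; }

my $greeting = "Hello, Perl";

print "The greeting mentions Perl\n"       if mentions_perl($greeting);
print "The greeting does not lack Perl\n"  unless lacks_perl($greeting);
print "The greeting does not say Python\n" if $greeting !~ /Python/;
```

For any given string, exactly one of the two operators returns true for the same pattern.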

Example
The following script reports input words starting with the letter ‘A’.
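One way such a check could look is sketched here, assuming a hypothetical helper starts_with_a and a fixed word list in place of user input.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if the word begins with an uppercase A.
sub starts_with_a {
    my ($word) = @_;
    return $word =~ /^A/ ? 1 : 0;   # ^ anchors the match to the start
}

# A fixed word list stands in for input words (an assumption of this sketch).
my @words = qw(Apple banana Avocado apricot Cherry);

for my $word (@words) {
    print "$word starts with A\n" if starts_with_a($word);
}
```

Because the match is case-sensitive, apricot is not reported; the pattern /^a/i would accept both cases.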

Let’s see it in action:
[Screenshot: sample run of the script]

Example
The following script prints the uncommented lines of a given Perl or shell script file.
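The sketch below shows how the non-matching operator can do this filtering. It rests on assumptions: uncommented_lines is a hypothetical helper, the shell script text is made-up sample data, and an in-memory filehandle replaces a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of $text that do NOT start with #.
sub uncommented_lines {
    my ($text) = @_;
    my @kept;
    # Open a read-only filehandle on the string itself.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) {
        push @kept, $line if $line !~ /^#/;   # keep lines that fail the match
    }
    close $fh;
    return @kept;
}

# Made-up sample script text.
my $script = <<'END';
#!/bin/bash
# Print a greeting
echo "Hello"
# Done
exit 0
END

print uncommented_lines($script);
```

Note that the shebang line also starts with #, so this simple pattern drops it along with the ordinary comments.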

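Let’s execute it for a script that contains some commented lines:
[Screenshot: the input script, including its commented lines]

It should return the file without any commented lines:
[Screenshot: output of the script, with the commented lines removed]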

Summary

In this article, we started talking about Pattern Matching using Regular Expressions.

  • A pattern or regular expression is a sequence of characters to match strings of text against.
  • Patterns may be simple (literal) or more complicated.
  • The caret character ^ matches the start of a string.
  • The dollar sign $ matches the end of a string.
  • The =~ and !~ operators are used to test whether a string matches a regular expression.

In the next article, we will continue with Pattern Matching. So, see you.
