Learn to Emulate the grep Command in Perl Programming

How can we make Perl emulate the grep command-line tool found on UNIX and Linux?
To solve the problem, we first need to know what the grep command does. In its basic use, grep searches a file, multiple files, or (with the -r option) an entire directory for a pattern. If there is no match, nothing is printed. If a line within a file matches the pattern, the filename is printed, followed by a colon and the matching line. If only one file is being searched, printing the filename adds nothing, so it is omitted.

Okay, let’s do it!

Will this code do the job?! Let’s see!

When searching inside one file:
1

When searching for the same pattern in two files:
2

Now, to the explanation:

Like the grep command, our script should expect two or more arguments. The first argument is the pattern to match against, and the remaining arguments (one or more) are the files to search for that pattern.

  • The line:

Uses the shift function to remove the first element (argument) from @ARGV, the array of command-line arguments. The removed element, which is expected to be the pattern to search for, is assigned to the scalar variable $PATTERN.
After that, the array of arguments contains only the names of the files to search.

  • The if condition:

checks whether the number of elements left in @ARGV (after removing the first one) is greater than one. If so, the pattern will be searched for in more than one file. In this case, it is convenient to print the filename in front of each matching line.

    • The while loop:

Uses the diamond operator <> to read the provided files line by line and test each line against the pattern. If a line matches, the name of the file currently being read (held in the $ARGV variable) is printed, followed by a colon and the matching line.

    • Otherwise (only one file is searched), there is no need to print the filename for each match, so only the matching line is printed.
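
The walkthrough above can be sketched as the short script below. This is a minimal reconstruction based on the description, not necessarily the exact code of the original listing. The steps are wrapped in a helper sub named grep_files (a hypothetical name) so the same logic can be called from other code, and the script name used in the usage note is also an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# grep_files is a hypothetical helper name, not part of Perl itself.
# Arguments: the pattern first, then one or more file names.
# Returns the list of output lines that grep would print.
sub grep_files {
    local @ARGV = @_;           # let the diamond operator read these files
    local $_;                   # do not clobber the caller's $_
    my $PATTERN = shift @ARGV;  # first argument is the pattern
    my @out;

    if (@ARGV > 1) {
        # More than one file: prefix each match with "filename:"
        while (<>) {
            push @out, "${ARGV}:$_" if /$PATTERN/;
        }
    } else {
        # A single file: the matching line alone is enough
        while (<>) {
            push @out, $_ if /$PATTERN/;
        }
    }
    return @out;
}

# Run only when a pattern and at least one file were supplied
print grep_files(@ARGV) if @ARGV >= 2;
```

Saved under a name such as mygrep.pl, it could be run as `perl mygrep.pl PATTERN file1 file2`. Because the pattern is interpolated into a regular expression, any regex metacharacters in it keep their special meaning.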

That is it!

Memorizing and Reusing Matched Patterns
When a pattern (or part of it) is enclosed within a pair of parentheses and a match occurs, the substring that matches the enclosed pattern is saved in a special variable, so that it can be reused if desired.

Watch the following example.

Example
The following script searches the provided file(s) for the pattern /r./, which matches any appearance of r followed by any character (except the newline character). The pattern is enclosed in parentheses so that the match is remembered. When a line matches, the substring that matched is printed, followed by the full line that contains it.

For a file that contains the following lines:
3

The script should behave as follows:
4

The $1 variable refers to the string that matched the parenthesized pattern. Likewise, there can be $2, $3, and so on: each pair of parentheses in the pattern, counted from left to right, gets its own positional variable $1, $2, etc.
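
As a rough illustration of the capture variables, the sketch below applies the parenthesized pattern to a few in-memory lines instead of a file. The sample lines and the helper names captured_r and split_key_value are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the substring captured by (r.),
# or undef when the line does not match.
sub captured_r {
    my ($line) = @_;
    return $line =~ /(r.)/ ? $1 : undef;
}

# Hypothetical helper: two pairs of parentheses fill $1 and $2.
sub split_key_value {
    my ($line) = @_;
    return $line =~ /(\w+)=(\w+)/ ? ($1, $2) : ();
}

# Sample lines invented for illustration
for my $line ("perl\n", "grep\n", "awk\n") {
    my $hit = captured_r($line);
    print "$hit\t$line" if defined $hit;   # matched part, a tab, then the full line
}

my ($key, $value) = split_key_value("shell=bash");
print "$key -> $value\n";
```

The line "awk" prints nothing because it contains no r. Note that $1 and $2 are only meaningful after a successful match, which is why both helpers test the match before reading them.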

Processing Matched Patterns
Instead of just matching and reporting (returning true when match occurs), Perl offers a means for processing (changing) matched strings. This section is going to discuss how to replace and translate the part of a string that matches a specified pattern.

Match and Replace
The substitution operator s/// replaces the part of a string that matches a search pattern with a replacement string.

Syntax
s/PATTERN/REPLACEMENT/MODIFIERS

Example
Starting with this great quote by Gandhi:
5

The following statement will replace only the “first” occurrence of the sequence “Your”:

6

The following will replace “all” occurrences of “Your” by “ur”:

7

The trailing option g instructs Perl to replace “globally” all the occurrences of the pattern searched for.

To match and replace all occurrences of “Your”, ignoring case:
8
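
The three variants of the substitution can be compared side by side in the sketch below. The sample string is invented for illustration and stands in for the quote, and the helper names are hypothetical. Each helper works on a copy, so the original string is left untouched.

```perl
use strict;
use warnings;

# Replace only the first "Your"
sub replace_first {
    my ($s) = @_;
    $s =~ s/Your/ur/;
    return $s;
}

# Replace every "Your" (g = global)
sub replace_all {
    my ($s) = @_;
    $s =~ s/Your/ur/g;
    return $s;
}

# Replace every "Your", "your", "YOUR", ... (i = ignore case)
sub replace_all_nocase {
    my ($s) = @_;
    $s =~ s/Your/ur/gi;
    return $s;
}

# Sample text made up for this example
my $text = "Your code, Your rules, your call";

print replace_first($text),      "\n";  # ur code, Your rules, your call
print replace_all($text),        "\n";  # ur code, ur rules, your call
print replace_all_nocase($text), "\n";  # ur code, ur rules, ur call
```

Without the g option only the leftmost match changes, and without the i option the lowercase "your" is never touched.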

Match and Translate
The translation operator tr/// replaces individual characters found in the search list with the corresponding characters in the replacement list. Unlike s///, it does not use regular expressions: translation works character by character, not string for string. The first character in the search list is translated (wherever it appears) to the first character in the replacement list, the second character to the second, and so on.

Syntax
tr/SEARCHLIST/REPLACEMENTLIST/

Example
Using the same quote by Gandhi, the following will translate all letters from lowercase to uppercase:

9

Example
The following statement will replace every occurrence of e, j, p, q, or y with a, g, b, k, or i, respectively:

10
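
Both translations can be tried with the small sketch below. The sample string and the helper names to_upper and swap_letters are assumptions made for this example, not the article's original text or code.

```perl
use strict;
use warnings;

# Translate every lowercase ASCII letter to its uppercase counterpart
sub to_upper {
    my ($s) = @_;
    $s =~ tr/a-z/A-Z/;
    return $s;
}

# Translate e->a, j->g, p->b, q->k, y->i, one character at a time
sub swap_letters {
    my ($s) = @_;
    $s =~ tr/ejpqy/agbki/;
    return $s;
}

# Sample text made up for this example
my $text = "enjoy perl quickly";

print to_upper($text),     "\n";  # ENJOY PERL QUICKLY
print swap_letters($text), "\n";  # angoi barl kuickli
```

Each character is translated once in a single pass, so a character produced by the translation (such as the k that replaces q) is not translated again.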

Summary
That was part three of the “Pattern Matching using Regular Expressions” subject.

  • A matched string can be saved (memorized) for later use. This is done by enclosing the search pattern inside parentheses ( ).
  • A matched string can be modified either by being substituted (replaced) as a whole by another string, or by having its individual characters translated, one by one, to another set of characters. For substitution, the s/// operator is used. For translation, the tr/// operator is used.

In the next article, we are going to set Pattern Matching aside, and start a new topic: One-Liners. See you.
