
= 103.7 Using Regular Expressions =

''Candidates should be able to manipulate files and text data using regular expressions. This objective includes creating simple regular expressions containing several notational elements. It also includes using regular expression tools to perform searches through a filesystem or file content.''

Key Knowledge Areas


 * Create simple regular expressions containing several notational elements.
 * Use regular expression tools to perform searches through a filesystem or file content.

Overview
A word or several words can be found in a text using grep, fgrep or egrep. The search keywords are patterns built from literal characters and special characters, called regular expressions. Regular expressions are also recognised by many other applications, such as sed and vi.

Regular Expressions

Traditional Regular Expressions (regex)

A regular expression is a sequence of characters (or atoms) used to match a pattern. Characters are either constants (treated literally) or metacharacters.

Table1: Main metacharacters
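As a rough illustration, the sketch below tries out a handful of metacharacters that are standard in basic regular expressions (. * [ ] ^ and $). The selection and the sample words are assumptions made for this demonstration.

```shell
# Hypothetical sample text, one word per line.
words='cat
cot
coat
ct
cart
concat'

# .   matches any single character: c, one character, t
printf '%s\n' "$words" | grep '^c.t$'        # cat, cot

# *   matches zero or more of the preceding atom
printf '%s\n' "$words" | grep '^co*t$'       # cot, ct

# [ ] matches one character from the listed set
printf '%s\n' "$words" | grep '^c[ao]t$'     # cat, cot

# ^ and $ anchor the match to the start and the end of the line
printf '%s\n' "$words" | grep '^cat'         # cat
printf '%s\n' "$words" | grep 'cat$'         # cat, concat
```

The patterns are enclosed in single quotes so that the shell passes them to grep unchanged.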

Extended regex:

The main extended regular expression (eregex) operators are +, ? and |

Table2: List of main eregex
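The three operators +, ? and | can be tried out with the short sketch below. It uses grep -E, which accepts extended regular expressions in the same way as egrep. The sample words are made up for the demonstration.

```shell
# Hypothetical sample text, one word per line.
words='ct
cot
coot
cat
dog'

# +  one or more of the preceding atom
printf '%s\n' "$words" | grep -E '^co+t$'     # cot, coot

# ?  zero or one of the preceding atom
printf '%s\n' "$words" | grep -E '^co?t$'     # ct, cot

# |  alternation: either the left or the right expression
printf '%s\n' "$words" | grep -E 'cat|dog'    # cat, dog
```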

The grep family
The grep utility supports basic regular expressions (regex) such as those listed in Table1.

Working with basic grep

Syntax for grep:

grep PATTERN FILE

Options for grep include:
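A few commonly used options (-i, -c, -n and -v) are sketched below. This particular selection is an assumption made for illustration, and the sample text is hypothetical.

```shell
# Hypothetical sample input.
text='Linux kernel
linux boot loader
# a comment
other line'

# -i: ignore case, so both "Linux" and "linux" match (lines 1 and 2)
printf '%s\n' "$text" | grep -i 'linux'

# -c: print only the number of matching lines (here 1, the search is case-sensitive)
printf '%s\n' "$text" | grep -c 'linux'

# -n: prefix each matching line with its line number (4:other line)
printf '%s\n' "$text" | grep -n 'line'

# -v: invert the match, printing every line except the comment
printf '%s\n' "$text" | grep -v '^#'
```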

For example, list all non-blank lines in /etc/lilo.conf:
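One way to do this is to match empty lines with ^$ and invert the result with -v. The sketch below runs on made-up sample lines, since /etc/lilo.conf may not exist on the system at hand.

```shell
# Hypothetical stand-in for the contents of /etc/lilo.conf.
conf='boot=/dev/hda

image=/boot/vmlinuz

label=linux'

# ^$ matches an empty line; -v keeps the lines that do NOT match.
printf '%s\n' "$conf" | grep -v '^$'

# On a system where the file exists, the same search would read:
#   grep -v '^$' /etc/lilo.conf
```

Lines that contain only spaces or tabs are not empty in this sense. A pattern such as '^[[:space:]]*$' would be needed to filter those out as well.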

egrep

The egrep tool supports extended regular expressions (eregex) such as those listed in Table2. It is equivalent to grep -E.

It can also search for several keywords at once if they are entered at the command line, separated by the vertical bar character (|).

For example:
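A search for two alternatives might look like the sketch below, which prints the lines containing "linux" or starting with "image". It is written with grep -E, the equivalent of egrep, and the sample lines are hypothetical.

```shell
# Hypothetical stand-in for a configuration file.
conf='boot=/dev/hda
prompt
image=/boot/vmlinuz
label=linux
root=/dev/hda1'

# Either alternative is enough for a line to be printed.
printf '%s\n' "$conf" | grep -E 'linux|^image'
```

The vertical bar must be quoted, otherwise the shell would interpret it as a pipe.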

fgrep

fgrep stands for fast grep and is equivalent to grep -F. It interprets the pattern as a literal string (no regex or eregex support), so the metacharacters of regular expressions lose their special meaning.

For example, searching for ‘cat*’ with fgrep will only match lines containing the literal string ‘cat*’.

The main improvement came from fgrep’s ability to read a list of keywords, entered one per line, from a file, say LIST. The list is passed with the -f option:

fgrep -f LIST FILE
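The difference between a regular expression and a literal string, and the use of a keyword list, can be tried out with the sketch below. It uses grep -F, which behaves like fgrep. The file contents and the scratch directory are assumptions made for the demonstration.

```shell
# Scratch directory holding the hypothetical files FILE and LIST.
workdir=$(mktemp -d)
printf '%s\n' 'cat' 'cats' 'cat*' 'dog.' 'dogs' > "$workdir/FILE"

# As a regular expression, cat* means "ca" followed by any number of "t".
grep 'cat*' "$workdir/FILE"                  # cat, cats, cat*

# As a fixed string, only the literal text cat* matches.
grep -F 'cat*' "$workdir/FILE"               # cat*

# LIST holds one literal keyword per line; a line matching any of them is printed.
printf '%s\n' 'cat*' 'dog.' > "$workdir/LIST"
grep -F -f "$workdir/LIST" "$workdir/FILE"   # cat*, dog.

# Remove the scratch directory when finished: rm -r "$workdir"
```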

The Stream Editor - sed

sed performs automatic, non-interactive editing of files. It is often used in scripts to search and replace patterns in text. By default it understands basic regular expressions.

Syntax for sed:

sed [options] 'command' [INPUTFILE] 

The input file is optional since sed also reads from standard input, that is, from redirections and pipes. Here are a few examples, assuming we are working on a file called MODIF.

Delete all commented lines:

Notice that the search pattern is placed between two slashes.
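A minimal sketch of such a deletion is given below, assuming that a commented line is one that starts with "#". The sample text stands in for the file MODIF.

```shell
# Hypothetical contents standing in for the file MODIF.
modif='# first comment
boot=/dev/hda1
# second comment
root=/dev/hda1'

# /^#/ selects the lines that start with "#"; d deletes them from the output.
printf '%s\n' "$modif" | sed '/^#/d'

# With a real file the command would read:
#   sed '/^#/d' MODIF
```

Only the output is affected. The input itself is left unchanged.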

Substitute /dev/hda1 with /dev/sdb3:

The s in the command stands for ‘substitute’. The g stands for “globally” and forces the substitution to be applied to every occurrence on each line, not just the first one. You can also restrict the lines on which the substitution takes place, using either line numbers or a regular expression match.
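The substitution can be sketched as follows. Because the pattern itself contains slashes, they are either escaped or a different delimiter is chosen. The sample lines are hypothetical.

```shell
# Hypothetical contents standing in for the file MODIF.
modif='boot=/dev/hda1 backup=/dev/hda1
root=/dev/hda1'

# The pattern contains slashes, so each one is escaped with a backslash.
printf '%s\n' "$modif" | sed 's/\/dev\/hda1/\/dev\/sdb3/g'

# Another delimiter character may follow s, which avoids the escaping.
printf '%s\n' "$modif" | sed 's|/dev/hda1|/dev/sdb3|g'

# Without g, only the first occurrence on each line is replaced.
printf '%s\n' "$modif" | sed 's|/dev/hda1|/dev/sdb3|'

# A line number in front of the command restricts it to that line (here line 2).
printf '%s\n' "$modif" | sed '2 s|/dev/hda1|/dev/sdb3|g'
```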

If the line contains the keyword KEY then substitute ‘:’ with ‘;’ globally:
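A pattern placed in front of the s command acts as an address, so the substitution runs only on lines that match it. A minimal sketch on hypothetical sample lines:

```shell
# Hypothetical contents standing in for the file MODIF.
modif='KEY a:b:c
other a:b:c'

# /KEY/ selects the lines; s/:/;/g is applied to those lines only.
printf '%s\n' "$modif" | sed '/KEY/ s/:/;/g'
# First line becomes "KEY a;b;c", the second line is printed unchanged.
```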

More Advanced sed

You can issue several commands at the command line, each introduced by -e. For example, (1) delete all blank lines, then (2) substitute ‘OLD’ with ‘NEW’ in the file MODIF:
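The two steps could be combined as in the sketch below, where the sample text stands in for the file MODIF.

```shell
# Hypothetical contents standing in for the file MODIF.
modif='OLD value

another OLD and OLD'

# The commands are applied to each line in the order given.
printf '%s\n' "$modif" | sed -e '/^$/d' -e 's/OLD/NEW/g'

# With a real file the command would read:
#   sed -e '/^$/d' -e 's/OLD/NEW/g' MODIF
```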

These commands can also be written to a file, say COMMANDS. Then each line is interpreted as a new command to execute (no quotes are needed).

The syntax to use this COMMANDS file is:

sed -f COMMANDS MODIF

This is much more compact than a very long command line!
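A COMMANDS file for the two steps described earlier (delete blank lines, then substitute ‘OLD’ with ‘NEW’) might look like the sketch below. The scratch directory and the file contents are assumptions made for the demonstration.

```shell
# Scratch directory holding the hypothetical files MODIF and COMMANDS.
workdir=$(mktemp -d)
printf '%s\n' 'OLD value' '' 'another OLD' > "$workdir/MODIF"

# One sed command per line, without surrounding quotes.
cat > "$workdir/COMMANDS" <<'EOF'
/^$/d
s/OLD/NEW/g
EOF

sed -f "$workdir/COMMANDS" "$workdir/MODIF"

# Remove the scratch directory when finished: rm -r "$workdir"
```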

Summary of options for sed
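Besides -e and -f, which appear in the examples of this section, the -n option is frequently used together with the p command. The sketch below is an illustration chosen here, on hypothetical sample lines.

```shell
# Hypothetical contents standing in for the file MODIF.
modif='# comment
root=/dev/hda1
boot=/dev/hda'

# -n suppresses the default output; p prints only the selected lines.
printf '%s\n' "$modif" | sed -n '/^#/p'    # prints: # comment
printf '%s\n' "$modif" | sed -n '2p'       # prints: root=/dev/hda1
```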

Used files, terms and utilities:


 * grep
 * egrep
 * fgrep
 * sed
 * regex(7)
