Patterns, Actions, and Variables

[pattern] { action }
pattern [{ action }]

Pattern Elements

Patterns control the execution of rules. A rule is executed when its pattern matches the current input record.

/regular expression/   a regular expression
expression             a single expression
begpat, endpat         a pair of patterns (a range)
BEGIN, END             special patterns to supply startup or cleanup actions
BEGINFILE, ENDFILE     special patterns to supply startup or cleanup actions on a per-file basis
empty                  the empty pattern matches every input record

Expressions as Patterns

awk '$1 == "li" { print $2 }' mail-list

Regular Expressions as Patterns

awk '$1 ~ /li/ { print $2 }' mail-list
awk '/edu/ && /li/' mail-list
awk '/edu/ || /li/' mail-list
awk '! /li/' mail-list

Specifying Record Ranges with Patterns

A range pattern is used to match ranges of consecutive input records.


cat myfile
no      first   100     65
on      user1   1       12
on      user2   4       345
off     user3   12      73
no      last    2       123

awk '$1 == "on", $1 == "off" { printf "%s %-3s %-3s\n", $2, $3, $4 }' myfile
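
A minimal sketch of how the range is selected, assuming the five sample records above are supplied inline rather than read from myfile (the variable names records and range_out are illustrative only):

```shell
# Sample records mirroring myfile above, inlined so the sketch is self-contained.
records='no first 100 65
on user1 1 12
on user2 4 345
off user3 12 73
no last 2 123'

# The range switches on at the first record whose $1 is "on" and switches off
# after the record whose $1 is "off"; both boundary records are included.
range_out=$(printf '%s\n' "$records" |
    awk '$1 == "on", $1 == "off" { print NR ": " $2 }')
printf '%s\n' "$range_out"
```

Records 2 through 4 are printed; the second "on" record falls inside a range that is already active, so it does not start a new one.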

The BEGIN and END Special Patterns

The BEGIN and END patterns supply startup and cleanup actions.
BEGIN and END rules must have actions.

Startup and cleanup actions

awk 'BEGIN { print "Analysis of \"li\"" }
   /li/ { ++n }
   END { print "\"li\" appears in", n, "records." }' mail-list

Input/output from BEGIN and END rules


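  • Be careful when referencing $0: in a BEGIN rule it is empty (and NF is 0) because no input has been read yet; in an END rule gawk keeps the last record, but some other awk implementations do not.
  • The next and nextfile statements are not allowed in BEGIN or END rules.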
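
A small sketch of what $0 and NF hold in each rule, assuming an awk that preserves the last record in END (gawk, mawk, and BWK awk do; the variable name begin_end_out is illustrative):

```shell
# No record has been read in BEGIN, so NF is 0 there.
# In END, NF still describes the last record; $0 is kept by gawk, mawk, and
# BWK awk, but that part is not guaranteed everywhere.
begin_end_out=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { print "BEGIN: NF=" NF }
         END   { print "END: NF=" NF ", $0=" $0 }')
printf '%s\n' "$begin_end_out"
```

Portable programs that need the last record in END usually save it in a variable from the main rule instead.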

The BEGINFILE and ENDFILE Special Patterns

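In a BEGINFILE rule, FILENAME is set to the name of the current file and FNR is set to zero. ERRNO is the empty string if the file was opened successfully; otherwise it describes the problem. The next statement is not allowed in BEGINFILE or ENDFILE rules.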

The Empty Pattern

An empty pattern matches every input record.

awk '{ print $1 }' mail-list

Using Shell Variables in Programs

  • Variable substitution via quoting:
printf "Enter search pattern: "; read pattern
Enter search pattern: ri
awk '$1 ~ '"/$pattern/"'{ nmatches++ }
END { print nmatches, "found."}' mail-list
1 found.
  • Variable assignment with -v: assign the shell variable’s value to an awk variable.
printf "Enter search pattern: "; read pattern
Enter search pattern: li
awk -v pat="$pattern" '$1 ~ pat { nmatches++ }
END { print nmatches, "found."}' mail-list
2 found.
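
One point worth illustrating: a value passed with -v and used with ~ is treated as a dynamic regexp, not as literal text. The sketch below assumes a hypothetical input value containing a metacharacter and contrasts ~ with index(), which does a plain substring search (all variable names are illustrative):

```shell
pattern='a.c'    # hypothetical user input; "." is a regexp metacharacter

# Used with ~, the value is a dynamic regexp: "." matches any character.
regex_hits=$(printf 'abc\na.c\nxyz\n' |
    awk -v pat="$pattern" '$0 ~ pat { n++ } END { print n + 0 }')

# index() treats the same value as a literal string instead.
literal_hits=$(printf 'abc\na.c\nxyz\n' |
    awk -v pat="$pattern" 'index($0, pat) { n++ } END { print n + 0 }')

echo "regexp: $regex_hits, literal: $literal_hits"
```

The -v form also keeps the shell value out of the program text, so characters such as "/" in the input cannot break the awk syntax the way they can with the quoting approach.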


Actions

An action consists of one or more awk statements.
The action may be omitted if a pattern is given; the matching record is then printed.
awk '/li/' mail-list

Types of statements:

  • Expressions
  • Control statements
  • Compound statements
  • Input statements
  • Output statements
  • Deletion statements

Control Statements in Actions

The if-else Statement

if (condition) then-body [else else-body]

awk '{ if ( $2 ~ /99/ ) print }' mail-list
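
The example above has no else part. A short sketch with both branches, using inline numbers instead of mail-list (the variable name parity_out is illustrative):

```shell
# A semicolon (or newline) must separate the then-body from "else"
# when the then-body is not enclosed in braces.
parity_out=$(printf '4\n7\n' |
    awk '{ if ($1 % 2 == 0) print $1, "is even"; else print $1, "is odd" }')
printf '%s\n' "$parity_out"
```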

The while Statement

while (condition)
    body
awk '{ i = 1 ; while ( i <= 3 ) { print $i ; i++ } }' inventory-shipped

The do-while Statement

do
    body
while (condition)
awk '{ i = 1 ; do { print $0 ; i++ } while ( i <= 5 ) }' inventory-shipped
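
The practical difference from while is that the do body always runs at least once. A minimal sketch, assuming a starting value for which the condition is already false (the variable name do_out is illustrative):

```shell
do_out=$(awk 'BEGIN {
    i = 10
    while (i < 5) { print "while body, i=" i; i++ }   # never runs
    do { print "do body, i=" i; i++ } while (i < 5)   # runs exactly once
}')
printf '%s\n' "$do_out"
```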

The for Statement

for (initialization; condition; increment)
    body
awk '{ for ( i = 1 ; i <= 3 ; i++ ) print $i }' inventory-shipped
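
Loops over fields usually run to NF rather than to a fixed count. A sketch with inline data that sums the fields with for and counts them with an equivalent while (the variable name loop_out is illustrative):

```shell
loop_out=$(printf '1 2 3\n10 20\n' |
    awk '{
        # for loop: add up every field of the current record
        sum = 0
        for (i = 1; i <= NF; i++)
            sum += $i

        # while loop: count the fields from NF back down to 1
        n = 0
        i = NF
        while (i >= 1) { n++; i-- }

        print sum, n
    }')
printf '%s\n' "$loop_out"
```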

The switch Statement

switch (expression) {
case value or regular expression:
    case-body
default:
    default-body
}

awk '{ switch ($1) {
case "Bill":
    print $1, "was here"
    break
case "Julie":
    print $1, "was here"
    break
} }' mail-list

The break Statement

The break statement jumps out of the innermost for, while, or do loop that encloses it.
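
A sketch of break in use, assuming positive integers on input: the loop stops at the first divisor found instead of trying the rest (the variable name break_out is illustrative):

```shell
break_out=$(printf '15\n7\n' |
    awk '{
        num = $1
        # try divisors 2, 3, ... and stop at the first one that divides num
        for (divisor = 2; divisor * divisor <= num; divisor++)
            if (num % divisor == 0)
                break
        if (num % divisor == 0)
            print "Smallest divisor of", num, "is", divisor
        else
            print num, "is prime"
    }')
printf '%s\n' "$break_out"
```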

The continue Statement

The continue statement is used only inside for, while, and do loops. It skips the rest of the loop body, causing the next cycle around the loop to begin immediately.
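
A minimal sketch: in a for loop, continue skips the remaining body but the increment expression still runs (the variable name continue_out is illustrative):

```shell
continue_out=$(awk 'BEGIN {
    for (x = 0; x <= 6; x++) {
        if (x == 3)
            continue        # skip the print for 3; x++ still runs
        print x
    }
}')
printf '%s\n' "$continue_out"
```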

The next Statement

The next statement forces awk to immediately stop processing the current record and go on to the next record.

awk '{ if ( $1 !~ /Bill|Julie/ ) next ; else print }' mail-list
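
The effect of next is easiest to see with more than one rule: once it runs, no later rule is tried for that record. A sketch with inline data (the variable name next_out is illustrative):

```shell
next_out=$(printf '# comment\ndata 1\n# another\ndata 2\n' |
    awk '/^#/ { next }          # comment line: skip every remaining rule
         { count++; print count ": " $0 }')
printf '%s\n' "$next_out"
```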

The nextfile Statement

The nextfile statement instructs awk to stop processing the current data file and move on to the next one.

awk '{ if ( $1 !~ /Bill|Julie/ ) print ; else nextfile }' mail-list
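
With a single file the effect of nextfile is hard to tell apart from exit. A sketch with two hypothetical temporary files, a.txt and b.txt, created only for this illustration:

```shell
tmpdir=$(mktemp -d)
printf 'a1\na2\na3\n' > "$tmpdir/a.txt"
printf 'b1\nb2\n' > "$tmpdir/b.txt"

# Print the first record of each file, then skip the rest of that file.
nextfile_out=$(cd "$tmpdir" &&
    awk '{ print FILENAME ": " $0; nextfile }' a.txt b.txt)
printf '%s\n' "$nextfile_out"
rm -rf "$tmpdir"
```

Very old awk implementations may lack nextfile; gawk and current versions of mawk and BWK awk provide it.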

The exit Statement

exit [return code]

awk '{ if ( $1 == "Bill" ) { print "Bill scares me" > "/dev/stderr" ; exit 1 } else print }' mail-list
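
Two details are worth a sketch: exit in a main rule stops reading input but still runs any END rule, and the return code becomes the exit status seen by the shell. Inline data is assumed; exit_out and exit_status are illustrative names:

```shell
exit_status=0
exit_out=$(printf 'a\nstop\nb\n' |
    awk '$1 == "stop" { exit 3 }     # stop reading input; END rules still run
         { print }
         END { print "cleanup" }') || exit_status=$?
printf '%s\n' "$exit_out"
echo "status: $exit_status"
```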

Predefined Variables

Built-in Variables That Control awk

Variables marked with # are specific to gawk.

BINMODE #      specifies use of binary mode for all I/O
CONVFMT        controls the conversion of numbers to strings in expressions (“%.6g”)
FIELDWIDTHS #  space-separated list of field widths for fixed-width input
FPAT #         regexp that tells gawk to create the fields based on regexp matches
FS             input field separator
IGNORECASE #   if nonzero or non-null, string comparisons and regexp matching are case-independent
LINT #         if true, provides warnings about dubious or nonportable constructs
OFMT           controls the conversion of numbers to strings for output by print (“%.6g”)
OFS            output field separator
ORS            output record separator
PREC #         working precision of arbitrary-precision floating-point numbers
ROUNDMODE #    rounding mode to use for arbitrary-precision arithmetic on numbers
RS             input record separator
SUBSEP         subscript separator for multidimensional array indices
TEXTDOMAIN #   text domain used for internationalization (“messages”)
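
A short sketch of two pairs that are easy to confuse, FS versus OFS and OFMT versus CONVFMT, using only POSIX awk variables and inline data (sep_out and fmt_out are illustrative names):

```shell
# FS splits the input on ":"; assigning $1 = $1 rebuilds the record with OFS.
sep_out=$(printf 'root:x:0\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }')

# OFMT applies when print outputs a number; CONVFMT applies when a number
# is converted to a string inside an expression (here, by concatenation).
fmt_out=$(awk 'BEGIN { OFMT = "%.2f"; CONVFMT = "%.3f"
                       x = 3.14159
                       print x          # formatted with OFMT
                       s = x ""         # converted with CONVFMT
                       print s }')
printf '%s\n%s\n' "$sep_out" "$fmt_out"
```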

Built-in Variables That Convey Information

ARGC        number of command-line arguments
ARGV        array of the command-line arguments
ARGIND #    index in ARGV of the current file
ENVIRON     associative array containing the values of the environment variables
ERRNO #     string describing the error when getline or close() fails
FILENAME    name of the current input file
FNR         current record number in the current file
FUNCTAB #   array of all functions in the program
NF          number of fields in the current input record
NR          number of input records awk has processed
PROCINFO #  array of information about the running awk program
RLENGTH     length of the substring matched by match()
RSTART      start index in characters of the substring matched by match()
RT #        input text that matched the text denoted by RS
SYMTAB #    array of all defined global variables and arrays in the program
awk -v foo=4 'BEGIN { SYMTAB["foo"] = "toto" ; print foo, ENVIRON["HOME"] }'
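
A sketch of the portable counters, assuming two hypothetical temporary files (one.txt and two.txt) created only for this illustration, followed by the side effects of match():

```shell
tmpdir=$(mktemp -d)
printf 'x y\nz\n' > "$tmpdir/one.txt"
printf 'p q r\n' > "$tmpdir/two.txt"

# NR counts records across all files; FNR restarts at 1 for each file.
counts_out=$(cd "$tmpdir" &&
    awk '{ print FILENAME, "NR=" NR, "FNR=" FNR, "NF=" NF }' one.txt two.txt)

# match() sets RSTART and RLENGTH as a side effect.
match_out=$(awk 'BEGIN { if (match("foobar", /o+/)) print RSTART, RLENGTH }')

printf '%s\n%s\n' "$counts_out" "$match_out"
rm -rf "$tmpdir"
```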

Using ARGC and ARGV

awk 'BEGIN { for ( i = 0 ; i < ARGC ; i++ )
printf "\tARGV[%d] = %s\n", i, ARGV[i] }' toto tata
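
ARGV can also be modified before input is read. A sketch, assuming two hypothetical temporary files (skip.txt and keep.txt): setting an element to the empty string in BEGIN makes awk ignore that argument.

```shell
tmpdir=$(mktemp -d)
printf 'skipped\n' > "$tmpdir/skip.txt"
printf 'kept\n' > "$tmpdir/keep.txt"

# ARGV[0] is the awk command name, so ARGV[1] is the first file operand.
argv_out=$(cd "$tmpdir" &&
    awk 'BEGIN { ARGV[1] = "" } { print }' skip.txt keep.txt)
printf '%s\n' "$argv_out"
rm -rf "$tmpdir"
```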