Expressions

Constants, Variables, and Conversions

Constant expressions

There are three kinds of constants: numeric, string, and regexp.

  • Numeric and string constants
  • Octal and hexadecimal numbers
awk 'BEGIN { printf "%d, %d, %d\n", 011, 11, 0x11 }'
  • Regular expression constants

Using regular expression constants

ls -1 | awk '{
if ($0 ~ /name/ || $0 ~ /phone/) print $0
}'
# is exactly equivalent to
ls -1 | awk '{
if (/name/ || /phone/) print $0
}'

Constant regular expressions are also used:

  • as the first argument for the gensub(), sub(), and gsub() functions
  • as the second argument of the match() function
  • as the third argument of the split() and patsplit() functions
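
A minimal sketch of regexp constants passed as function arguments, using only functions available in POSIX awk (gensub() and patsplit() are gawk-only and are left out). The sample strings are made up for illustration.

```shell
awk 'BEGIN {
    s = "name: phone"
    n = gsub(/[aeiou]/, "#", s)           # regexp constant as first argument
    print n, s
    match("foobar", /o+/)                 # regexp constant as second argument
    print RSTART, RLENGTH
    k = split("a1b22c", parts, /[0-9]+/)  # regexp constant as third argument
    print k, parts[1], parts[2], parts[3]
}'
```

This should print "4 n#m#: ph#n#", then "2 2", then "3 a b c".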

Variables

Variable assignments:

  • variable=text
  • -v variable=text
awk '{ print $n }' n=4 inventory-shipped n=2 mail-list

awk -v n=4 '{ print $n }' inventory-shipped n=2 mail-list
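
The two forms differ in when the assignment takes effect. A small sketch, assuming a POSIX awk and using /dev/null as a stand-in for a data file:

```shell
# Command-line assignment: processed when awk reaches it in the argument list,
# so the value is not yet visible in BEGIN.
awk 'BEGIN { print "BEGIN n=[" n "]" } END { print "END n=[" n "]" }' n=4 /dev/null

# -v assignment: made before the program starts, so BEGIN sees it.
awk -v n=4 'BEGIN { print "BEGIN n=[" n "]" }'
```

The first command should print "BEGIN n=[]" and "END n=[4]"; the second should print "BEGIN n=[4]".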

Conversion of strings and numbers

awk converts between strings and numbers automatically whenever the context demands it.

awk 'BEGIN { t = 2 ; r = 3
print (t r) + 3
print t + r + 3
print t r 3
}'
26
8
233

CONVFMT: the conversion format for numbers, "%.6g" by default.
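
A short sketch of CONVFMT at work. Concatenating with "" is used here only to force a number-to-string conversion; the variable names are illustrative.

```shell
awk 'BEGIN {
    a = 3.14159265
    b = 17
    s = a ""; print s     # converted with the default "%.6g"
    CONVFMT = "%.2f"
    s = a ""; print s     # converted with the new format
    s = b ""; print s     # integral values are converted as integers
}'
```

This should print 3.14159, 3.14, and 17.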

Locales can influence conversion

Locale decimal point versus a period

Feature      Default      --posix or --use-lc-numeric
%'g          Use locale   Use locale
%g           Use period   Use locale
Input        Use period   Use locale
strtonum()   Use period   Use locale
export POSIXLY_CORRECT=1
gawk 'BEGIN { printf "%g\n", 3.1415927 }'
3.14159
LC_ALL=fr_FR.utf8 gawk 'BEGIN { printf "%g\n", 3.1415927 }'
3,14159
echo 4,321 | gawk '{ print $1 + 1 }'
5
echo 4,321 | LC_ALL=fr_FR.utf-8 gawk '{ print $1 + 1 }'
5,321

Operators: Doing Something with Values

Arithmetic Operators

x ^ y     exponentiation
x ** y    exponentiation
- x       negation
+ x       unary plus
x * y     multiplication
x / y     division
x % y     remainder
x + y     addition
x - y     subtraction
cat grades
Pat   100 97 58
Sandy 84 72 93
Chris 72 92 89

awk '{ sum = $2 + $3 + $4 ; avg = sum / 3
     printf "%-5s => %d\n", $1, avg }' grades
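
A few arithmetic details worth checking by hand, sketched for a POSIX awk. The value 84.33 is roughly Chris's average from the grades file above.

```shell
awk 'BEGIN {
    print 7 / 2            # division is floating point
    print int(7 / 2)       # int() truncates toward zero
    print 7 % 3, -7 % 3    # the remainder takes the sign of the dividend
    print 2 ^ 3 ^ 2        # exponentiation groups right to left: 2 ^ (3 ^ 2)
    printf "%d\n", 84.33   # %d truncates, as in the grades example
}'
```

This should print 3.5, 3, "1 -1", 512, and 84.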

String Concatenation

There is only one string operation, concatenation, and it has no operator: write the expressions next to each other.

awk 'BEGIN { print -12 " " -24 }'
-12-24
awk 'BEGIN { print -12 " " (-24) }'
-12 -24
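
Because concatenation has no operator, arithmetic next to it is evaluated first. A minimal sketch, assuming a POSIX awk:

```shell
awk 'BEGIN {
    print "sum: " 1 + 2            # + binds tighter than concatenation
    s = ("a" "b") "c"              # adjacent expressions are joined
    print s
    x = 4
    print "x squared is " x * x    # * also binds tighter than concatenation
}'
```

This should print "sum: 3", "abc", and "x squared is 16". When in doubt, parenthesize the pieces being concatenated.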

Assignment Expressions

Multiple assignments:

x = y = z = 5

Operator                 Effect
lvalue += increment      add increment to the value of lvalue
lvalue -= decrement      subtract decrement from the value of lvalue
lvalue *= coefficient    multiply the value of lvalue by coefficient
lvalue /= divisor        divide the value of lvalue by divisor
lvalue %= modulus        set lvalue to its remainder by modulus
lvalue ^= power          raise lvalue to the power power
lvalue **= power         raise lvalue to the power power (common extension)
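
The operators above can be tried in one go. A small sketch with made-up variable names; **= is left out because it is an extension:

```shell
awk 'BEGIN {
    x = y = z = 5     # assignment groups right to left
    x += 2            # 7
    y *= 3            # 15
    z %= 3            # 2
    w = 2; w ^= 10    # 1024
    print x, y, z, w
}'
```

This should print "7 15 2 1024".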

Increment and Decrement Operators

++lvalue    increment lvalue, returning the new value as the value of the expression
lvalue++    increment lvalue, returning the old value of lvalue as the value of the expression
--lvalue    decrement lvalue, returning the new value as the value of the expression
lvalue--    decrement lvalue, returning the old value of lvalue as the value of the expression
awk 'BEGIN { max = 3
    for (i=1; i <= max; i++)
        printf "file_%.3d\n", i
}'
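
The difference between the prefix and postfix forms shows up only when the value of the expression is used. A minimal sketch:

```shell
awk 'BEGIN {
    i = 5
    a = i++     # a gets the old value 5, i becomes 6
    b = ++i     # i becomes 7, b gets the new value 7
    c = i--     # c gets the old value 7, i becomes 6
    print a, b, c, i
}'
```

This should print "5 7 7 6".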

Truth Values and Conditions

True and False in awk

true     any nonzero numeric value or any nonempty string value
false    any other value (zero or the null string, "")
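
A quick check of these rules, sketched for a POSIX awk. Note that the string constant "0" is nonempty and therefore true; the variable name unset is just an uninitialized variable chosen for illustration.

```shell
awk 'BEGIN {
    print "number 0:",       (0     ? "true" : "false")
    print "number -0.5:",    (-0.5  ? "true" : "false")
    print "empty string:",   (""    ? "true" : "false")
    print "string zero:",    ("0"   ? "true" : "false")   # the constant "0"
    print "unset variable:", (unset ? "true" : "false")
}'
```

Only "number -0.5" and "string zero" should report true.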

Variable Typing and Comparison Expressions

String type versus numeric type

UPPERCASE: operand type
lowercase: comparison type

          STRING    NUMERIC   STRNUM
STRING    string    string    string
NUMERIC   string    numeric   numeric
STRNUM    string    numeric   numeric
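
The table can be exercised with fields (which are STRNUM when they look like numbers) and constants. A minimal sketch, assuming a POSIX awk and the made-up input "10 9":

```shell
echo "10 9" | awk '{
    print ($1 < $2)      # STRNUM vs STRNUM: numeric, 10 < 9 is false
    print ("10" < "9")   # STRING vs STRING: string, "1" sorts before "9"
    print ($1 < "9")     # STRNUM vs STRING: string comparison
    print ($1 < 9)       # STRNUM vs NUMERIC: numeric comparison
}'
```

This should print 0, 1, 1, and 0.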

Comparison operators

Expression            Result
x < y                 true if x is less than y
x <= y                true if x is less than or equal to y
x > y                 true if x is greater than y
x >= y                true if x is greater than or equal to y
x == y                true if x is equal to y
x != y                true if x is not equal to y
x ~ y                 true if the string x matches the regexp denoted by y
x !~ y                true if the string x does not match the regexp denoted by y
subscript in array    true if the array array has an element with the subscript subscript
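
The in operator is the safe way to test for an element, because merely referencing a missing element creates it. A small sketch with a made-up array:

```shell
awk 'BEGIN {
    a["x"] = 1
    has_x = ("x" in a)
    has_y = ("y" in a)      # testing with in does not create a["y"]
    print has_x, has_y
    v = a["y"]              # a plain reference creates the element
    has_y = ("y" in a)
    print has_y
}'
```

This should print "1 0" and then "1".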

String comparison with POSIX rules

POSIX string comparison is performed based on the locale’s collating order.

awk 'BEGIN { printf("ABC < abc = %s\n",
                   ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
ABC < abc = TRUE

awk --posix 'BEGIN { printf("ABC < abc = %s\n",
                   ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
ABC < abc = FALSE

Boolean Expressions

"or"     ||    boolean1 || boolean2    true if boolean1 or boolean2 is true
"and"    &&    boolean1 && boolean2    true if both boolean1 and boolean2 are true
"not"    !     ! boolean               true if boolean is false
awk 'BEGIN { if (! ("HOME" in ENVIRON))
print "no home!" }'
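
Both && and || evaluate their right-hand operand only when needed, which makes guards like the following possible. A minimal sketch with a made-up variable n:

```shell
awk 'BEGIN {
    n = 0
    # && stops at the first false operand, so 10 / n is never evaluated
    if (n != 0 && 10 / n > 1)
        print "ratio above 1"
    else
        print "division skipped"
    # || stops at the first true operand
    if (n == 0 || 10 / n > 1)
        print "n is zero or ratio above 1"
}'
```

This should print "division skipped" and "n is zero or ratio above 1" without a division-by-zero error.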

Conditional Expressions

selector ? if-true-exp : if-false-exp

awk 'BEGIN { "whoami" | getline who_ami ; close("whoami")
     printf "%s\n%s\n", ( who_ami ~ /guillaume|guigui/ ? "guisam is here" : "no guisam here" ), "bye"
     exit }'

Function Calls

A function call is an expression: it returns a value that can be used wherever a value is allowed.

function(argument)

awk '{ print "The square root of", $1, "is", sqrt($1) }'

The command reads from standard input; type Ctrl-d to terminate.
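
A non-interactive variant, sketched with made-up input, that also shows function calls nested inside a larger expression:

```shell
printf '4\n9\n' | awk '{ print "The square root of", $1, "is", sqrt($1) }'

# Return values can feed other calls or be concatenated directly.
echo hello | awk '{ print toupper(substr($1, 1, 1)) substr($1, 2), length($1) }'
```

The first command should report square roots of 2 and 3; the second should print "Hello 5".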

Operator Precedence (How Operators Nest)

Operators in order of highest to lowest precedence
Operator                      Description
( … )                         grouping
$                             field reference
++ --                         increment, decrement
^ **                          exponentiation; these operators group right to left
+ - !                         unary plus, minus, logical "not"
* / %                         multiplication, division, remainder
+ -                           addition, subtraction
no operator                   string concatenation
< <= == != > >= >> | |&       relational and redirection
~ !~                          matching, nonmatching
in                            array membership
&&                            logical "and"
||                            logical "or"
?:                            conditional; this operator groups right to left
= += -= *= /= %= ^= **=       assignment; these operators group right to left
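
A few cases where the table matters in practice, sketched for a POSIX awk with made-up input:

```shell
awk 'BEGIN {
    print -2 ^ 2          # ^ binds tighter than unary minus: -(2 ^ 2)
    print 2 + 3 * 4       # * binds tighter than +
    print (2 + 3 == 5)    # + binds tighter than ==
}'

# $ binds tighter than -, so $NF-1 is "last field minus one",
# while $(NF-1) is "next-to-last field".
echo "10 20 30" | awk '{ print $NF-1, $(NF-1) }'
```

This should print -4, 14, and 1, followed by "29 20".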

Where You Are Makes a Difference

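Locales can affect regexp matching and string collation, date and time formats, and the decimal point character. They can also affect record splitting: when RS is a single character other than newline, setting "LC_ALL=C" in the environment gives much better performance when reading records.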