Regular Expressions

Use regular expressions

/regexp/

awk '/li/ { print $2 }' mail-list
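
The mail-list file is not reproduced here, so the sketch below feeds awk two made-up records instead. It only illustrates that a bare /regexp/ pattern is tested against the whole record; the sample names and numbers and the variable name are assumptions.

```shell
# Hypothetical two-record stand-in for mail-list.
matches=$(printf 'Amelia 555-5553\nBill 555-1675\n' |
  awk '/li/ { print $2 }')   # only the "Amelia" record contains "li"
echo "$matches"
```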

exp ~ /regexp/

awk '$1 ~ /^J/' inventory-shipped

exp !~ /regexp/

awk '$1 !~ /^J/' inventory-shipped

sub(/regexp/, replacement)

echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
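
To try the match operators without the inventory-shipped file, the following sketch uses three invented records whose first field is a month name. The data and variable names are assumptions for illustration only.

```shell
# Made-up records; the first field is a month name.
data='Jan 13
Feb 15
Jun 31'
starts_with_j=$(printf '%s\n' "$data" | awk '$1 ~ /^J/')    # field matches
others=$(printf '%s\n' "$data" | awk '$1 !~ /^J/')          # field does not match
# sub() replaces only the leftmost, longest match in the record.
replaced=$(echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }')
printf '%s\n' "$starts_with_j" "$others" "$replaced"
```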

Dynamic regexps

ls -l | awk 'BEGIN { digits_regexp = "[[:digit:]]+" } $5 ~ digits_regexp { print }'
ls -l | awk 'BEGIN { alpha_regexp = "[[:alpha:]]+" } $4 ~ alpha_regexp { print }'
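
Because a dynamic regexp is an ordinary string, it can also be passed in from the shell with -v. The sketch below assumes an illustrative variable name (re) and invented input. It also shows one common pitfall: a string constant is scanned once before it is used as a regexp, so a literal dot needs a doubled backslash.

```shell
# Pattern supplied from the shell; "re" is an illustrative name.
numeric=$(printf '42\nabc\n7x\n' | awk -v re='^[[:digit:]]+$' '$1 ~ re')
# Inside a string constant, "\\." becomes the regexp \. (a literal dot).
dotted=$(printf 'a.b\naxb\n' | awk 'BEGIN { re = "a\\.b" } $0 ~ re')
printf '%s\n' "$numeric" "$dotted"
```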

POSIX character classes

Class       Description
[:alnum:]   Alphanumeric characters
[:alpha:]   Alphabetic characters
[:blank:]   Space and TAB characters
[:cntrl:]   Control characters
[:digit:]   Numeric characters
[:graph:]   Characters that are both printable and visible [1]
[:lower:]   Lowercase alphabetic characters
[:print:]   Printable characters [2]
[:punct:]   Punctuation characters [3]
[:space:]   Space characters [4]
[:upper:]   Uppercase alphabetic characters
[:xdigit:]  Characters that are hexadecimal digits
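
A class name is only valid inside a bracket expression, so the digit class is written [[:digit:]], not [:digit:]. The sketch below, with invented input and variable names, combines classes with other characters and negates one.

```shell
# Lines that look like identifiers: a letter or underscore, then letters, digits, or underscores.
ids=$(printf 'foo_1\n9lives\n_tmp\n' |
  awk '/^[[:alpha:]_][[:alnum:]_]*$/')
# A class can be negated like any other bracket expression.
stripped=$(echo 'a1b22c' | awk '{ gsub(/[^[:digit:]]/, ""); print }')
printf '%s\n' "$ids" "$stripped"
```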

Ignore case

IGNORECASE = 0 | 1   (gawk only; any nonzero value makes regexp matching case-insensitive)

printf 'ab\ncd\n' | awk 'BEGIN { IGNORECASE = 1 } /A/ { print }'
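
IGNORECASE is a gawk extension, and other awk implementations treat it as an ordinary variable with no effect on matching. A portable alternative, sketched here with made-up input, is to fold the record to lowercase with tolower() and match against a lowercase regexp.

```shell
# tolower() is part of POSIX awk, so this does not depend on IGNORECASE.
hits=$(printf 'ab\ncd\n' | awk 'tolower($0) ~ /a/')
upper_hits=$(printf 'AB\ncd\n' | awk 'tolower($0) ~ /a/')
printf '%s\n' "$hits" "$upper_hits"
```

The record itself is printed unchanged; only the copy used for the comparison is lowercased.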

Footnotes

[1] A space is printable but not visible, whereas an 'a' is both.
[2] Characters that are not control characters.
[3] Characters that are not letters, digits, control characters, or space characters.
[4] Such as space, TAB, and formfeed, to name a few.