Arrays in awk
The Basics of Arrays
Introduction to Arrays
Arrays in awk are associative: every index is a string, and numeric indices are converted to strings automatically. Each array is a collection of pairs, each consisting of an index and its corresponding element value.
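As a small illustration of the string conversion (a sketch only; the array name a is made up for this example), a numeric subscript and its string form refer to the same element:

```shell
out=$(awk 'BEGIN {
    a[1] = "one"      # numeric subscript, stored under the string "1"
    a["1"] = "uno"    # same element, so this overwrites the value
    n = 0
    for (k in a)      # count the elements actually stored
        n++
    print n, a[1]
}')
printf '%s\n' "$out"
```

This should print 1 uno: one element, holding the value assigned last.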
Referring to an Array Element
array[index-expression]
To determine whether an element exists:
indx in array
awk 'BEGIN {
    tab["dog"] = "chien"
    tab["cat"] = "chat"
    tab["one"] = "un"
    # Test inside BEGIN so that awk does not wait for input on stdin
    if ("one" in tab)
        print tab["one"]
}'
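One reason to use in rather than comparing an element with the empty string is that merely referring to an element creates it. A minimal sketch of that side effect (the array name seen is hypothetical):

```shell
out=$(awk 'BEGIN {
    if ("x" in seen) print "before: present"
    else             print "before: absent"

    # Referring to seen["x"] creates the element with an empty value
    if (seen["x"] == "")
        print "reference evaluated"

    if ("x" in seen) print "after: present"
    else             print "after: absent"
}')
printf '%s\n' "$out"
```

The index is absent before the comparison and present after it, even though nothing was assigned.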
Assigning Array Elements
array[index-expression] = value
Basic Array Example
cat tab_ex
5 I am the Five man
2 Who are you? The new number two!
4 . . . And four on the floor
1 Who is number one?
3 I three you.
awk '{
    if ($1 > max)
        max = $1
    arr[$1] = $0
}
END {
    for (x = 1; x <= max; x++)
        if (x in arr)
            print arr[x]
}' tab_ex
1 Who is number one?
2 Who are you? The new number two!
3 I three you.
4 . . . And four on the floor
5 I am the Five man
Scanning All Elements of an Array
sudo journalctl | awk '
# Count journal entries per program name; in the default output format
# field 5 looks like "sshd[1234]:". gensub() is a gawk extension.
$5 ~ /(.+)(:|\[.+\])*:{0,1}/ {
    name = gensub(/(.+)(\[.+\].*)/, "\\1", "g", $5)   # drop the "[pid]:" suffix
    counter[name]++                                   # a new element starts out empty, i.e. 0
    if (max < length(name)) max = length(name)        # widest name, for column alignment
}
END {
    command = "sort -nk 2"
    for (i in counter)
        printf "%-" max "s %d\n", i, counter[i] | command
    close(command)
}'
awk '{
    for (i = 1; i <= NF; i++)
        used[$i] = 1
}
END {
    for (x in used) {
        if (length(x) > 4) {
            ++num_long_words
            print x
        }
    }
    print num_long_words, "words longer than 4 characters"
}' tab_ex
Using Predefined Array Scanning Orders with gawk
In gawk, assigning one of the following values to PROCINFO["sorted_in"] controls the order in which for (index in array) visits the elements:
Value | Scan order
"@unsorted" | Arbitrary order, which is the default awk behavior.
"@ind_str_asc" | Ascending order of indices, compared as strings.
"@ind_num_asc" | Ascending order of indices, with the indices forced to be treated as numbers.
"@val_type_asc" | Ascending order of element values, by the type assigned to the element (numbers, then strings, then subarrays).
"@val_str_asc" | Ascending order of element values. Scalar values are compared as strings.
"@val_num_asc" | Ascending order of element values. Scalar values are compared as numbers.
"@ind_str_desc" | Like "@ind_str_asc", but ordered from high to low.
"@ind_num_desc" | Like "@ind_num_asc", but ordered from high to low.
"@val_type_desc" | Like "@val_type_asc", but ordered from high to low.
"@val_str_desc" | Like "@val_str_asc", but ordered from high to low.
"@val_num_desc" | Like "@val_num_asc", but ordered from high to low.
awk 'BEGIN { PROCINFO["sorted_in"] = "@ind_str_asc" }
{ a[$1] = $1 }
END { for (i in a) print i }' mail-list
awk 'BEGIN { PROCINFO["sorted_in"] = "@ind_str_desc" }
{ a[$1] = $1 }
END { for (i in a) print i }' mail-list
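Using Numbers to Subscript Arrays
Array subscripts are always strings. A non-integer number used as a subscript is converted to a string with the current value of the predefined variable CONVFMT, so changing CONVFMT between storing and looking up an element can make the lookup miss it.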
awk 'BEGIN {
xyz = 12.153
data[xyz] = 1
if (xyz in data)
printf "%s is in data\n", xyz
else
printf "%s is not in data\n", xyz}'
12.153 is in data
awk 'BEGIN {
xyz = 12.153
data[xyz] = 1
CONVFMT = "%2.2f"
if (xyz in data)
printf "%s is in data\n", xyz
else
printf "%s is not in data\n", xyz}'
12.15 is not in data
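Integer-valued subscripts are not affected, because awk converts integral values to strings as integers regardless of CONVFMT. A small sketch of the difference (the variable names whole and frac are illustrative only):

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    whole = 12
    frac  = 12.153
    data[whole] = 1     # integral value: stored under "12"
    data[frac]  = 1     # non-integral: converted with CONVFMT, stored under "12.15"

    # Probe with string subscripts to see which keys really exist
    if ("12" in data)     print "12 found"
    if ("12.00" in data)  print "12.00 found"
    if ("12.15" in data)  print "12.15 found"
    if ("12.153" in data) print "12.153 found"
}')
printf '%s\n' "$out"
```

Only the keys 12 and 12.15 should be reported: the integer keeps its plain form, while the fraction is stored in whatever form CONVFMT produced at the time of the assignment.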
Using Uninitialized Variables as Subscripts
echo 'line 1
line 2
line 3' | awk '{ l[lines] = $0; ++lines }
END {
for (i = lines - 1; i >= 0; i--)
print l[i]
}'
line 3
line 2
line 1 is missing: on the first record lines is still uninitialized, so l[lines] stores the record under the null string rather than under 0, and l[0] prints as an empty line. Incrementing inside the subscript makes awk use the numeric value 0 for the first record:
echo 'line 1
line 2
line 3' | awk '{ l[lines++] = $0 }
END {
for (i = lines - 1; i >= 0; i--)
print l[i]
}'
line 3
line 2
line 1
The delete Statement
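A minimal sketch of the statement, assuming the standard form delete array[index] (the array freq and its contents are made up for illustration). Deleting an element removes the index itself, so a later in test fails:

```shell
out=$(awk 'BEGIN {
    freq["a"] = 1; freq["b"] = 2; freq["c"] = 3

    delete freq["b"]                 # remove a single element
    if ("b" in freq) print "b present"
    else             print "b deleted"

    n = 0
    for (k in freq) n++
    print n, "elements left"

    for (k in freq)                  # portable way to empty the whole array
        delete freq[k]
    n = 0
    for (k in freq) n++
    print n, "elements left"
}')
printf '%s\n' "$out"
```

gawk and most current awk implementations also accept delete freq, without a subscript, to clear the whole array in one statement.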
Multidimensional Arrays
awk simulates multidimensional arrays by converting each index to a string and concatenating the results into a single subscript. The separator placed between the indices is the value of the built-in variable SUBSEP.
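To illustrate the concatenation (a sketch; the array grid is hypothetical), the subscript pair 1, 2 is stored as the single string 1 SUBSEP 2, and membership is tested with a parenthesized index list:

```shell
out=$(awk 'BEGIN {
    grid[1, 2] = "x"
    if ((1, 2) in grid)        print "found with (1, 2)"
    if ((1 SUBSEP 2) in grid)  print "found with 1 SUBSEP 2"

    SUBSEP = ":"               # use a visible separator for new subscripts
    grid["a", "b"] = "y"
    if ("a:b" in grid)         print "found with a:b"
}')
printf '%s\n' "$out"
```

All three tests should succeed, since each form names the same underlying string key.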
Scanning Multidimensional Arrays
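One common way to scan such an array, sketched here under the assumption that no index contains SUBSEP itself, is to loop over the combined subscripts and recover the parts with split() (the array cell and its values are invented for the example):

```shell
out=$(awk 'BEGIN {
    cell[1, 1] = "a"; cell[1, 2] = "b"; cell[2, 1] = "c"
    for (combined in cell) {
        split(combined, part, SUBSEP)    # part[1] = first index, part[2] = second
        printf "%s,%s=%s\n", part[1], part[2], cell[combined]
    }
}' | sort)                               # for-in order is arbitrary, so sort the lines
printf '%s\n' "$out"
```

The pipe through sort is only there to make the output stable; awk itself does not promise any particular scan order.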
Arrays of Arrays
awk 'BEGIN {
    # True arrays of arrays and isarray() are gawk extensions
    a[1][2] = "a b c d"
    for (i in a)
        if (isarray(a[i]))
            for (j in a[i])
                print a[i][j]
        else
            print a[i]
}'
a b c d