Awk: comparing numbers in two files
Given two files, each with the first column serving as a key and the second column as a value, how can we obtain the percentage difference between the values of corresponding keys in the two files?
For example, suppose a.txt contains:
key1 10
key2 20
key3 30
and b.txt contains:
key1 5
key2 10
key3 20
We might be interested in the percentage difference between the values, relative to the value in a.txt, i.e.:
key1 50
key2 50
key3 33.3
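The following awk script does this. Keys that are not common to both files are left out of the comparison; lines of b.txt whose key is missing from a.txt are printed unchanged, prefixed with the file name: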
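awk '
BEGIN {
    # Load a.txt: key in column 1, value in column 2.
    # The "> 0" test avoids an endless loop if a.txt cannot be read.
    while ((getline < "a.txt") > 0) { arr[$1] = $2 }
    close("a.txt")
}
{
    # b.txt: report keys missing from a.txt, keep the rest for comparison.
    if (!($1 in arr))
        { print FILENAME ":" $0 }
    else arr2[$1] = $2
}
END {
    # Percentage difference relative to the value in a.txt.
    # Note that "for (key in arr)" visits keys in no particular order.
    for (key in arr)
        if ((key in arr2) && arr[key] != 0)
            { print key, (arr[key] - arr2[key]) * 100 / arr[key] }
}' b.txt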
To handle more than two files, the while loop in BEGIN can be changed to read each file except the last into its own array. The END block then checks that each key from the last file exists in all of those arrays before processing the values.
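The multi-file variant described above can be sketched as follows. This is only one possible reading, not a definitive implementation: the function name pct_diff_multi is hypothetical, the per-file arrays are simulated with a single array indexed by (file number, key), and the output format (one percentage per earlier file, each relative to that file's value, printed with one decimal) is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: pct_diff_multi FILE1 [FILE2 ...] LASTFILE
# Prints, for every key of LASTFILE present in all earlier files, the
# percentage difference between each earlier value and the last value.
pct_diff_multi() {
    awk '
    BEGIN {
        nref = ARGC - 2                 # every file except the last one
        for (i = 1; i <= nref; i++) {
            f = ARGV[i]
            while ((getline line < f) > 0) {
                split(line, fld)        # fld[1] = key, fld[2] = value
                val[i, fld[1]] = fld[2]
            }
            close(f)
            ARGV[i] = ""                # empty entries are skipped by awk
        }
    }
    {
        # Only the last file reaches this rule; remember the key order.
        if (!($1 in last)) order[++n] = $1
        last[$1] = $2
    }
    END {
        for (k = 1; k <= n; k++) {
            key = order[k]
            common = 1
            for (i = 1; i <= nref; i++)
                if (!((i, key) in val) || val[i, key] + 0 == 0) { common = 0; break }
            if (!common) continue       # key missing somewhere, or zero divisor
            out = key
            for (i = 1; i <= nref; i++)
                out = out " " sprintf("%.1f", (val[i, key] - last[key]) * 100 / val[i, key])
            print out
        }
    }' "$@"
}
```

Called as pct_diff_multi a.txt b.txt, it should reproduce the two-file comparison; with more arguments, each output line carries one percentage column per earlier file. Keys are printed in the order they appear in the last file.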
To find the number of columns in the last line of all files in a directory:
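#!/bin/bash
for file in *
do
    # Skip directories and other non-regular files
    [ -f "$file" ] || continue
    # In END, NF still holds the field count of the last record read
    awk -F"\t" 'END {print FILENAME, NF}' "$file"
done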