Tuesday, 24 November 2015

How to Calculate RANK Using AWK

Input File: F_Data_File.txt

10001|A1|TRANS|Y|10000
10002|A2|MED|Y|20000
10003|A3|FIN|N|10000
10004|A4|HR|Y|20000
10005|A5|CSR|N|50000
10006|A6|TRANS|Y|30000
10007|A7|FIN|N|40000

Desired: We want to give rank based on the salary of employees, same salary employee should get same rank.

10005|A5|CSR|N|50000|1
10007|A7|FIN|N|40000|2
10006|A6|TRANS|Y|30000|3
10004|A4|HR|Y|20000|4
10002|A2|MED|Y|20000|4
10003|A3|FIN|N|10000|5
10001|A1|TRANS|Y|10000|5

Solution:

sort -nr -k5 -t"|" F_Data_File.txt |\
awk -F"|" '{
   if(V_Temp != $5)
 {k++ #--*k=k+1
 }
 {V_Temp=$5}
 }
 {print $0 "|" k
}'

(or)

sort -nr -k5 -t"|" F_Data_File.txt | \
awk -F"|" '{
if(V_Temp!=$5)
{k++ #--*k=k+1
}
{V_Temp=$5
}
}
{print $0 FS k}';

(or) #--If you want a header in output then we need to initialize the same in BEGIN.

sort -nr -k5 -t"|" F_Data_File.txt | \
awk -F"|" 'BEGIN{print "EMPID|ENAME|DEPT|FLAG|SALA|RANK"}
          {
            if(V_Temp!=$5)
            {k++ #--*k=k+1
            }
            {V_Temp=$5
            }
         }
         {print $0 FS k}';

Note: Observe how rank is being populated, once using concatenation and one using inbuilt function FS.

Explanation:

1. sort -nr -k5 -t"|", sort function sort the data based on key field 5 that is salary column.
2. Once data is sorted same is piped to AWK command for further processing.
3. When AWK process first row, it will check whether TEMP column V_Temp is equal to $5 (Salary) as at initial level TEMP column will not hold anything, this condition will be TRUE and will set the value of k as 1.
4. For 2nd row V_TEMP will have salary of previous row hence if second record also have same salary then if condition will fail and k will not increment. It will set the same value of k to the second record as well. of if has different salary then it will increment the value of k to 2 and assign it to 2nd row.
5. Same will be followed till the end.

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...