Friday, 20 November 2015

How to Group By based on Multiple Columns

Input File:F_Data_File.txt

1001|A1|10|Y
1002|A2|20|Y
1003|A3|30|N
1004|A4|10|Y
1005|A5|10|N
1006|A6|20|N
1007|A7|20|N
1008|A8|30|N

Desired Output:

Dept_No Act_Flg==> Num_Of_Emp
----------------------
[10 N] ==>1
[20 Y] ==>1
[10 Y] ==>2
[30 N] ==>2
[20 N] ==>2

Solution:

$ awk 'BEGIN{FS="|"; print "Dept_No Act_Flg==> Num_Of_Emp\n----------------------\n"}
         {V_Dept_Emp_Cnt [$3" "$4] ++}
         END{for(k in V_Dept_Emp_Cnt) print "[" k,"]==>" V_Dept_Emp_Cnt[k]}' F_Data_File.txt

Explanation:

1.  In this case, Array Index for  Dept 10 will be look like as below
     V_Dept_Emp_Cnt[10 Y]
     V_Dept_Emp_Cnt[10 N]
2. For more explanation, please follow previous post How to Use Associative Array in AWK

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...