Monday, 16 November 2015

How To Look Up Data From One File in Another Using AWK

Input Data : F_Data_File1.txt

EMPID
10001
10002
10003
10004
10008

Input Data : F_Data_File2.txt

EMPID|ENAME|DEPT
10001|A1|TRANS
10002|A2|MED
10003|A3|FIN
10004|A4|HR
10005|A5|CSR
10006|A6|TRANS
10007|A7|FIN
10008|A8|HR

Desired Output:

EMPID|ENAME|DEPT
10001|A1|TRANS
10002|A2|MED
10003|A3|FIN
10004|A4|HR
10008|A8|HR

Solution:

$ awk 'NR==FNR {if (FNR != 1) A[$1]; next} FNR == 1 || ($1 in A)' FS="|" F_Data_File1.txt F_Data_File2.txt
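
To try the lookup end to end, the sketch below recreates the two sample files in a temporary directory and runs a spelled-out variant of the one-liner. The variable names `work` and `result` are chosen for this illustration only, and the `FNR == 1` rule that prints the header of the second file is an assumption about how the header in the desired output is meant to be produced.

```shell
#!/bin/sh
# Scratch directory for the sample data (name chosen for this illustration).
work=$(mktemp -d)

# Recreate F_Data_File1.txt: the lookup list of employee IDs.
printf '%s\n' EMPID 10001 10002 10003 10004 10008 > "$work/F_Data_File1.txt"

# Recreate F_Data_File2.txt: the employee details.
cat > "$work/F_Data_File2.txt" <<'EOF'
EMPID|ENAME|DEPT
10001|A1|TRANS
10002|A2|MED
10003|A3|FIN
10004|A4|HR
10005|A5|CSR
10006|A6|TRANS
10007|A7|FIN
10008|A8|HR
EOF

result=$(awk -F'|' '
    # First file only: remember every EMPID except the header, then skip ahead.
    NR == FNR { if (FNR != 1) A[$1]; next }
    # Second file: print the header row and every row whose EMPID was remembered.
    FNR == 1 || ($1 in A)
' "$work/F_Data_File1.txt" "$work/F_Data_File2.txt")

printf '%s\n' "$result"
rm -rf "$work"
```

Here `-F'|'` sets the field separator before any input is read, which has the same effect as passing `FS="|"` ahead of the file names.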

Debug mode:

$ awk '{printf("V_File_Name->[%s] V_Row_Counter->[%d] V_Row_Num_From_Curr_File->[%d] V_Data->[%s]\n", FILENAME, NR, FNR, $0)}' F_Data_File1.txt F_Data_File2.txt

Another Way of Writing:

$ awk '{print "V_File_Name->[" FILENAME "] " \
             "V_Row_Counter->[" NR "] " \
             "V_Row_Num_From_Curr_File->[" FNR "] " \
             "V_Data->[" $0 "]"
           }' F1 F2

Output Debug mode (from a run on files named F1 and F2, where F1 also carries ELIG_IND and ACTV_IND columns):

V_File_Name->[F1] V_Row_Counter->[1] V_Row_Num_From_Curr_File->[1] V_Data->[EMPID|ELIG_IND|ACTV_IND]   #--* Header row (FNR == 1), excluded by the FNR != 1 condition.
V_File_Name->[F1] V_Row_Counter->[2] V_Row_Num_From_Curr_File->[2] V_Data->[10001|Y|A]
V_File_Name->[F1] V_Row_Counter->[3] V_Row_Num_From_Curr_File->[3] V_Data->[10002|Y|A]
V_File_Name->[F1] V_Row_Counter->[4] V_Row_Num_From_Curr_File->[4] V_Data->[10003|Y|A]
V_File_Name->[F1] V_Row_Counter->[5] V_Row_Num_From_Curr_File->[5] V_Data->[10004|Y|A]
V_File_Name->[F1] V_Row_Counter->[6] V_Row_Num_From_Curr_File->[6] V_Data->[10008|N|I]
#--* Up to this point AWK has been reading the 1st file; from here it starts reading the 2nd file, so FNR resets to 1 while NR keeps counting.
V_File_Name->[F2] V_Row_Counter->[7] V_Row_Num_From_Curr_File->[1] V_Data->[EMPID|ENAME|DEPT]   #--* Header row of the 2nd file (FNR == 1 again).
V_File_Name->[F2] V_Row_Counter->[8] V_Row_Num_From_Curr_File->[2] V_Data->[10001|A1|TRANS]
V_File_Name->[F2] V_Row_Counter->[9] V_Row_Num_From_Curr_File->[3] V_Data->[10002|A2|MED]
V_File_Name->[F2] V_Row_Counter->[10] V_Row_Num_From_Curr_File->[4] V_Data->[10003|A3|FIN]
V_File_Name->[F2] V_Row_Counter->[11] V_Row_Num_From_Curr_File->[5] V_Data->[10004|A4|HR]
V_File_Name->[F2] V_Row_Counter->[12] V_Row_Num_From_Curr_File->[6] V_Data->[10005|A5|CSR]
V_File_Name->[F2] V_Row_Counter->[13] V_Row_Num_From_Curr_File->[7] V_Data->[10006|A6|TRANS]
V_File_Name->[F2] V_Row_Counter->[14] V_Row_Num_From_Curr_File->[8] V_Data->[10007|A7|FIN]
V_File_Name->[F2] V_Row_Counter->[15] V_Row_Num_From_Curr_File->[9] V_Data->[10008|A8|HR]

Note: The file used as the lookup must be the first file argument of the AWK command.
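
A small sketch of why the order matters, using cut-down stand-ins for the two files without header rows. The names `ids.txt`, `emp.txt`, `right_order` and `wrong_order` are made up for this illustration. With the lookup file first, the output is the matching detail rows; with the order swapped, the detail file becomes the lookup and AWK prints lines of the ID file instead.

```shell
#!/bin/sh
work=$(mktemp -d)

# Hypothetical stand-ins: a list of IDs and a pipe-delimited detail file.
printf '%s\n' 10001 10008 > "$work/ids.txt"
printf '%s\n' '10001|A1|TRANS' '10005|A5|CSR' '10008|A8|HR' > "$work/emp.txt"

# Lookup file first: IDs are stored, matching detail rows are printed.
right_order=$(awk 'NR==FNR {A[$1]; next} $1 in A' FS="|" "$work/ids.txt" "$work/emp.txt")

# Order swapped: detail rows are stored, matching ID lines are printed.
wrong_order=$(awk 'NR==FNR {A[$1]; next} $1 in A' FS="|" "$work/emp.txt" "$work/ids.txt")

printf 'lookup first:\n%s\n' "$right_order"
printf 'lookup second:\n%s\n' "$wrong_order"
rm -rf "$work"
```

The first run prints the two detail rows for 10001 and 10008; the second prints only the bare IDs, because the rows that reach `$1 in A` now come from the ID file.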

Explanation:

1. NR is a built-in AWK variable that holds the total number of rows read so far, counted across all input files.
2. FNR is a built-in AWK variable that holds the row number within the file currently being read; it resets to 1 at the start of each file.
3. FILENAME is a built-in AWK variable that holds the name of the file AWK is currently processing.
4. NR==FNR is therefore true only while the first file is being read, which is how the command tells the lookup file apart from the data file.
5. See the Debug mode output for how FNR and NR change from row to row.
6. The FNR != 1 condition keeps the header row (FNR == 1) of the lookup file out of the lookup array A.
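
The relationship between NR and FNR described above can be checked with a short sketch. It counts how many lines are read while NR equals FNR (the first file) and how many after that (the second file). The file names `a.txt` and `b.txt`, the counters `first` and `second`, and the variable `summary` are placeholders for this illustration.

```shell
#!/bin/sh
work=$(mktemp -d)

# Two throwaway input files: three lines and two lines.
printf '%s\n' x1 x2 x3 > "$work/a.txt"
printf '%s\n' y1 y2    > "$work/b.txt"

summary=$(awk '
    NR == FNR { first++; next }   # true only while the first file is being read
    { second++ }                  # every remaining line belongs to a later file
    END { printf "first=%d second=%d total_NR=%d\n", first, second, NR }
' "$work/a.txt" "$work/b.txt")

printf '%s\n' "$summary"   # prints: first=3 second=2 total_NR=5
rm -rf "$work"
```

Note that the NR==FNR test assumes the first file is not empty; with an empty first file the condition would also hold for the second file.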
