AWK Basic Syntax:
awk -F"<Delimiter>" 'BEGIN{INITIALIZATION}
{ACTION} #-- For every line in file
END{END BLOCK}' <FILE_NAME>
1. BEGIN BLOCK: Executes only once, before AWK reads any input; used for initialization.
2. EXECUTION BLOCK: The heart of the AWK command; it carries all the processing logic and runs for every line.
3. END BLOCK: Executes only once, after the last line has been processed.
4. FILE_NAME: The file to be processed by AWK.
AWK views a text file as records and fields. Fields are represented by $1 (first field), $2 (second field), and so on. $0 has a special meaning in AWK: it represents the complete record, that is, the line currently being processed. Please go through the earlier post on Inbuilt Variables in AWK.
Input File: F_Data_File.txt
EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
10001|A1|HR|10000|USA
10002|A2|FIN|20000|USA
10003|A3|NSS|30000|IND
10004|A4|SEC|40000|USA
10005|A5|TECH|50000|IND
10006|A6|TECH|60000|IND
10007|A7|TECH|70000|IND
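To follow along, the sample file can be recreated and the three-block structure tried out as below. This is only a minimal sketch: the "start" and "rows=" labels and the variable names summary and n are made up for illustration.

```shell
# Recreate the sample input file shown above.
cat > F_Data_File.txt <<'EOF'
EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
10001|A1|HR|10000|USA
10002|A2|FIN|20000|USA
10003|A3|NSS|30000|IND
10004|A4|SEC|40000|USA
10005|A5|TECH|50000|IND
10006|A6|TECH|60000|IND
10007|A7|TECH|70000|IND
EOF

# BEGIN runs once before any input is read, the middle block runs
# for every record, and END runs once after the last record.
summary=$(awk -F"|" 'BEGIN { print "start" }
NR > 1 { n++ }            # count data rows, skipping the header
END { print "rows=" n }' F_Data_File.txt)
echo "$summary"
```

With the eight-line file above, this prints "start" followed by "rows=7".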
1. Display complete file
$ awk '{print}' F_Data_File.txt
(or)
$ awk '{print $0}' F_Data_File.txt
2. Print only first field from the file
$ awk '{print $1}' F_Data_File.txt
Output:
EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
10001|A1|HR|10000|USA
10002|A2|FIN|20000|USA
10003|A3|NSS|30000|IND
10004|A4|SEC|40000|USA
10005|A5|TECH|50000|IND
10006|A6|TECH|60000|IND
10007|A7|TECH|70000|IND
This printed the complete file because we have not provided any delimiter. The default delimiter is whitespace (spaces and tabs), and the lines contain none, so each whole line is the first field. How do we give a delimiter? Let's see.
$ awk -F"|" '{print $1}' F_Data_File.txt
Note the -F switch, which is used to provide the field separator. The above command prints only the first field from the pipe-delimited file.
Output:
EMPID
10001
10002
10003
10004
10005
10006
10007
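The difference between the default separator and -F can be checked on a single line. A small sketch, where the inline input strings are made up for illustration and are not part of F_Data_File.txt:

```shell
# With no -F, awk splits on runs of spaces and tabs.
second=$(printf 'alpha   beta\tgamma\n' | awk '{print $2}')

# With -F"|", the pipe character is the separator instead.
first=$(printf '10001|A1|HR\n' | awk -F"|" '{print $1}')

echo "$second $first"
```

This should print "beta 10001": several spaces in a row count as one separator by default, while -F"|" splits on the pipe.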
3. Print 1st & 2nd fields separated by a space
$ awk -F"|" '{print $1 " " $2}' F_Data_File.txt
4. Print 1st & 2nd fields separated by <==>
$ awk -F"|" '{print $1 "<==>" $2}' F_Data_File.txt
4. How to initialize the field separator
$ awk 'BEGIN{FS="|";OFS=",";}{print $1 OFS $2 OFS $3 OFS $4 OFS $5}' F_Data_File.txt
Here we have initialized the input field separator (FS) and the output field separator (OFS).
The output will be comma (,) separated.
Output:
EMPID,EMPNAME,EMPDEPT,EMPSAL,LOCATION
10001,A1,HR,10000,USA
10002,A2,FIN,20000,USA
10003,A3,NSS,30000,IND
10004,A4,SEC,40000,USA
10005,A5,TECH,50000,IND
10006,A6,TECH,60000,IND
10007,A7,TECH,70000,IND
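Writing OFS between every field works, but awk can also insert it for you. The sketch below shows two standard behaviours on one sample row: a comma between print arguments emits OFS, and assigning a field to itself makes awk rebuild $0 with OFS. The variable names line, with_comma and rebuilt are only illustrative.

```shell
line='10001|A1|HR|10000|USA'

# A comma between print arguments is replaced by OFS in the output.
with_comma=$(printf '%s\n' "$line" | awk 'BEGIN{FS="|";OFS=","}{print $1, $2}')

# Assigning any field (even to itself) rebuilds $0 using OFS.
rebuilt=$(printf '%s\n' "$line" | awk 'BEGIN{FS="|";OFS=","}{$1=$1; print}')

echo "$with_comma"
echo "$rebuilt"
```

The first echo prints 10001,A1 and the second prints the whole row with commas, without listing all five fields by hand.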
5. Print row number in front of every record
$ awk 'BEGIN{FS="|";}{print NR ":" $0}' F_Data_File.txt
Output:
1:EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
2:10001|A1|HR|10000|USA
3:10002|A2|FIN|20000|USA
4:10003|A3|NSS|30000|IND
5:10004|A4|SEC|40000|USA
6:10005|A5|TECH|50000|IND
7:10006|A6|TECH|60000|IND
8:10007|A7|TECH|70000|IND
6. Calculate the number of fields in every row
$ awk 'BEGIN{FS="|";}{print "Number of Field in Row:" NR " is=>" NF}' F_Data_File.txt
Output:
Number of Field in Row:1 is=>5
Number of Field in Row:2 is=>5
Number of Field in Row:3 is=>5
Number of Field in Row:4 is=>5
Number of Field in Row:5 is=>5
Number of Field in Row:6 is=>5
Number of Field in Row:7 is=>5
Number of Field in Row:8 is=>5
7. Print first & Last field from the file
$ awk 'BEGIN{FS="|"}{print $1 "-->" $NF}' F_Data_File.txt
Output:
EMPID-->LOCATION
10001-->USA
10002-->USA
10003-->IND
10004-->USA
10005-->IND
10006-->IND
10007-->IND
8. Print only First line of the file
$ awk 'BEGIN{FS="|";}NR==1{print $0}' F_Data_File.txt
Output:
EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
9. Read file from 3rd row
$ awk 'BEGIN{FS="|";}NR>=3{print $0}' F_Data_File.txt
Output:
10002|A2|FIN|20000|USA
10003|A3|NSS|30000|IND
10004|A4|SEC|40000|USA
10005|A5|TECH|50000|IND
10006|A6|TECH|60000|IND
10007|A7|TECH|70000|IND
10. Read every line whose number is a multiple of 3, i.e. 3rd, 6th, 9th and so on
$ awk 'BEGIN{FS="|";}NR%3==0{print $0}' F_Data_File.txt
Output:
10002|A2|FIN|20000|USA
10005|A5|TECH|50000|IND
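The NR%3==0 pattern can be checked on a short input where the row number is visible in the text itself. A minimal sketch, with the labels r1 to r8 made up for illustration:

```shell
# Eight input lines; a pattern with no action prints the matching lines.
picked=$(printf '%s\n' r1 r2 r3 r4 r5 r6 r7 r8 | awk 'NR%3==0')
echo "$picked"
```

Only r3 and r6 are printed, matching rows 3 and 6 in the output above.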
11. Print LAST line of the File
$ awk 'END{print}' F_Data_File.txt
12. Count number of Rows in a file (like wc -l F_Data_File.txt)
$ awk -F"|" '{V_Row_Cnt++}END{print V_Row_Cnt}' F_Data_File.txt
(or)
$ awk 'END { print NR }' F_Data_File.txt
Output: 8
13. Calculate total salary of the employees:
$ awk -F"|" 'NR>1{V_Sum_Sal=$4 + V_Sum_Sal }END{print V_Sum_Sal}' F_Data_File.txt
Output: 280000
NR>1 is used because the first row is the header.
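The same running-total pattern extends naturally to an average. The sketch below is an addition for illustration, not part of the original recipe: it uses only the first three data rows, and the names sum and n are hypothetical. The n > 0 check avoids a division by zero when there are no data rows.

```shell
data='EMPID|EMPNAME|EMPDEPT|EMPSAL|LOCATION
10001|A1|HR|10000|USA
10002|A2|FIN|20000|USA
10003|A3|NSS|30000|IND'

# Skip the header, add up the 4th field and count the data rows.
result=$(printf '%s\n' "$data" | awk -F"|" 'NR>1 { sum += $4; n++ }
END { if (n > 0) print sum, sum / n }')
echo "$result"
```

For these three rows it prints 60000 20000 (total, then average).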
14. Count empty lines from the file.
$ awk 'NF==0{print NR":"}' F_Data_File.txt
(or)
$ awk '/^$/{print NR":"}' F_Data_File.txt
Output: Displays the line numbers of the empty lines, for example when lines 5 and 7 of the file are empty:
5:
7:
NF stands for number of fields; if the number of fields is 0, the row does not have any data. ^ represents the start of the line and $ represents the end of the line; if there is nothing between ^ and $, the line is empty.
To count the empty lines, increment a counter and print it once in the END block:
$ awk 'NF==0{V_Count++}END{print V_Count+0}' F_Data_File.txt
(or)
$ awk '/^$/{V_Count++}END{print V_Count+0}' F_Data_File.txt
Output: 2
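The NF and /^$/ tests are not quite the same: with the default field separator, a line holding only spaces has zero fields but is not empty for the regular expression. A small sketch with a made-up four-line input (data, an empty line, a spaces-only line, data):

```shell
# NF==0 matches both the empty line and the spaces-only line.
by_nf=$(printf 'a|b\n\n   \nc|d\n' | awk 'NF==0{c++} END{print c+0}')

# /^$/ matches only the truly empty line.
by_re=$(printf 'a|b\n\n   \nc|d\n' | awk '/^$/{c++} END{print c+0}')

echo "$by_nf $by_re"
```

This should print "2 1". The c+0 forces a numeric 0 to be printed when no line matched.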
15. Remove Empty lines from the file.
$ awk 'NF' F_Data_File.txt
$ awk 'NF > 0' F_Data_File.txt
$ awk 'NF!=0{print}' F_Data_File.txt
$ awk '!/^$/{print}' F_Data_File.txt
$ awk '/./{print}' F_Data_File.txt
Output: All the above commands display only the non-empty lines of the file.
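A quick way to confirm the behaviour is to feed awk a few lines with blanks in between. A minimal sketch with a made-up input; the variable name cleaned is only illustrative:

```shell
# Input has two data lines, each followed by an empty line.
cleaned=$(printf 'a|b\n\nc|d\n\n' | awk 'NF')
echo "$cleaned"
```

Only the two data lines survive. To save the result, redirect the output to a new file rather than to the file being read.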