The sort command is used to sort the lines of data in files. We can arrange the data in a specific order as required, and the switches that sort provides let us apply it to both string and numeric data. Let's see how sort works in Unix/Linux with the help of some examples. Let's start!
1. Sort string data
Let's assume you have a file, F_Input_File.txt, with the sample data below.
TRANS
FIN
MED
HR
CSR
FIN
$ sort F_Input_File.txt
Output: Sorts the file in alphabetical order and produces the output below:
CSR
FIN
FIN
HR
MED
TRANS
2. Sort numeric data
Let's assume you have a file, F_Input_File.txt, with the sample data below.
10
110
09
21
34
15
$ sort F_Input_File.txt
Output: Produces the output below, ordered by character (ASCII) value. But this is not what we are looking for.
09
10
110
15
21
34
This is numeric data, but sort has compared it as text, character by character, which is why 110 comes before 15. To sort numeric data, sort provides a separate switch, -n. The command below gives the required output.
$ sort -n F_Input_File.txt
Output:
09
10
15
21
34
110
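The difference between the two orderings can be reproduced with a short script. This is a minimal sketch: the temporary file stands in for F_Input_File.txt, and LC_ALL=C is an assumption added here so that the text comparison is done byte by byte regardless of the system locale.

```shell
# Recreate the sample data in a temporary file (stand-in for F_Input_File.txt).
input=$(mktemp)
printf '%s\n' 10 110 09 21 34 15 > "$input"

# Default sort: lines are compared as text, so "110" sorts before "15".
text_order=$(LC_ALL=C sort "$input")

# -n: lines are compared by numeric value, so 110 comes last.
numeric_order=$(LC_ALL=C sort -n "$input")

echo "Text order:"
echo "$text_order"
echo "Numeric order:"
echo "$numeric_order"

rm -f "$input"
```

Only the position of 110 changes between the two results, because it is the only value with a different number of digits.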
3. Reverse the sort output
We can sort the data in reverse order as well. In the previous case the data was sorted in ascending order. The commands below sort the data in descending order.
$ sort -n -r F_Input_File.txt
[or]
$ sort -r -n F_Input_File.txt
Output: Both provide the same output. -n forces a numeric sort and -r reverses the sort order.
110
34
21
15
10
09
By default sort arranges the data in ascending order; with the -r switch we can sort in descending order (similar to DESC in an Oracle SQL ORDER BY clause).
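Single-letter switches can also be combined into one argument, so -nr behaves like -n -r. The sketch below checks this on the same sample numbers; the temporary file is only a stand-in for F_Input_File.txt.

```shell
# Recreate the numeric sample data in a temporary file.
input=$(mktemp)
printf '%s\n' 10 110 09 21 34 15 > "$input"

# Separate switches and the combined form request the same ordering.
separate=$(sort -n -r "$input")
combined=$(sort -nr "$input")

echo "$combined"

rm -f "$input"
```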
4. Sort the data and return unique records
Let's assume you have a file, F_Input_File.txt, with the sample data below.
TRANS
FIN
MED
HR
CSR
FIN
$ sort -u F_Input_File.txt
Output: The command above returns the output below; the -u switch makes sort return only unique records.
CSR
FIN
HR
MED
TRANS
Explanation: sort first sorts the data, and -u makes sure that only one occurrence of each record appears in the output.
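For whole-line comparisons like this one, sort -u gives the same result as piping a plain sort into uniq. The following sketch compares the two on the sample data; uniq is not part of the original walkthrough and is used here only as a cross-check, and LC_ALL=C is an assumption to keep the ordering locale-independent.

```shell
# Recreate the sample data, which contains FIN twice.
input=$(mktemp)
printf '%s\n' TRANS FIN MED HR CSR FIN > "$input"

# Sort and drop duplicate lines in one step.
unique_sorted=$(LC_ALL=C sort -u "$input")

# Cross-check: sort first, then let uniq collapse adjacent duplicates.
via_uniq=$(LC_ALL=C sort "$input" | uniq)

echo "$unique_sorted"

rm -f "$input"
```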
5. Check whether the data is already sorted
Let's say we have the sorted data below in the input file.
CSR
FIN
HR
MED
TRANS
$ sort -c F_Input_File.txt
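Output: Prints nothing and exits with status 0, because the data is already sorted. The -c switch only checks the order and never sorts the data: if the file were not sorted, sort would report the first out-of-order line and exit with a non-zero status.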
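Because -c communicates through its exit status, it is mostly useful inside scripts. The sketch below captures the status for one sorted and one unsorted file; both files are made-up temporary files, and the diagnostic message for the unsorted case is discarded so that only the status is shown.

```shell
# One file that is already sorted and one that is not.
sorted_file=$(mktemp)
unsorted_file=$(mktemp)
printf '%s\n' CSR FIN HR MED TRANS > "$sorted_file"
printf '%s\n' TRANS FIN MED > "$unsorted_file"

# sort -c exits with 0 when the input is in order.
sorted_status=0
LC_ALL=C sort -c "$sorted_file" || sorted_status=$?

# For out-of-order input it prints a diagnostic to stderr and exits with 1.
unsorted_status=0
LC_ALL=C sort -c "$unsorted_file" 2>/dev/null || unsorted_status=$?

echo "sorted file:   exit status $sorted_status"
echo "unsorted file: exit status $unsorted_status"

rm -f "$sorted_file" "$unsorted_file"
```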
6. Redirect sort result into output file
String Sort
$ sort -o F_Output_File.txt F_Input_File.txt
Output: Sorts the data in F_Input_File.txt and writes the result to the output file F_Output_File.txt.
Numeric Sort
$ sort -n -o F_Output_File.txt F_Input_File.txt
Output: If the input file has numeric data, sort orders it by numeric value and then writes the result to the output file.
Reverse Sort
$ sort -n -r -o F_Output_File.txt F_Input_File.txt
Output: Here the output file will have the data sorted in reverse, that is, in descending order.
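One practical advantage of -o over shell redirection is that the output file may be the same as the input file, since sort reads all of its input before it writes the result. With `sort file > file`, by contrast, the shell empties the file before sort ever reads it. A minimal sketch, using a temporary file as a stand-in for a real data file:

```shell
# Create a file with unsorted numeric data.
data=$(mktemp)
printf '%s\n' 10 110 09 21 34 15 > "$data"

# Sort the file in place: -o names the same file that is being read.
sort -n -o "$data" "$data"

in_place=$(cat "$data")
echo "$in_place"

rm -f "$data"
```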
7. Sort the data on a specific field in a file
This is the most important part of sort, as we use it very often on data files. In real-world scenarios, we receive data files that have to be sorted on a specific column or on multiple columns. Let's see how to implement it.
SYNTAX:
$ sort -t<delimiter> -km,n File_name
-t specifies the field delimiter; by default, fields are separated by whitespace (spaces or tabs).
-k stands for key and specifies the column/field on which you want to sort.
m,n gives the window on which to sort: the key starts at field m and ends at field n (e.g. -k1,4 covers col1 to col4).
Let's assume we have a pipe (|) delimited data file as shown below.
A5|CSR|N|50000
A7|FIN|N|40000
A6|TRANS|Y|30000
A4|HR|Y|20000
A2|MED|Y|20000
A3|FIN|N|10000
A1|TRANS|Y|100000
If we perform a simple sort, it returns the output below:
$ sort F_Input_File.txt
A1|TRANS|Y|100000
A2|MED|Y|20000
A3|FIN|N|10000
A4|HR|Y|20000
A5|CSR|N|50000
A6|TRANS|Y|30000
A7|FIN|N|40000
Output: The data is sorted on the complete line, not on any specific field.
8. Sort based on salary (last field)
$ sort -t"|" -k4,4 F_Input_File.txt
Output: Oops, this is not what we are expecting; A1 should be the last record. Since the column we are sorting on is numeric, we have to use the -n switch for the correct result.
A3|FIN|N|10000
A1|TRANS|Y|100000
A2|MED|Y|20000
A4|HR|Y|20000
A6|TRANS|Y|30000
A7|FIN|N|40000
A5|CSR|N|50000
$ sort -t"|" -n -k4,4 F_Input_File.txt
[or]
$ sort -t"|" -nk4,4 F_Input_File.txt
Note: You must use the -km,n syntax even when you are sorting on a single column or field, for the reasons below:
1. -k2 means sort on a key that starts at the 2nd field and runs to the end of the line.
2. -k2,2 means sort the data based on the 2nd column only.
Output: Now we have the correct output.
A3|FIN|N|10000
A2|MED|Y|20000
A4|HR|Y|20000
A6|TRANS|Y|30000
A7|FIN|N|40000
A5|CSR|N|50000
A1|TRANS|Y|100000
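The difference between -k2 and -k2,2 described in the note above does not show up with the sample file, so the sketch below uses two made-up records where it does. Both lines share the same 2nd field; with -k2 the 3rd field becomes part of the key, while with -k2,2 the tie is broken by comparing the whole line. LC_ALL=C is an assumption to keep the comparison byte-based.

```shell
# Two hypothetical records with the same 2nd field.
input=$(mktemp)
printf '%s\n' 'a|x|2' 'b|x|1' > "$input"

# -k2: the key is "x|2" versus "x|1", so the line ending in 1 comes first.
open_key=$(LC_ALL=C sort -t'|' -k2 "$input")

# -k2,2: the key is "x" for both lines; the tie falls back to the whole line,
# so the line starting with "a" comes first.
closed_key=$(LC_ALL=C sort -t'|' -k2,2 "$input")

echo "With -k2:"
echo "$open_key"
echo "With -k2,2:"
echo "$closed_key"

rm -f "$input"
```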
9. Sort based on department (2nd field)
$ sort -t"|" -k2,2 F_Input_File.txt
Output:
A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A1|TRANS|Y|100000
A6|TRANS|Y|30000
10. Sort the data based on department and then list the employees by salary
$ sort -t"|" -k2,2 -k4,4 F_Input_File.txt
Output: This reads: first sort the data on the 2nd column and then on the 4th column. But the 4th column is numeric and we have not used the -n switch, so we do not get the expected output.
A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A1|TRANS|Y|100000
A6|TRANS|Y|30000
To get the correct output, use the command below.
$ sort -t"|" -k2,2 -k4,4n F_Input_File.txt
Output: This time we have the correct output as per the requirement.
A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A6|TRANS|Y|30000
A1|TRANS|Y|100000
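The two-key sort can be reproduced end to end with the script below. It is a sketch under a few assumptions: the temporary file stands in for F_Input_File.txt, LC_ALL=C pins the text comparison of the department names, and the delimiter is written as '|' in single quotes, which is equivalent to the "|" form used above.

```shell
# Recreate the pipe-delimited sample file.
input=$(mktemp)
printf '%s\n' \
  'A5|CSR|N|50000' 'A7|FIN|N|40000' 'A6|TRANS|Y|30000' 'A4|HR|Y|20000' \
  'A2|MED|Y|20000' 'A3|FIN|N|10000' 'A1|TRANS|Y|100000' > "$input"

# First key: department (field 2) as text.
# Second key: salary (field 4) as a number; the n applies to this key only.
by_dept_salary=$(LC_ALL=C sort -t'|' -k2,2 -k4,4n "$input")

echo "$by_dept_salary"

rm -f "$input"
```

Attaching n to the key (-k4,4n) rather than passing a global -n keeps the department comparison as plain text while the salary is compared by value.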
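Reverse the sorted data. Note that -r here reverses only the department order: the key -k4,4n carries its own option (n), so it does not inherit the global -r, and salaries stay in ascending order within each department.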
$ sort -t"|" -r -k2,2 -k4,4n F_Input_File.txt
Output:
A6|TRANS|Y|30000
A1|TRANS|Y|100000
A2|MED|Y|20000
A4|HR|Y|20000
A3|FIN|N|10000
A7|FIN|N|40000
A5|CSR|N|50000
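In the output above the departments are in descending order, but the salaries within TRANS and FIN are still ascending. If both should be descending, one option is to attach r to each key instead of relying on the global -r. The sketch below compares the two forms; the temporary file and LC_ALL=C are assumptions of this example rather than part of the original commands.

```shell
# Recreate the pipe-delimited sample file.
input=$(mktemp)
printf '%s\n' \
  'A5|CSR|N|50000' 'A7|FIN|N|40000' 'A6|TRANS|Y|30000' 'A4|HR|Y|20000' \
  'A2|MED|Y|20000' 'A3|FIN|N|10000' 'A1|TRANS|Y|100000' > "$input"

# Global -r: applies to -k2,2 (no options of its own) but not to -k4,4n.
global_r=$(LC_ALL=C sort -t'|' -r -k2,2 -k4,4n "$input")

# Per-key r: department descending, and salary descending within each department.
both_desc=$(LC_ALL=C sort -t'|' -k2,2r -k4,4nr "$input")

echo "Global -r:"
echo "$global_r"
echo "Per-key r:"
echo "$both_desc"

rm -f "$input"
```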
Conclusion: We often come across situations or requirements where we need data in a specific order, either ascending or descending. The sort command is very handy for arranging data in a specified order. Keep learning, keep practising!