Friday, 25 December 2015

10 Practical Examples of the sort Command in Unix/Linux

The sort command sorts the lines of a file. We can arrange the data in a specific order as required, and sort handles both string and numeric data through the switches it provides. Let's see how sort works in Unix/Linux with the help of some examples. Let's start!

1. Sort string data

Let's assume you have a file with the sample data below.
TRANS
FIN
MED
HR
CSR
FIN

$ sort F_Input_File.txt

Output: Sorts the file in alphabetical order and produces the output below:
CSR
FIN
FIN
HR
MED
TRANS

2. Sort numeric data

Let's assume you have a file with the sample data below.
10
110
09
21
34
15

$ sort F_Input_File.txt

Output: Produces the output below, comparing the lines character by character by their ASCII values. This is not what we are looking for: 110 lands before 15 because "11" sorts before "15".
09
10
110
15
21
34

This is numeric data, but sort has compared it as text. To sort numeric data, sort provides a separate switch, -n. The command below produces the required output.

$ sort -n F_Input_File.txt

Output:
09
10
15
21
34
110
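
To compare the default text comparison and -n side by side, a small sketch like the following can help. The scratch file created with mktemp and the LC_ALL=C setting are assumptions of this sketch, not part of the example above; LC_ALL=C only pins byte-wise ordering so that the result does not vary with the locale.

```shell
# Scratch file for the demo (hypothetical, created by mktemp).
input=$(mktemp)
printf '%s\n' 10 110 09 21 34 15 > "$input"

# Default comparison: lines are compared as text, character by character.
lexical=$(LC_ALL=C sort "$input" | paste -sd ' ' -)

# -n: lines are compared by their numeric value.
numeric=$(LC_ALL=C sort -n "$input" | paste -sd ' ' -)

echo "lexical: $lexical"   # lexical: 09 10 110 15 21 34
echo "numeric: $numeric"   # numeric: 09 10 15 21 34 110

rm -f "$input"
```

Only the position of 110 differs between the two results, which is exactly the symptom to look for when a numeric column has been sorted as text.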

3. Reverse the sort output

We can sort the data in reverse order as well. In the previous case the data was sorted in ascending order. The commands below sort the data in descending order.

$ sort -n -r F_Input_File.txt

[or]

$ sort -r -n F_Input_File.txt

Output: Both commands produce the same output. -n forces a numeric sort and -r reverses the sort order.
110
34
21
15
10
09

By default sort arranges the data in ascending order; with the -r switch it sorts in descending order (like DESC in the ORDER BY clause of Oracle SQL).

4. Sort the data and return unique records

Let's assume you have a file with the sample data below.
TRANS
FIN
MED
HR
CSR
FIN

$ sort -u F_Input_File.txt

Output: The command returns the output below; the -u switch makes sort return only unique records.
CSR
FIN
HR
MED
TRANS

Explanation: sort first sorts the data, and -u makes sure that only one occurrence of each record appears in the output.
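
For whole-line sorting like this, -u gives the same result as piping a plain sort through uniq. The sketch below checks that on the sample data; the scratch file and LC_ALL=C are assumptions of the sketch, used only to keep it self-contained and locale-independent.

```shell
# Scratch file holding the sample data (hypothetical, created by mktemp).
input=$(mktemp)
printf '%s\n' TRANS FIN MED HR CSR FIN > "$input"

# One step: sort and drop duplicate lines.
with_u=$(LC_ALL=C sort -u "$input" | paste -sd ' ' -)

# Two steps: sort, then let uniq drop adjacent duplicates.
with_uniq=$(LC_ALL=C sort "$input" | uniq | paste -sd ' ' -)

echo "sort -u     : $with_u"      # CSR FIN HR MED TRANS
echo "sort | uniq : $with_uniq"   # CSR FIN HR MED TRANS

rm -f "$input"
```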

5. Check whether the data is already sorted

Let's say we have the sorted data below in the input file.
CSR
FIN
HR
MED
TRANS

$ sort -c F_Input_File.txt

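Output: Prints nothing, because the data is already sorted. The -c switch only checks the order and never sorts anything: it exits with status 0 when the input is sorted, and otherwise reports the first out-of-order line and exits with a non-zero status.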
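
Because -c reports its result through the exit status, a script can use it to skip sorting when the file is already in order. The following is a minimal sketch of that idea; the function name sort_if_needed and the scratch files are hypothetical, and LC_ALL=C is an assumption made so the check and the sort use the same byte-wise ordering.

```shell
# Hypothetical helper: sort a file in place only if "sort -c" says it is unsorted.
sort_if_needed() {
    if LC_ALL=C sort -c "$1" 2>/dev/null; then
        echo "already sorted"
    else
        # -o may safely name the input file as the output file.
        LC_ALL=C sort -o "$1" "$1"
        echo "sorted now"
    fi
}

sorted=$(mktemp)
unsorted=$(mktemp)
printf '%s\n' CSR FIN HR MED TRANS > "$sorted"
printf '%s\n' TRANS FIN MED HR CSR FIN > "$unsorted"

first=$(sort_if_needed "$sorted")      # file was in order from the start
second=$(sort_if_needed "$unsorted")   # file is out of order, so it gets sorted
third=$(sort_if_needed "$unsorted")    # the previous call left it sorted

echo "$first / $second / $third"
rm -f "$sorted" "$unsorted"
```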

6. Redirect sort result into output file

String Sort

$ sort -o F_Output_File.txt F_Input_File.txt

Output: Sorts the data in F_Input_File.txt and writes the result to the output file F_Output_File.txt.

Numeric Sort

$ sort -n -o F_Output_File.txt F_Input_File.txt

Output: If the input file holds numeric data, sort first orders it by numeric value and then writes the result to the output file.

Reverse Sort

$ sort -n -r -o F_Output_File.txt F_Input_File.txt

Output: The output file holds the data sorted in reverse numeric order, that is, in descending order.
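
One practical use of -o is sorting a file in place. The output file may be the same as the input file, whereas "sort file > file" would let the shell truncate the file before sort reads it. A minimal sketch, assuming a scratch file from mktemp and LC_ALL=C for a locale-independent result:

```shell
# Scratch file for the demo (hypothetical, created by mktemp).
data=$(mktemp)
printf '%s\n' 10 110 09 21 34 15 > "$data"

# Sort numerically and write the result back to the same file.
LC_ALL=C sort -n -o "$data" "$data"

result=$(paste -sd ' ' "$data")
echo "$result"   # 09 10 15 21 34 110

rm -f "$data"
```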

7. Sort the data on a specific field in a file

This is the most important part of sort, as we are going to use it very often on data files. In real-world scenarios, we receive data files that have to be sorted on a specific column or on multiple columns. Let's see how to implement it.

SYNTAX:
$ sort -t<delimiter> -km,n File_name

-t sets the field delimiter; by default, fields are separated by blanks (spaces or tabs).
-k stands for key and gives the column/field on which you want to sort.
m,n defines the key window: the key starts at field m and ends at field n (e.g. 1,4 means from col1 to col4).

Let's assume we have a pipe (|) delimited data file as shown below.

A5|CSR|N|50000
A7|FIN|N|40000
A6|TRANS|Y|30000
A4|HR|Y|20000
A2|MED|Y|20000
A3|FIN|N|10000
A1|TRANS|Y|100000

If we perform a simple sort, it returns the output below:

$ sort F_Input_File.txt

A1|TRANS|Y|100000
A2|MED|Y|20000
A3|FIN|N|10000
A4|HR|Y|20000
A5|CSR|N|50000
A6|TRANS|Y|30000
A7|FIN|N|40000

Output: sort has ordered the data by the complete line, not by any specific field.

8. Sort based on salary (last field)

$ sort -t"|" -k4,4 F_Input_File.txt

Output: Oops, this is not what we expect. A1 should be the last record, but because the column we are sorting on is numeric, we have to use the -n switch to get the correct result.

A3|FIN|N|10000
A1|TRANS|Y|100000
A2|MED|Y|20000
A4|HR|Y|20000
A6|TRANS|Y|30000
A7|FIN|N|40000
A5|CSR|N|50000

$ sort -t"|" -n -k4,4 F_Input_File.txt
[or]
$ sort -t"|" -nk4,4 F_Input_File.txt


Note: You must use the -km,n syntax even if you are sorting on a single column or field, for the reasons below:

1. -k2 means the sort key runs from the 2nd field to the end of the line.
2. -k2,2 means the sort key is the 2nd field only.

Output: With the -n switch we now have the correct output.
A3|FIN|N|10000
A2|MED|Y|20000
A4|HR|Y|20000
A6|TRANS|Y|30000
A7|FIN|N|40000
A5|CSR|N|50000
A1|TRANS|Y|100000
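
The note about -k2 versus -k2,2 is easier to see on data where the two forms disagree. The three rows below are hypothetical (they are not the article's input file) and are chosen so that the fields after the 2nd one sort differently from the whole line; LC_ALL=C is likewise an assumption to keep the ordering byte-wise. When two keys compare equal, sort falls back to comparing the complete lines.

```shell
# Hypothetical three-row sample, written to a scratch file.
input=$(mktemp)
printf '%s\n' 'A1|FIN|Y|300' 'A2|FIN|N|200' 'A3|CSR|N|100' > "$input"

# -k2: the key is "field 2 to end of line", so FIN|N|200 sorts before FIN|Y|300.
open_key=$(LC_ALL=C sort -t'|' -k2 "$input" | cut -d'|' -f1 | paste -sd ' ' -)

# -k2,2: the key is field 2 only; the two FIN rows tie and are then
# ordered by comparing the complete lines, so A1 comes before A2.
closed_key=$(LC_ALL=C sort -t'|' -k2,2 "$input" | cut -d'|' -f1 | paste -sd ' ' -)

echo "-k2   : $open_key"     # A3 A2 A1
echo "-k2,2 : $closed_key"   # A3 A1 A2

rm -f "$input"
```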

9. Sort based on department (2nd field)

$ sort -t"|" -k2,2 F_Input_File.txt

Output:
A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A1|TRANS|Y|100000
A6|TRANS|Y|30000

10. Sort the data by department, then list the employees by salary

$ sort -t"|" -k2,2 -k4,4 F_Input_File.txt

Output: The command sorts the data on the 2nd column first and then on the 4th column. But the 4th column is numeric and we have not used the -n switch, so we do not get the expected output: within TRANS, 100000 is listed before 30000.

A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A1|TRANS|Y|100000
A6|TRANS|Y|30000

To get the correct output, use the command below, which applies the n option to the 4th-column key only.

$ sort -t"|" -k2,2 -k4,4n F_Input_File.txt

Output: This time we have the correct output as per the requirement.
A5|CSR|N|50000
A3|FIN|N|10000
A7|FIN|N|40000
A4|HR|Y|20000
A2|MED|Y|20000
A6|TRANS|Y|30000
A1|TRANS|Y|100000

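Reverse the sorted data. A global -r applies only to keys that have no ordering options of their own, so here it reverses the department order, while the salary key, which carries its own n option, stays in ascending order within each department.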

$ sort -t"|" -r -k2,2 -k4,4n F_Input_File.txt

Output:
A6|TRANS|Y|30000
A1|TRANS|Y|100000
A2|MED|Y|20000
A4|HR|Y|20000
A3|FIN|N|10000
A7|FIN|N|40000
A5|CSR|N|50000
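
Ordering options can also be attached to individual keys, which allows a different direction per column. The sketch below, using the same sample data, lists departments in ascending order with the highest salary first inside each department (-k4,4nr), and then repeats the reverse example for comparison. The scratch file and LC_ALL=C are assumptions of the sketch; only the first column (the record id) is printed to keep the output short.

```shell
# Scratch copy of the sample data (hypothetical file created by mktemp).
input=$(mktemp)
cat > "$input" <<'EOF'
A5|CSR|N|50000
A7|FIN|N|40000
A6|TRANS|Y|30000
A4|HR|Y|20000
A2|MED|Y|20000
A3|FIN|N|10000
A1|TRANS|Y|100000
EOF

# Department ascending; salary numeric and descending (n and r on the key itself).
dept_asc_sal_desc=$(LC_ALL=C sort -t'|' -k2,2 -k4,4nr "$input" | cut -d'|' -f1 | paste -sd ' ' -)

# Global -r reverses the department key; the salary key keeps its own "n" only.
dept_desc_sal_asc=$(LC_ALL=C sort -t'|' -r -k2,2 -k4,4n "$input" | cut -d'|' -f1 | paste -sd ' ' -)

echo "$dept_asc_sal_desc"   # A5 A7 A3 A4 A2 A1 A6
echo "$dept_desc_sal_asc"   # A6 A1 A2 A4 A3 A7 A5

rm -f "$input"
```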


Conclusion: We often come across requirements where data is needed in a specific order, either ascending or descending. The sort command is very handy for arranging data in a specified order. Keep learning, keep practising!
