Tuesday, 26 January 2016

How To Zip/Unzip A File In Unix/Linux

In this article we will see how to zip (compress) files and how to unzip (decompress) them. The article is divided into three categories, as explained below:

Category:1-> Switches which will be applied on Normal files (unzipped file)
Category:2-> Switches which will be applied on zipped files.
Category:3-> gunzip to unzip the files.

We will look at the switches listed below. You can get the full list from man gzip or gzip -h.

-d, --decompress  decompress
-h, --help        give this help
-k, --keep        keep (don't delete) input files
-l, --list        list compressed file contents
-q, --quiet       suppress all warnings
-r, --recursive   operate recursively on directories
-t, --test        test compressed file integrity
-v, --verbose     verbose mode
-1, --fast        compress faster
-9, --best        compress better

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<CATEGORY:1>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

ZIP The Input File

$ gzip F_Input_File1.txt

Output: Will create a zip file
F_Input_File1.txt.gz

We can zip multiple files together as shown below:
$ gzip F_Input_File1.txt F_Input_File2.txt F_Input_File3.txt

Keep The Original File With Zipped File

The original file is removed once the zip file is created. If you want to keep the original file alongside the zipped file, use the -k switch.

$ gzip -k F_Input_File1.txt

Output: Will zip the file and keep the original file as well.
$ ls
F_Input_File1.txt
F_Input_File1.txt.gz
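
Where the installed gzip does not offer the -k switch, writing the compressed data to standard output with -c should give the same result. The sketch below is only an illustration: the scratch directory and the file name notes.txt are made up for the example.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/notes.txt"   # hypothetical sample file

# -c writes the compressed data to stdout, so the input file is left alone.
gzip -c "$workdir/notes.txt" > "$workdir/notes.txt.gz"

# Both files should now exist side by side.
kept_original=no
[ -f "$workdir/notes.txt" ] && [ -f "$workdir/notes.txt.gz" ] && kept_original=yes
echo "original kept: $kept_original"

rm -rf "$workdir"
```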

What gzip is Doing

$ gzip -v F_Input_File1.txt

Output: The -v (verbose) switch reports what gzip is doing, printing each file name and the percentage reduction.

Zip All The Files Recursively

$ gzip -v -r /home/baba

Output: The -r switch has forced gzip to perform recursive compression(zip).
/home/baba/F_Input_File1.txt:    11.1% -- replaced with /home/baba/F_Input_File1.txt.gz
/home/baba/F_Input_File2.txt:    51.6% -- replaced with /home/baba/F_Input_File2.txt.gz
/home/baba/Imp/F_Input_File1.txt:       -12.5% -- replaced with /home/baba/Imp/F_Input_File1.txt.gz

Zip All The Files Recursively, Keeping Original Files

We can also zip recursively while keeping the original files.

$ gzip -v -k -r /home/baba

Output:
/home/baba/F_Input_File1.txt:    11.1% -- replaced with /home/baba/F_Input_File1.txt.gz
/home/baba/F_Input_File2.txt:    51.6% -- replaced with /home/baba/F_Input_File2.txt.gz
/home/baba/Imp/F_Input_File1.txt:       -12.5% -- replaced with /home/baba/Imp/F_Input_File1.txt.gz

$ ls -ltr
total 7
-rw-r--r--  1 baba None   52 Jan 26 12:50 F_Input_File1.txt.gz
-rw-r--r--  1 baba None   36 Jan 26 12:50 F_Input_File1.txt
-rw-r--r--  1 baba None  604 Jan 26 13:08 F_Input_File2.txt.gz
-rw-r--r--  1 baba None 1206 Jan 26 13:08 F_Input_File2.txt
drwxr-xr-x+ 1 baba None    0 Jan 26 13:24 Imp

How To Do Fast Zipping/Compression

$ gzip -v -1 F_Input_File1.txt
[OR]
$ gzip -v --fast F_Input_File1.txt

How To Do best Zipping/Compression

$ gzip -v -9 F_Input_File1.txt
[OR]
$ gzip -v --best F_Input_File1.txt

-1 or --fast indicates the fastest compression method (less compression) and -9 or --best indicates the slowest compression method (best compression). The default compression level is -6 (that is, biased towards high compression at expense of speed).

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<CATEGORY:2>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Unzip/Decompress The Zipped file

We have a separate command, gunzip, to unzip the files but we can do the same by using gzip itself. We will see gunzip as well.

$ gzip -d F_Input_File.txt.gz
$ gzip --decompress F_Input_File.txt.gz
$ gzip --uncompress F_Input_File.txt.gz

Output: All the 3 will unzip the file.

Unzip The Zipped Files Recursively

$ gzip -v -d -r /home/baba

Output: -d switch will unzip, -r switch will perform recursion, -v verbose
/home/baba/F_Input_File1.txt.gz:         11.1% -- replaced with /home/baba/F_Input_File1.txt
/home/baba/F_Input_File2.txt.gz:         51.6% -- replaced with /home/baba/F_Input_File2.txt
/home/baba/Imp/F_Input_File1.txt.gz:    -12.5% -- replaced with /home/baba/Imp/F_Input_File1.txt
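
The recursive round trip can be tried safely in a scratch directory before running it on real data. A minimal sketch, assuming made-up file names a.txt and Imp/b.txt:

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/Imp"
printf 'top level\n' > "$workdir/a.txt"       # hypothetical sample files
printf 'nested\n'    > "$workdir/Imp/b.txt"

# Compress everything under the directory, then count the .gz files.
gzip -r "$workdir"
compressed=$(find "$workdir" -type f -name '*.gz' | wc -l | tr -d ' ')

# Decompress everything again and count what came back.
gzip -d -r "$workdir"
restored=$(find "$workdir" -type f -name '*.txt' | wc -l | tr -d ' ')
leftover=$(find "$workdir" -type f -name '*.gz' | wc -l | tr -d ' ')

echo "compressed: $compressed, restored: $restored, leftover .gz: $leftover"
rm -rf "$workdir"
```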

Statistics Of Zipped File

$ gzip -v -l F_Input_File.txt.gz

Output: Will list the statistics of the zipped file.
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla 69cf461f Jan 26 12:50                  52                  36  11.1% a

How To Read A Zip/Compressed File

$ zcat F_Input_File1.txt.gz

Output: Will display the content of the file F_Input_File1.txt
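
Reading a compressed file this way can also be done with gzip -dc, which decompresses to standard output and leaves the .gz file in place. A small sketch with a made-up sample file:

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/sample.txt"   # hypothetical sample file
gzip "$workdir/sample.txt"                       # leaves only sample.txt.gz

# -d decompresses, -c sends the result to stdout instead of a file.
contents=$(gzip -dc "$workdir/sample.txt.gz")
echo "$contents"

# The compressed file is still there and no uncompressed copy was written.
still_compressed=no
[ -f "$workdir/sample.txt.gz" ] && [ ! -f "$workdir/sample.txt" ] && still_compressed=yes

rm -rf "$workdir"
```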

How To Check Whether File Is Compressed Correctly

$ gzip -v -t F_Input_File1.txt.gz

Output: -t switch is used to test the integrity.
F_Input_File1.txt.gz:    OK
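
In a script, the exit status of gzip -t is usually more convenient than the printed message. The sketch below simulates a damaged file by truncating a good one; the helper name check and the file names are made up for the example.

```shell
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 500; i++) print "record", i }' > "$workdir/data.txt"
gzip "$workdir/data.txt"

# Simulate a damaged file by keeping only the first 20 bytes.
head -c 20 "$workdir/data.txt.gz" > "$workdir/broken.gz"

# Hypothetical helper: gzip -t exits 0 for an intact file, non-zero otherwise.
check() {
    if gzip -t "$1" 2>/dev/null; then echo OK; else echo CORRUPT; fi
}

good_result=$(check "$workdir/data.txt.gz")
bad_result=$(check "$workdir/broken.gz")
echo "data.txt.gz: $good_result"
echo "broken.gz:   $bad_result"

rm -rf "$workdir"
```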

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<CATEGORY:3>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

We can use most of the gzip switches with gunzip as well, as the upcoming examples show.

Unzip/ Decompress The Zipped File using gunzip 

$ gunzip F_Input_File1.txt.gz
$ gunzip -v F_Input_File1.txt.gz

Output: Will unzip the file
F_Input_File1.txt

$ gunzip -d F_Input_File1.txt.gz
$ gunzip -v -d F_Input_File1.txt.gz

Output: The -d switch will decompress the zip file, as we have already seen with gzip.
F_Input_File1.txt

Unzip/ Decompress The Zipped File Recursively using gunzip 

$ gunzip -r /home/baba/
$ gunzip -v -r /home/baba/

Output: Will unzip/decompress all the zipped files recursively.

Unzip/ Decompress The Zipped File, keeping the original zipped files.

$ gunzip -k F_Input_File1.txt.gz F_Input_File2.txt.gz
$ gunzip -v -k F_Input_File1.txt.gz F_Input_File2.txt.gz

Output: Will unzip the files as well as keep the copy of zip files. 

Note: As you can see, gzip and gunzip are very similar in functionality as well as in usage. It is a good habit to use -v (the verbose option), as it gives a clear picture of what is going on.

Conclusion: Go with either gzip or gunzip. They do almost the same thing, so there is no need to use both; master one!!!

Keep Reading, Keep Learning, keep Sharing...!!!

Saturday, 23 January 2016

How To Insert, Append, Replace Lines In A File Using Sed

Sometimes we need to insert records or a line separator into a file, or to replace a complete line which matches some criteria.

Let us see how to do this using the sed command.

Input_File: F_Input_File.txt
1001|A00121|NOKIA-1100|Y

SYNTAX:
sed '<N/Pattern> [a/i/c] <Line To Be Appended /Inserted /Replaced>' F_Input_File.txt

N -> Line number on which the command acts
a -> Append after the matching line or line number
i -> Insert before the matching line or line number
c -> Change (replace) the matching line or line number
Pattern -> Pattern matching/regular expression


Append: Add A line After The Match 

$ sed '1 a *************************' F_Input_File.txt
[OR]
$ sed '/A00121/ a *************************' F_Input_File.txt

Output:
1001|A00121|NOKIA-1100|Y
*************************

Insert: Add A line Before The Match

$ sed '1 i -------------------------' F_Input_File.txt
[OR]
$ sed '/A00121/ i -------------------------' F_Input_File.txt

Output:
-------------------------
1001|A00121|NOKIA-1100|Y

Change: Change Matching Line(s)

$ sed '1 c ****PRODUCT OUT OF STOCK**' F_Input_File.txt
[OR]
$ sed '/A00121/ c ****PRODUCT OUT OF STOCK**' F_Input_File.txt

Output:
****PRODUCT OUT OF STOCK**
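
The one-line forms of a, i and c are a GNU sed convenience. A variant that should also work with other sed implementations puts a backslash after the command letter and the text on the next line. The sketch below runs all three against a scratch copy of the sample record; the scratch directory is an assumption of the example.

```shell
workdir=$(mktemp -d)
printf '1001|A00121|NOKIA-1100|Y\n' > "$workdir/F_Input_File.txt"

# Append: the text after "a\" and the newline is added after the match.
appended=$(sed '/A00121/a\
*************************' "$workdir/F_Input_File.txt")

# Insert: the text is added before the match.
inserted=$(sed '/A00121/i\
-------------------------' "$workdir/F_Input_File.txt")

# Change: the matching line is replaced by the text.
changed=$(sed '/A00121/c\
****PRODUCT OUT OF STOCK**' "$workdir/F_Input_File.txt")

printf '%s\n\n%s\n\n%s\n' "$appended" "$inserted" "$changed"
rm -rf "$workdir"
```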


All of the above methods display the changed data on the terminal, but the changes are not saved in the original file. To make the changes permanent, we need to use the -i switch, as shown below:

$ sed -i '/A00121/ a *************************' F_Input_File.txt
$ sed -i '/A00121/ i -------------------------' F_Input_File.txt
$ sed -i '/A00121/ c ***PRODUCT OUT OF STOCK**' F_Input_File.txt


And again, if your version of sed does not support the -i switch, we can always go back to basics and achieve the same using the redirection operator, as explained in the following post.

Keep Reading, Keep Learning, Keep Sharing..!!!

How To Update A File Keeping Original File Using Sed In Unix/Linux

In this article we will see how to update a file in place using sed, instead of using a temporary file or only displaying the data on the terminal. We will also see how to keep a backup of the original file, which can be used in case of any failure or for data retrieval. We will use the input file F_Input_File.txt. Let us see how to achieve it...!!

$ cat F_Input_File.txt
1001|A1|HR|5000|Y
1002|A2|FIN|6000|Y
1003|A3|HR|6000|Y
1004|A4|HR|7000|Y

$ sed -i .bak 's/FIN/HR/' F_Input_File.txt
[OR]
$ sed -i.bak 's/FIN/HR/' F_Input_File.txt


Output: It will produce two files in the same directory, one with the updated data (F_Input_File.txt) and one with the original data (F_Input_File.txt.bak).

$ ls
F_Input_File.txt
F_Input_File.txt.bak

$ cat F_Input_File.txt
1001|A1|HR|5000|Y
1002|A2|HR|6000|Y
1003|A3|HR|6000|Y
1004|A4|HR|7000|Y 

$ cat F_Input_File.txt.bak
1001|A1|HR|5000|Y
1002|A2|FIN|6000|Y
1003|A3|HR|6000|Y
1004|A4|HR|7000|Y

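Note: On GNU sed (the default on most Linux systems) the first command does not work, because the backup extension must be attached directly to the -i switch with no space; use the 2nd command there. BSD/macOS sed accepts both forms.

Most important: If no extension is given, no backup will be saved. (BSD/macOS sed needs an explicit empty extension for this: -i '')

$ sed -i 's/FIN/HR/' F_Input_File.txt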

Output: Will update the file F_Input_File.txt and there will be no file available with original data (backup file)

If the -i switch does not work in your version of Unix/Linux, we can always go back to basics and achieve the same by using redirection.

$ sed 's/FIN/HR/' F_Input_File.txt > F_Input_File.txt_tmp


The updated data is now available in F_Input_File.txt_tmp. We can rename the original file (F_Input_File.txt) to a backup (F_Input_File.txt.bkp) and then rename the temporary file (F_Input_File.txt_tmp) to the original name (F_Input_File.txt).

$ mv F_Input_File.txt F_Input_File.txt.bkp
$ mv F_Input_File.txt_tmp F_Input_File.txt
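
The three steps can be chained so that the files are swapped only when sed succeeds. This is a sketch on a scratch copy of the sample data, not a hardened script; the scratch directory and variable names are assumptions.

```shell
workdir=$(mktemp -d)
file="$workdir/F_Input_File.txt"
printf '1001|A1|HR|5000|Y\n1002|A2|FIN|6000|Y\n' > "$file"

# Swap the files only if sed succeeded, so a failed edit never replaces the data.
if sed 's/FIN/HR/' "$file" > "${file}_tmp"; then
    mv "$file" "$file.bkp"        # original becomes the backup
    mv "${file}_tmp" "$file"      # updated copy takes the original name
fi

updated_line=$(sed -n '2p' "$file")
backup_line=$(sed -n '2p' "$file.bkp")
echo "updated: $updated_line"
echo "backup:  $backup_line"

rm -rf "$workdir"
```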


Keep Reading, Keep Learning, Keep Sharing...!!!

How To Replace Text In Specific Line Using Sed In Unix/Linux

In this article we will see how to replace a string/word in a specific line(s) using sed command. We will use F_Input_File.txt for explanation:

Input File: F_Input_File.txt
1001|A1|JPN|5000|Y
1002|A2|USA|6000|Y
1003|A3|UK|6000|Y
1004|A4|GER|7000|Y
1005|A5|CAN|8000|Y

Replacement In Nth Line

$ sed '5 s/Y/N/' F_Input_File.txt

Output: Will replace Y with N in 5th line only.
1001|A1|JPN|5000|Y
1002|A2|USA|6000|Y
1003|A3|UK|6000|Y
1004|A4|GER|7000|Y
1005|A5|CAN|8000|N

Replacement In Specific Lines ( Between Mth, Nth)

$ sed '3,5 s/Y/N/' F_Input_File.txt
[OR]
$ sed '3,5 s/Y/N/g' F_Input_File.txt
[OR]
$ sed '3,$ s/Y/N/g' F_Input_File.txt


Output: All of them will replace Y with N from line number 3 to line number 5 only. $ represents the last line of the file, which is line 5 here.
1001|A1|JPN|5000|Y
1002|A2|USA|6000|Y
1003|A3|UK|6000|N
1004|A4|GER|7000|N
1005|A5|CAN|8000|N

Replacement Based On Pattern (or) Word/String Matching

$ sed '/USA/ s/Y/N/' F_Input_File.txt

Output: It will replace Y with N in the line(s) having USA.
1001|A1|JPN|5000|Y
1002|A2|USA|6000|N
1003|A3|UK|6000|Y
1004|A4|GER|7000|Y
1005|A5|CAN|8000|Y
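
A quick way to convince yourself that each address form touches only the intended lines is to count the lines that end in N afterwards. A minimal sketch on a scratch copy of the sample file; the helper count_n is made up for the example.

```shell
workdir=$(mktemp -d)
file="$workdir/F_Input_File.txt"
printf '%s\n' '1001|A1|JPN|5000|Y' '1002|A2|USA|6000|Y' '1003|A3|UK|6000|Y' \
    '1004|A4|GER|7000|Y' '1005|A5|CAN|8000|Y' > "$file"

# Hypothetical helper: counts lines whose last field became N.
count_n() { grep -c '|N$'; }

single=$(sed '5 s/Y/N/' "$file" | count_n)        # one line number
range=$(sed '3,$ s/Y/N/' "$file" | count_n)       # line 3 to the last line
pattern=$(sed '/USA/ s/Y/N/' "$file" | count_n)   # lines matching a pattern

echo "single: $single, range: $range, pattern: $pattern"
rm -rf "$workdir"
```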


Note: None of the solutions listed above change the data in the file; sed only displays the updated data on the console. If you want to update the data in the file, redirect the output to a temp file and use that file.

Keep Reading, Keep Learning, Keep Sharing...!!

Friday, 22 January 2016

Split Nth Column of A CSV File Into Multiple Columns

Scenario 1: Where Each Column Is Enclosed With In The Double Quotes(")

Input File: F_Input_File.txt
"1001/A1/HR","Developer/Unix","50000/USA"

Output File: F_Output_File.txt
"1001/A1/HR","Developer/Unix","50000","USA"

We want to break the 3rd column of the CSV (comma-delimited) file into 2 columns, so the output file will have 4 columns (initially it had 3).

$ awk -F"," 'BEGIN{OFS=","}{ gsub("/","\",\"",$3);}1' F_Input_File.txt > F_Output_File.txt

[OR]

$ awk -F"," 'BEGIN{OFS=","}{ gsub("/","\",\"",$3);print}' F_Input_File.txt > F_Output_File.txt

[OR]

$ awk -F"," 'BEGIN{OFS=","}{ gsub("/","\",\"",$3);print $0}' F_Input_File.txt > F_Output_File.txt

Output:
"1001/A1/HR","Developer/Unix","50000","USA"

Explanation:
 
      SYNTAX: gsub("<reg_exp/ToBeSearched>","<Replacement>","<String>")
  • The gsub function replaces every occurrence of forward slash (/) with a comma (,) in the 3rd column ($3).
  • Because the newly created columns must be enclosed in double quotes ("), the replacement text itself contains double quotes, and each of them is escaped with a backslash (\) in the 2nd argument of gsub. Without the escape characters, awk would see extra arguments and gsub would fail with an invalid number of arguments error. The outer double quotes are the regular string delimiters.
  • 1 prints the complete line, and so do print and print $0, as shown in the 3 different ways.
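
Since the title speaks of the Nth column, the column number can be passed in as a variable instead of being hard-coded. A sketch under that assumption; the shell variable col is made up for the example, and like the commands above it will misbehave if a quoted field itself contains a comma.

```shell
line='"1001/A1/HR","Developer/Unix","50000/USA"'
col=3   # hypothetical parameter: the column to split

# -v passes the shell value into awk; $col then refers to that field.
result=$(printf '%s\n' "$line" |
    awk -F',' -v col="$col" 'BEGIN { OFS = "," } { gsub("/", "\",\"", $col) } 1')

echo "$result"
```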

Scenario 2: Where Each Column Is Not Enclosed With In The Double Quotes(")

Input File: F_Input_File.txt
1001/A1/HR,Developer/Unix,50000/USA

Output File: F_Output_File.txt
1001/A1/HR,Developer/Unix,50000,USA

$ awk -F"," 'BEGIN{OFS=","}{gsub("/",",",$3);}1' F_Input_File.txt > F_Output_File.txt

[OR]

$ awk -F"," 'BEGIN{OFS=","}{gsub("/",",",$3);print}' F_Input_File.txt > F_Output_File.txt

[OR]

$ awk -F"," 'BEGIN{OFS=","}{ gsub("/",",",$3);print $0}' F_Input_File.txt > F_Output_File.txt

Output:
1001/A1/HR,Developer/Unix,50000,USA

Explanation: 
We have used the same logic. The only difference is that the backslash (\) escape character is not used in the 2nd parameter of gsub, as this time we do not need to enclose our new columns in double quotes (").

Keep Reading, Keep Learning, Keep Sharing.....!!!

Tuesday, 19 January 2016

How To Convert Fixed Width File To Delimited Using SED In Unix/Linux

In this article, we will see how to convert a fixed width file into a delimited file using the sed command. We will see step by step how sed performs this operation. Please go through the post How To Use & In Sed Command In Unix/Linux first, if you have not already.
We have a file F_Input_File.txt which has fixed width data, and we have to split this data across 5 columns:
Col1: From 1 to 4
Col2: From 5 to 6
Col3: From 7 to 8
Col4: From 9 to 13
Col5: From 14 to 16
Input File: F_Input_File.txt
1001A1HR50000USA
Solution:
$ sed -e 's/./&,/4' -e 's/./&,/7' -e 's/./&,/10' -e 's/./&,/16' F_Input_File.txt
Output:
1001,A1,HR,50000,USA
What is that? What exactly has happened? Let’s understand it.
Explanation: 
1. We can execute multiple sed commands using the -e switch; each one works on the result of the previous one.
2. Dot (.) matches any single character.

1st -e switch will read "1001A1HR50000USA" as input:
$ sed -e 's/./&,/4'
What is happening here?
1.   Dot (.) matches one character at a time, and the occurrence flag 4 tells sed to replace only the 4th match, that is, the 4th character.
If you remember the syntax:
sed 's/reg_exp/replacement/[occurrence]' F_Input_File
Occurrence  1 2 3 4
Data        1 0 0 1
REG_EXP     . . . .

     2.  & will hold the matched text, which is only the 4th character: 1
          Result of REG_EXP (.) = 1
          Replacement (&,) = 1,
          After the 1st -e switch record will be: 1001,A1HR50000USA
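
To check what & actually holds for a numbered occurrence, wrap it in brackets instead of adding a comma. A tiny sketch using the sample record; the variable names are made up for the example.

```shell
record='1001A1HR50000USA'

# Brackets go around exactly the text that & stands for.
marked=$(printf '%s\n' "$record" | sed 's/./[&]/4')
echo "$marked"

# The same substitution with a comma, as in the first -e step.
step1=$(printf '%s\n' "$record" | sed 's/./&,/4')
echo "$step1"
```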

2nd -e switch will receive input as "1001,A1HR50000USA"
$ sed -e 's/./&,/7'
1.   Dot (.) will match the 7th character. The comma added by the 1st step is counted as well.
Occurrence  1 2 3 4 5 6 7
Data        1 0 0 1 , A 1
REG_EXP     . . . . . . .

2.   & will hold the 7th character: 1
Result of REG_EXP (.) = 1
Replacement (&,) = 1,
After the 2nd -e switch record will be: 1001,A1,HR50000USA

3rd -e switch will receive input as "1001,A1,HR50000USA"
$ sed -e 's/./&,/10'
1.   Again, dot (.) will match the 10th character.
Occurrence  1 2 3 4 5 6 7 8 9 10
Data        1 0 0 1 , A 1 , H R
REG_EXP     . . . . . . . . . .

2.   & will hold the 10th character: R
Result of REG_EXP (.) = R
Replacement (&,) = R,
After the 3rd -e switch record will be: 1001,A1,HR,50000USA

4th -e switch will receive input as "1001,A1,HR,50000USA"
$ sed -e 's/./&,/16'
1.  Dot (.) will match the 16th character.
Occurrence   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Data         1  0  0  1  ,  A  1  ,  H  R  ,  5  0  0  0  0
REG_EXP      .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

 2. & will hold the 16th character: 0
Result of REG_EXP (.) = 0
Replacement (&,) = 0,
After the 4th -e switch record will be: 1001,A1,HR,50000,USA

Exactly what we want…!!!
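
As an alternative to counting shifted positions, the column widths (4, 2, 2, 5, 3) can be written directly as capture groups in a single substitution. This is a different technique from the & approach above, shown only as a sketch; it assumes a sed that supports -E for extended regular expressions.

```shell
record='1001A1HR50000USA'

# Each (.{n}) group captures one fixed-width column; \1..\5 put them back with commas.
delimited=$(printf '%s\n' "$record" |
    sed -E 's/^(.{4})(.{2})(.{2})(.{5})(.{3})$/\1,\2,\3,\4,\5/')

echo "$delimited"
```

With this form the widths are visible in one place, and no earlier comma changes the position of a later one.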
Conclusion: We have already seen the same functionality using the awk command; you may also check out How to Convert Fixed Width File to Delimited Using AWK. & is a very powerful feature of the sed command for formatting data.
Keep Reading, Keep Learning, Keep Sharing...!!!