How To Convert Fixed Width File To Delimited Using SED In Unix/Linux

January 19, 2016

In this article, we will see how to convert a fixed-width file into a delimited file using the sed command, and go step by step through how sed performs this operation. If you have not already, please read the post How To Use & In Sed Command In Unix/Linux first.

We have a file, F_Input_File.txt, which contains fixed-width data. We have to split each record into 5 columns:

Col1: From 1 to 4

Col2: From 5 to 6

Col3: From 7 to 8

Col4: From 9 to 13

Col5: From 14 to 16
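The occurrence numbers used in the solution can be worked out from this layout. Each comma that has already been inserted pushes the next column end one position to the right, so the number to use is the column's end position plus the count of commas added so far. Here is a minimal sketch of that arithmetic; the variable names (widths, end, commas, occurrences) are made up for illustration and are not part of the original solution:

```shell
# Column widths taken from the layout above (4, 2, 2, 5, 3).
widths="4 2 2 5 3"

end=0        # end position of the current column in the original record
commas=0     # commas already inserted by earlier expressions
occurrences=""

for w in $widths; do
    end=$((end + w))
    # Original end position, shifted right by one per comma inserted so far.
    occurrences="$occurrences $((end + commas))"
    commas=$((commas + 1))
done

occurrences=${occurrences# }   # drop the leading space
echo "$occurrences"            # prints: 4 7 10 16 20
```

The first four numbers (4, 7, 10, 16) are the ones that appear in the sed command. The last one (20) marks the end of the record, so no comma is needed there.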

Input File: F_Input_File.txt

1001A1HR50000USA

Solution:

$ sed -e 's/./&,/4' -e 's/./&,/7' -e 's/./&,/10' -e 's/./&,/16' F_Input_File.txt

Output:

1001,A1,HR,50000,USA
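If you want to try this without touching a real file, the steps can be reproduced in a scratch directory. This is only a sketch: the temporary directory and the result variable are assumptions added for the demo.

```shell
# Create a scratch copy of the sample file so real data is not touched.
tmpdir=$(mktemp -d)
printf '%s\n' '1001A1HR50000USA' > "$tmpdir/F_Input_File.txt"

# Same four expressions as in the solution above.
result=$(sed -e 's/./&,/4' -e 's/./&,/7' -e 's/./&,/10' -e 's/./&,/16' \
    "$tmpdir/F_Input_File.txt")
echo "$result"    # prints: 1001,A1,HR,50000,USA

# Clean up the scratch directory.
rm -r "$tmpdir"
```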

What is that? What exactly has happened? Let’s understand it.

Explanation:

1. We can execute multiple sed commands using the -e switch. They run in order, and each one works on the result of the previous one.

2. dot (.) matches any single character.
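A quick way to check both points is to replace the matched character with a visible marker. The sketch below uses X and Y as markers; the variable names are only for the demo.

```shell
record='1001A1HR50000USA'

# Without a numeric flag, s/./X/ replaces only the first match: one character.
first=$(echo "$record" | sed 's/./X/')

# With the flag 3, only the third match (the third character) is replaced.
third=$(echo "$record" | sed 's/./X/3')

# Two -e expressions run in order; the second one sees the output of the first.
both=$(echo "$record" | sed -e 's/./X/' -e 's/./Y/3')

echo "$first"    # prints: X001A1HR50000USA
echo "$third"    # prints: 10X1A1HR50000USA
echo "$both"     # prints: X0Y1A1HR50000USA
```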

1st -e switch will read "1001A1HR50000USA" as input:

$ sed -e 's/./&,/4'

What is happening here?

1. Dot (.) matches one character at a time, and the occurrence number 4 tells sed to act only on the 4th match, which is the 4th character.

If you remember the syntax:
sed 's/reg_exp/replacement/[occurrence]' F_Input_File

Occurrence	1	2	3	4
Data	1	0	0	1
REG_EXP	.	.	.	.

2. & will hold only the 4th match, which is the character 1

Result of REG_EXP (.) at occurrence 4 = 1

Replacement (&,) = 1,
After the 1st -e switch record will be: 1001,A1HR50000USA
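To see exactly what & contains at this step, we can wrap it in brackets instead of appending a comma. This is just an illustrative check, not part of the solution:

```shell
record='1001A1HR50000USA'

# Brackets make the matched text visible: only the 4th character is in &.
marked=$(echo "$record" | sed 's/./[&]/4')

# The real first step: put the 4th character back, followed by a comma.
step1=$(echo "$record" | sed 's/./&,/4')

echo "$marked"   # prints: 100[1]A1HR50000USA
echo "$step1"    # prints: 1001,A1HR50000USA
```

The first three characters are never part of the match. sed leaves them alone and only rewrites the 4th character as "1,".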

2nd -e switch will receive input as "1001,A1HR50000USA"

$ sed -e 's/./&,/7'

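1. dot (.) will match one character at a time, and the substitution is applied only to the 7th match. The comma added by the previous step counts as a character.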

Occurrence	1	2	3	4	5	6	7
Data	1	0	0	1	,	A	1
REG_EXP	.	.	.	.	.	.	.

2. & will hold only the 7th match, which is the character 1 (the end of A1)

Result of REG_EXP (.) at occurrence 7 = 1

Replacement (&,) = 1,
After the 2nd -e switch record will be: 1001,A1,HR50000USA
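Why 7 and not 6, when Col2 ends at position 6? Because the comma inserted by the first expression is counted too. A small comparison, run here as two separate sed calls for illustration:

```shell
# Applied to the untouched record, the 7th character is H, so the split is wrong.
wrong=$(echo '1001A1HR50000USA' | sed 's/./&,/7')

# Applied to the output of the first expression, the 7th character is the 1 of A1.
right=$(echo '1001,A1HR50000USA' | sed 's/./&,/7')

echo "$wrong"    # prints: 1001A1H,R50000USA
echo "$right"    # prints: 1001,A1,HR50000USA
```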

3rd -e switch will receive input as "1001,A1,HR50000USA"

$ sed -e 's/./&,/10'

1. Again, dot (.) will match one character at a time, and the substitution is applied only to the 10th match.

Occurrence	1	2	3	4	5	6	7	8	9	10
Data	1	0	0	1	,	A	1	,	H	R
REG_EXP	.	.	.	.	.	.	.	.	.	.

2. & will hold only the 10th match, which is the character R

Result of REG_EXP (.) at occurrence 10 = R

Replacement (&,) = R,

After the 3rd -e switch record will be: 1001,A1,HR,50000USA

4th -e switch will receive input as "1001,A1,HR,50000USA"

$ sed -e 's/./&,/16'

1. Dot (.) will match one character at a time, and the substitution is applied only to the 16th match.

Occurrence	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
Data	1	0	0	1	,	A	1	,	H	R	,	5	0	0	0	0
REG_EXP	.	.	.	.	.	.	.	.	.	.	.	.	.	.	.	.

2. & will hold only the 16th match, which is the last 0 of 50000

Result of REG_EXP (.) at occurrence 16 = 0

Replacement (&,) = 0,

After the 4th -e switch record will be: 1001,A1,HR,50000,USA
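All four intermediate records can also be printed in one run. The sketch below adds the p flag to each substitution and uses -n to suppress the default output, so sed prints the record once after every successful replacement. The trace variable is just a name chosen for the demo.

```shell
# -n turns off automatic printing; the p flag prints after each substitution.
trace=$(echo '1001A1HR50000USA' |
    sed -n -e 's/./&,/4p' -e 's/./&,/7p' -e 's/./&,/10p' -e 's/./&,/16p')

echo "$trace"
# prints:
# 1001,A1HR50000USA
# 1001,A1,HR50000USA
# 1001,A1,HR,50000USA
# 1001,A1,HR,50000,USA
```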

Exactly what we want…!!!
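As a side note, the same & technique also works if the commas are inserted from right to left. In that order no earlier comma shifts the remaining positions, so the occurrence numbers are simply the original column end positions (13, 8, 6, 4). This variation is not part of the solution above; it is only a sketch to show how the offsets behave:

```shell
# Insert commas from the rightmost column boundary to the leftmost one.
result=$(echo '1001A1HR50000USA' |
    sed -e 's/./&,/13' -e 's/./&,/8' -e 's/./&,/6' -e 's/./&,/4')

echo "$result"    # prints: 1001,A1,HR,50000,USA
```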

Conclusion: We have already seen the same functionality using the awk command; you may also check that out in How to Convert Fixed Width File to Delimited Using AWK. & is a very powerful special character in the replacement part of sed's s command, and it is handy for formatting data.

Keep Reading, Keep Learning, Keep Sharing...!!!
