Remove multiple segments from XML File Using DOS

DOS commands to remove multiple segments (headers) from an XML file in DataStage (installed on Windows Server)

Requirement#1: Remove the repeated header segments (the <Transaction> blocks in the sample below) from an XML file in DataStage (installed on Windows Server), keeping only the first occurrence.

Solution:
Example details:
File Name: Test1.xml
File_Path: \\Servername\ABC\XYZ\OUT\

Sample data in Test1.xml:

<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>
<A>...</A>
<B>...</B>
<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>
<C>...</C>
<D>...</D>
<E>...</E>

Step#1: Copy the first occurrence of the repeating header segment into a new file:

Command Used:
egrep 'Transaction' #File_Path#\Test1.xml | head -1 > #File_Path#\Test3.xml &&
egrep 'Segment-IP' #File_Path#\Test1.xml | head -1 >> #File_Path#\Test3.xml &&
egrep 'Segment-ST' #File_Path#\Test1.xml | head -1 >> #File_Path#\Test3.xml &&
egrep 'Loop-100A' #File_Path#\Test1.xml | head -1 >> #File_Path#\Test3.xml &&
egrep 'Loop-100B' #File_Path#\Test1.xml | head -1 >> #File_Path#\Test3.xml &&
egrep '/Transaction' #File_Path#\Test1.xml | head -1 >> #File_Path#\Test3.xml &&
echo "Header(First occurrence) moved to a separate file"

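General usage of these commands:
egrep: Prints the lines of a file that match a pattern. Note that 'Transaction' matches both <Transaction> and </Transaction>.
head -1: Keeps only the first line of its input, here the first matching line.
&&: Chains commands so that the next one runs only if the previous one succeeded.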
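
The behaviour of these three pieces can be sketched on a small file. This is a minimal illustration, assuming a POSIX-style shell with grep and head on the PATH; the scratch directory, the file name demo.xml and its contents are made up for the example, and `grep -E` is used as the current spelling of `egrep`.

```shell
#!/bin/sh
# Hypothetical scratch file, for illustration only.
work=$(mktemp -d)
cat > "$work/demo.xml" <<'EOF'
<Transaction>
<Segment-IP>first</Segment-IP>
</Transaction>
<A>data</A>
<Transaction>
<Segment-IP>second</Segment-IP>
</Transaction>
EOF

# 'Transaction' matches opening and closing tags alike;
# head -1 keeps only the first matching line.
first_open=$(grep -E 'Transaction' "$work/demo.xml" | head -1)
first_ip=$(grep -E 'Segment-IP' "$work/demo.xml" | head -1)

# && runs the right-hand command only when the left-hand one succeeds.
found_ip=no
found_b=no
grep -E -q 'Segment-IP' "$work/demo.xml" && found_ip=yes
grep -E -q 'Loop-100B' "$work/demo.xml" && found_b=yes

echo "$first_open"
echo "$first_ip"
echo "Segment-IP found: $found_ip, Loop-100B found: $found_b"
```

Run as written, this prints `<Transaction>`, then `<Segment-IP>first</Segment-IP>`, then `Segment-IP found: yes, Loop-100B found: no`.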

Output stored in Test3.xml:
<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>

Step#2: Delete all occurrences of the repeating header segments from the original file and append the remaining data to the output file of Step#1:

Command Used:
sed -e '/Transaction/d;/Segment-IP/d;/Segment-ST/d;/Loop-100A/d;/Loop-100B/d;/\/Transaction/d' #File_Path#\Test1.xml >> #File_Path#\Test3.xml &&
echo "All Header segments deleted and data appended to final outfile"

General usage of this command:
A single "sed" command can delete every line that matches any of several patterns: each /pattern/d expression deletes the matching lines, and ";" separates the expressions.
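
A small sketch of the multi-pattern delete, assuming a POSIX-style shell with sed available. The scratch directory, demo.xml and its contents are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical scratch file, for illustration only.
work=$(mktemp -d)
printf '%s\n' \
    '<Transaction>' \
    '<Segment-IP>x</Segment-IP>' \
    '</Transaction>' \
    '<A>keep</A>' \
    '<B>keep</B>' > "$work/demo.xml"

# Each /pattern/d deletes the lines matching that pattern; ';' separates
# the expressions. /Transaction/d alone already removes both the
# <Transaction> and the </Transaction> lines.
body=$(sed -e '/Transaction/d;/Segment-IP/d' "$work/demo.xml")

echo "$body"
```

Only the `<A>keep</A>` and `<B>keep</B>` lines survive. Because the delete works on whole lines, this approach relies on each tag sitting on its own line, as in the sample data.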

Final Output stored in Test3.xml:
<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>
<A>...</A>
<B>...</B>
<C>...</C>
<D>...</D>
<E>...</E>
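
To check both steps against the sample data, they can be combined in one script. This is a sketch under stated assumptions rather than the exact DataStage job: it assumes a POSIX-style shell, uses a scratch directory in place of #File_Path#, writes `grep -E` instead of `egrep`, loops over the tag names instead of repeating the command, and leaves out Loop-100B because the sample has no such line.

```shell
#!/bin/sh
# Scratch directory standing in for #File_Path# (hypothetical).
work=$(mktemp -d)
src="$work/Test1.xml"
out="$work/Test3.xml"

# Sample data from the post, with "..." kept as literal placeholder text.
cat > "$src" <<'EOF'
<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>
<A>...</A>
<B>...</B>
<Transaction>
<Segment-IP>...</Segment-IP>
<Segment-ST>...</Segment-ST>
<Loop-100A>...</Loop-100A>
</Transaction>
<C>...</C>
<D>...</D>
<E>...</E>
EOF

# Step 1: write the first occurrence of each header line, in order.
: > "$out"
for tag in 'Transaction' 'Segment-IP' 'Segment-ST' 'Loop-100A' '/Transaction'; do
    grep -E "$tag" "$src" | head -1 >> "$out"
done

# Step 2: drop every header line from the source and append the rest.
sed -e '/Transaction/d;/Segment-IP/d;/Segment-ST/d;/Loop-100A/d' "$src" >> "$out"

cat "$out"
```

The script prints one header block followed by the `<A>` to `<E>` lines, matching the final output listed above.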

Thanks!
Achal
