linux - Bash, Remove empty XML tags




I need help with a couple of questions, using bash tools.

First, I want to remove empty XML tags from a file, e.g.:

    <createofficecode>
      <operatorid>ve</operatorid>
      <officecode>1234</officecode>
      <countrycodelength>0</countrycodelength>
      <areacodelength>3</areacodelength>
      <attributes></attributes>
      <chargearea></chargearea>
    </createofficecode>

to become:

    <createofficecode>
      <operatorid>ve</operatorid>
      <officecode>1234</officecode>
      <countrycodelength>0</countrycodelength>
      <areacodelength>3</areacodelength>
    </createofficecode>

For this I have used the command

    sed -i '/><\//d' file

which is not very strict; it is more of a trick. It would be more appropriate to find <pattern></pattern> and remove it. Any suggestions?

Second, how do I go from:

    <createofficegroup>
      <createofficename>john</createofficename>
      <createofficecode>
      </createofficecode>
    </createofficegroup>

to:

    <createofficegroup>
      <createofficename>john</createofficename>
    </createofficegroup>

And, as a whole thing, how do I go from:

    <createofficegroup>
      <createofficename>john</createofficename>
      <createofficecode>
        <operatorid>ve</operatorid>
        <officecode>1234</officecode>
        <countrycodelength>0</countrycodelength>
        <areacodelength>3</areacodelength>
        <attributes></attributes>
        <chargearea></chargearea>
      </createofficecode>
      <createofficesize>
        <chairs></chairs>
        <tables></tables>
      </createofficesize>
    </createofficegroup>

to:

    <createofficegroup>
      <createofficename>john</createofficename>
      <createofficecode>
        <operatorid>ve</operatorid>
        <officecode>1234</officecode>
        <countrycodelength>0</countrycodelength>
        <areacodelength>3</areacodelength>
      </createofficecode>
    </createofficegroup>

You can reply to the questions individually. Thanks very much!

XMLStarlet is a command-line XML processor. Doing what you want with it is a one-line operation (until the desired recursive behavior is added), and it will work for all variants of XML syntax describing the same input:

the simple version:

    xmlstarlet ed \
      -d '//*[not(./*) and (not(./text()) or normalize-space(./text())="")]' \
      input.xml
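To see what that predicate selects, and why one pass is not always enough, here is a rough Python analogue using the standard library's `xml.etree.ElementTree`. It is only a sketch of the same idea, not what XMLStarlet does internally: the function name `delete_empty_leaves_once` and the trimmed-down sample are made up for illustration, and Python's `strip()` treats slightly more characters as whitespace than XPath's `normalize-space()`.

```python
import xml.etree.ElementTree as ET

def delete_empty_leaves_once(root):
    """One pass: drop elements with no child elements and no non-blank text."""
    # Select first, delete afterwards, so the match set reflects the tree
    # as it was before any deletion (like a single XPath evaluation).
    matches = [
        (parent, child)
        for parent in root.iter()
        for child in list(parent)
        if len(child) == 0 and not (child.text or "").strip()
    ]
    for parent, child in matches:
        parent.remove(child)
    return len(matches)

# Hypothetical sample, cut down from the question's third example.
SAMPLE = """<createofficegroup>
  <createofficename>john</createofficename>
  <createofficesize>
    <chairs></chairs>
    <tables></tables>
  </createofficesize>
</createofficegroup>"""

root = ET.fromstring(SAMPLE)
first = delete_empty_leaves_once(root)    # removes <chairs> and <tables>
after_first = [el.tag for el in root.iter()]
second = delete_empty_leaves_once(root)   # removes the now-empty <createofficesize>
after_second = [el.tag for el in root.iter()]
print(first, after_first)
print(second, after_second)
```

After the first pass `<createofficesize>` is still present, because it only became empty once its children were deleted; a second pass is needed to remove it.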

the fancy version:

    strip_recursively() {
      local doc last_doc
      IFS= read -r -d '' doc   # slurp all of stdin into doc
      while :; do
        last_doc=$doc
        doc=$(xmlstarlet ed \
               -d '//*[not(./*) and (not(./text()) or normalize-space(./text())="")]' \
               /dev/stdin <<<"$last_doc")
        # Stop once a pass no longer changes the document
        if [[ $doc = "$last_doc" ]]; then
          printf '%s\n' "$doc"
          return
        fi
      done
    }
    strip_recursively <input.xml

/dev/stdin is used rather than - (at some cost to platform portability) to improve portability across releases of XMLStarlet; adjust to taste.

On a system that has only older dependencies installed, the XML parser you are more likely to have available is the one bundled with Python.

    #!/usr/bin/env python3
    import sys
    import xml.etree.ElementTree as etree

    doc = etree.parse(sys.stdin)

    def prune(parent):
        """Repeatedly remove empty leaf children; return True if anything was removed."""
        ever_changed = False
        while True:
            changed = False
            for el in list(parent):  # iterate over a copy: children are removed in the loop
                if len(el) == 0:
                    # Empty leaf: no non-blank text inside it or directly after it
                    if ((el.text is None or el.text.strip() == '') and
                            (el.tail is None or el.tail.strip() == '')):
                        parent.remove(el)
                        changed = True
                else:
                    changed = prune(el) or changed
                ever_changed = changed or ever_changed
            if not changed:
                return ever_changed

    prune(doc.getroot())
    print(etree.tostring(doc.getroot(), encoding='unicode'))
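The script checks both `el.text` and `el.tail` before removing an element. A small sketch of why, assuming mixed content (which the question's data does not appear to contain): in `ElementTree`, text that follows an element's closing tag is stored on that element as its `tail`, so removing the element would also discard that text. The helper name `is_prunable` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_prunable(el):
    """True if el has no children and no non-blank text or tail."""
    return (len(el) == 0
            and not (el.text or "").strip()
            and not (el.tail or "").strip())

# Hypothetical mixed-content sample: "after" is stored as the tail of <br>.
root = ET.fromstring("<p>before<br></br>after<i></i></p>")
br, i = list(root)

print(repr(br.text), repr(br.tail))   # empty element followed by text
print(repr(i.text), repr(i.tail))     # empty element with nothing after it
print(is_prunable(br), is_prunable(i))
```

Here `<br>` would be kept, since dropping it would lose the trailing text, while `<i>` would be removed.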

xml linux bash sed
