14 February, 2011

Simplify SVG for HTML5 App Debugging

In this article I will show you a small shell script that will
  • Crunch your big Inkscape SVG,
  • Round all numbers to one decimal place,
  • Remove all unnecessary styles, namespaces, nodes, and attributes that have no effect in the browser.
The script is a rough start, and I expect it to improve as I develop my little HTML5 scenario further.

Why do I need the script?

I've been experimenting with HTML5, SVG, JavaScript, Firebug, and Chrome Developer Tools since last Christmas. Currently, I am developing a little proof of concept game. The idea is to:
  1. Use vanilla Inkscape as main world editor.
  2. Render the game world using SVG in the browser.
  3. Use some JavaScript/JSON DSL to model the game logic.
If you are debugging things in the browser, you want a small DOM tree and as few lines of code as possible. Plain Inkscape SVGs are quite redundant, and many numbers carry a long tail of digits after the decimal point. For easier debugging I round them to one decimal place. I also want to replace many inline styles with a real <style> tag and remove all unnecessary nodes, attributes, etc.

Full SVG simplification script
#!/bin/sh
#
# This file is provided as is, with no warranties
#
# Author: Uwe Jugel
# WWW:    http://open-juve.blogspot.com
#
# License: Creative Commons Attribution-ShareAlike 3.0 Unported License.
#

echo "cleaning up $1"
#namespaces
svgns=http://www.w3.org/2000/svg
svgns_escaped="http:\/\/www.w3.org\/2000\/svg"
sodins=http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd
inkns=http://www.inkscape.org/namespaces/inkscape

#replace svg root
svgreplace='s/\(^<svg\)\(\s\).*>$/\1\2xmlns=\"'$svgns_escaped'\"\2xmlns:svg=\"'$svgns_escaped'\">/g'

defaultstyles=''

for i in \
's/color:#000000;*//g' \
's/stroke:#000000;*//g' \
's/stroke-width:1[a-zA-Z]*;*//g' \
's/stroke-linecap:butt;*//g' \
's/stroke-linejoin:miter;*//g' \
's/stroke-opacity:1[a-zA-Z]*;*//g' \
's/fill:none;*//g' \
's/fill-opacity:1[a-zA-Z]*;*//g' \
's/fill-rule:nonzero;*//g' \
's/stroke-miterlimit:4[a-zA-Z]*;*//g' \
's/stroke-dasharray:none;*//g' \
's/stroke-dashoffset:0[a-zA-Z]*;*//g' \
's/marker:none;*//g' \
's/visibility:visible;*//g' \
's/display:inline;*//g' \
's/overflow:visible;*//g' \
's/enable-background:accumulate;*//g'
do
defaultstyles="$defaultstyles -e $i"
done

emptystyle='s/style=""//'

cat "$1" > /tmp/svg-cleanup.tmp1
touch /tmp/svg-cleanup.tmp2
touch /tmp/svg-cleanup.tmp3
touch ./tiles-export.svg

out=./tiles-export.svg
tmp=/tmp/svg-cleanup.tmp1
tmpround=/tmp/svg-cleanup.tmp2
tmploop=/tmp/svg-cleanup.tmp3

recursiontest=1
loops=1

echo "starting recursive rounding"
while [ $recursiontest = 1 ]; do
for i in 's/\(\.[0-9]\)[0-4][0-9]*/\1/g' 's/\(\.[0-9]\)[5-9][0-9]*/\1_CEIL_/g' \
's/0_CEIL_/1/g' 's/1_CEIL_/2/g' 's/2_CEIL_/3/g' 's/3_CEIL_/4/g' 's/4_CEIL_/5/g' \
's/5_CEIL_/6/g' 's/6_CEIL_/7/g' 's/7_CEIL_/8/g' 's/8_CEIL_/9/g' \
's/\([0-9]\)9_CEIL_/\1_CEIL_0/' 's/\([0-9]\)\.9_CEIL_/\1_CEIL_\.0/'
do
cat $tmp | sed -e "$i" > $tmpround
cat $tmpround > $tmp
done
diff=$(diff -qn $tmp $tmploop)
recursiontest=$?
if [ $recursiontest = 1 ]; then echo "loop: $loops"; fi
loops=$(($loops+1));
cat $tmp > $tmploop
done
echo "recursive rounding done."

# $defaultstyles stays unquoted on purpose so that it splits into separate -e arguments
cat $tmp | xmlstarlet ed -N svg=$svgns -N sodi=$sodins -N ink=$inkns -d "//svg:metadata" -d "//sodi:*" \
-d "//@sodi:nodetypes" -d "//@ink:connector-curvature" -d "//@sodi:docname" | sed $defaultstyles -e "$emptystyle" \
-e "$svgreplace" > $out

The Script in Detail

First I define some namespaces that will be used by xmlstarlet to modify the XML tree:
#namespaces
svgns=http://www.w3.org/2000/svg
svgns_escaped="http:\/\/www.w3.org\/2000\/svg"
sodins=http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd
inkns=http://www.inkscape.org/namespaces/inkscape
Then I have a little replacement expression that strips all namespaces from the svg root tag and puts back only the SVG namespace (it will be called last).
#replace svg root
svgreplace='s/\(^<svg\)\(\s\).*>$/\1\2xmlns=\"'$svgns_escaped'\"\2xmlns:svg=\"'$svgns_escaped'\">/g'
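To see what this expression does in isolation, here is a minimal sketch. The helper name replace_svg_root and the sample tag are made up for illustration, and the sketch uses | as the sed delimiter (so the URL needs no escaping) and [[:space:]] instead of \s; it assumes the root tag sits on a single line.

```shell
#!/bin/sh
# Sketch: collapse a single-line <svg ...> root tag down to the two
# SVG namespace declarations.
svgns=http://www.w3.org/2000/svg

# Hypothetical helper (not part of the original script); reads SVG on stdin.
replace_svg_root() {
    sed -e 's|^<svg[[:space:]].*>$|<svg xmlns="'"$svgns"'" xmlns:svg="'"$svgns"'">|'
}

# Made-up sample root tag, roughly as an editor might write it.
sample='<svg xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" width="100" height="100" id="svg2">'
printf '%s\n' "$sample" | replace_svg_root
```

Note that the pattern replaces everything between <svg and the closing >, so attributes such as width, height, or id on the root tag are dropped together with the namespaces.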
Inkscape writes many inline styles into each node in the SVG tree. Most of them are default values that are supported by the browser. Therefore, we can remove them (and manually replace them with a CSS file someday).
defaultstyles=''
for i in 's/color:#000000;*//g' 's/stroke:#000000;*//g' 's/stroke-width:1[a-zA-Z]*;*//g' #and so on
do
defaultstyles="$defaultstyles -e $i"
done
If no style is left, I also have to remove the empty style attribute:
emptystyle='s/style=""//'
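The effect of these expressions is easiest to see on a single element. The following is a rough sketch that applies three of the style expressions plus the empty-style expression; the helper name strip_default_styles and the two sample elements are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: strip a few default declarations from inline styles, then
# drop style attributes that end up empty.

# Hypothetical helper (not part of the original script); reads SVG on stdin.
strip_default_styles() {
    sed -e 's/stroke-width:1[a-zA-Z]*;*//g' \
        -e 's/fill-opacity:1[a-zA-Z]*;*//g' \
        -e 's/display:inline;*//g' \
        -e 's/style=""//'
}

# Made-up sample elements.
mixed='<path style="fill:#ff0000;fill-opacity:1;stroke-width:1px;display:inline" d="M 0,0 L 1,1" />'
defaults_only='<rect style="stroke-width:1;display:inline" width="5" />'

# The first element keeps only its fill declaration.
printf '%s\n' "$mixed" | strip_default_styles
# The second element loses its style attribute entirely.
printf '%s\n' "$defaults_only" | strip_default_styles
```

Keep in mind that these are prefix matches: a declaration such as stroke-width:1.5px also starts with stroke-width:1, so the expression would cut it down to .5px. If your drawing uses such values, the patterns need a stricter ending.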
Now comes the fun part (after some file initialization). I use a while loop and a for loop to round every number in the file to one decimal place. If the second decimal digit is 0 to 4, the number is simply cut down to .X. If it is 5 to 9, the number is rounded up, and the carry is passed on to the preceding digits where necessary. The solution, using only sed and some loops, looks awkward, but it works.
while [ $recursiontest = 1 ]; do
for i in 's/\(\.[0-9]\)[0-4][0-9]*/\1/g' 's/\(\.[0-9]\)[5-9][0-9]*/\1_CEIL_/g' \
's/0_CEIL_/1/g' 's/1_CEIL_/2/g' 's/2_CEIL_/3/g' 's/3_CEIL_/4/g' 's/4_CEIL_/5/g' \
's/5_CEIL_/6/g' 's/6_CEIL_/7/g' 's/7_CEIL_/8/g' 's/8_CEIL_/9/g' \
's/\([0-9]\)9_CEIL_/\1_CEIL_0/' 's/\([0-9]\)\.9_CEIL_/\1_CEIL_\.0/'
do
cat $tmp | sed -e "$i" > $tmpround
cat $tmpround > $tmp
done
diff=$(diff -qn $tmp $tmploop)
recursiontest=$?
if [ $recursiontest = 1 ]; then echo "loop: $loops"; fi
loops=$(($loops+1));
cat $tmp > $tmploop
done
This script marks all places to be rounded up with _CEIL_ and then replaces these marks with new marks or the correct numbers until no marks are left.
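The marker technique can be tried out on a single line of path data. The sketch below runs the same expressions in one sed call per pass and repeats until the text stops changing; the function names round_pass and round_one_decimal and the sample data are assumptions made for illustration, and variables are used in place of the temporary files.

```shell
#!/bin/sh
# Sketch: marker-based rounding to one decimal place.

# One pass over the text on stdin, using the expressions from the script.
round_pass() {
    sed -e 's/\(\.[0-9]\)[0-4][0-9]*/\1/g' \
        -e 's/\(\.[0-9]\)[5-9][0-9]*/\1_CEIL_/g' \
        -e 's/0_CEIL_/1/g' -e 's/1_CEIL_/2/g' -e 's/2_CEIL_/3/g' \
        -e 's/3_CEIL_/4/g' -e 's/4_CEIL_/5/g' -e 's/5_CEIL_/6/g' \
        -e 's/6_CEIL_/7/g' -e 's/7_CEIL_/8/g' -e 's/8_CEIL_/9/g' \
        -e 's/\([0-9]\)9_CEIL_/\1_CEIL_0/' \
        -e 's/\([0-9]\)\.9_CEIL_/\1_CEIL_.0/'
}

# Repeat passes until a pass no longer changes the text.
round_one_decimal() {
    prev=
    cur=$(cat)
    while [ "$cur" != "$prev" ]; do
        prev=$cur
        cur=$(printf '%s\n' "$cur" | round_pass)
    done
    printf '%s\n' "$cur"
}

# Made-up path data: 1.26 -> 1.3, 12.345 -> 12.3, 1.96 -> 2.0, 19.96 -> 20.0
printf '%s\n' 'M 1.26,12.345 L 1.96,19.96' | round_one_decimal
```

For 19.96 the carry travels in steps: 19.9_CEIL_, then 19_CEIL_.0, then 1_CEIL_0.0, and finally 20.0. One edge case is worth checking in your own files: a value such as 9.96 would need a new leading digit, which none of the expressions add, so a _CEIL_ marker stays in the output.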

Finally, I aggregate all sed expressions, use xmlstarlet to remove the bad nodes and attributes, and write out the result.
cat $tmp | xmlstarlet ed -N svg=$svgns -N sodi=$sodins -N ink=$inkns -d "//svg:metadata" -d "//sodi:*" \
-d "//@sodi:nodetypes" -d "//@ink:connector-curvature" -d "//@sodi:docname" | sed $defaultstyles -e "$emptystyle" \
-e "$svgreplace" > $out
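The xmlstarlet step can also be tried on its own. This is a small sketch under a few assumptions: the helper name strip_editor_data and the sample document are made up, and the demo is skipped when no xmlstarlet binary is found on the PATH.

```shell
#!/bin/sh
# Sketch: delete editor-only nodes and attributes with xmlstarlet.
svgns=http://www.w3.org/2000/svg
sodins=http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd
inkns=http://www.inkscape.org/namespaces/inkscape

# Hypothetical helper (not part of the original script); reads SVG on stdin.
strip_editor_data() {
    xmlstarlet ed -N svg="$svgns" -N sodi="$sodins" -N ink="$inkns" \
        -d "//svg:metadata" -d "//sodi:*" \
        -d "//@sodi:nodetypes" -d "//@ink:connector-curvature" -d "//@sodi:docname"
}

# Made-up sample document for illustration.
sample='<svg xmlns="http://www.w3.org/2000/svg" xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" sodipodi:docname="tiles.svg"><metadata id="m1"/><sodipodi:namedview id="base"/><path id="p1" d="M 0,0 L 1,1" sodipodi:nodetypes="cc" inkscape:connector-curvature="0"/></svg>'

if command -v xmlstarlet >/dev/null 2>&1; then
    # Only the root element and the path with its id and d attributes remain.
    printf '%s\n' "$sample" | strip_editor_data
else
    echo "xmlstarlet not found; skipping demo" >&2
fi
```

The -N options bind short prefixes to the namespace URIs, so the XPath expressions work no matter which prefixes the input file itself uses.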

Conclusion

I am not 100% sure that these tools (bash/sed/xmlstarlet) are the perfect fit for my task, but they get the job done without being overly complex. When I imagine how unreadable an XSLT script would be compared to this little bash script, I believe that my choice was not unwise.

I hope you find this article helpful for your SVG-related project.

Best wishes,
Juve


Creative Commons License
SVG simplification shell script by Uwe Jugel is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

1 comment:

juve said...

There was a bug in
#replace svg root
svgreplace=...

It was caused by the blogspot auto-html-tagger. I replaced all < and > with &lt; and &gt;

Now it should work