Making regular expressions more readable in shell scripts

Today I was wondering why using sed in shell scripts feels so awkward.
I wanted to fix that, so I posted a question on Stack Overflow.

First improvement: You can use custom separators to avoid escaping /

With that,
sed 's/\/\*\s*APPEND_NAME.*\s*\*\/\(.*\)\/\*\s*.*\s*\*\//\1'$value'/g'
becomes
sed 's:/\*\s*APPEND_NAME.*\s*\*/\(.*\)/\*\s*.*\s*\*/:\1'$value':g'
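
To try the separator trick in isolation, here is a minimal sketch on a made-up input (the path and the variable names are only for illustration, they are not part of my template code). Whatever character follows the s becomes the separator, so only that character needs escaping inside the expression:

```shell
#!/usr/bin/env bash
# Hypothetical sample input: a path full of slashes.
path='/usr/local/bin'

# Slash as separator: every / in pattern and replacement must be escaped.
with_slash=$(printf '%s\n' "$path" | sed 's/\/usr\/local/\/opt/')

# Colon as separator: the slashes can be written literally.
with_colon=$(printf '%s\n' "$path" | sed 's:/usr/local:/opt:')

printf '%s\n%s\n' "$with_slash" "$with_colon"
# prints /opt/bin twice
```

Both commands do the same substitution; the colon version is simply easier to read.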

A bit more readable, but I was still not happy and wanted to go a little further. Here is my solution for replacing several kinds of comments/annotations in a template file. The file looks like this (excerpt):
/*!ONCE*/IMPORT 'schemas.ccl';
/*!ONCE*/CREATE INPUT WINDOW SimpleTick
/*!ONCE*/SCHEMA TickSchema PRIMARY KEY (t_key);

CREATE WINDOW    /*APPEND_NAME     */TickFilter/* */ PRIMARY KEY DEDUCED
KEEP             /*WINDOW_FULLSIZE */11000     /* */ MILLISECONDS AS
SELECT * FROM SimpleTick st
I want to replace the /*COMMENTS*/ with real values, which is actually quite easy using cat, grep, and sed. But the many sed expressions needed for that task would still contain many escaped characters, such as \*, \s, \(, and \), even when using colons instead of slashes as expression separators.

Second improvement: Outsource all redundancy!
By storing common subexpressions in variables or even functions, and composing the expressions from them, you get much more readable sed statements. Here is my full sed/grep readability approach:
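# Comment start and comment end, each with optional whitespace
# (\s is a GNU sed extension)
CS='/\*\s*'
CE='\s*\*/'
# Capture group for the text between two comments
REF='\(.*\)'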

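# Build a pattern for a comment starting with the given name;
# without arguments it matches any comment.
# printf is used because some shells' echo interprets backslashes.
wrapex(){
    printf '%s\n' "$CS$*.*$CE"
}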

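# Replace everything from /*NAME to the last */ on the line with the value
replaceVariable(){
    local name=$1
    local value=$2
    sed "s:$(wrapex "$name"):$value:g"
}
# Same, but put the value in single quotes
replaceString(){
    local name=$1
    local value=$2
    sed "s:$(wrapex "$name"):'$value':g"
}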

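# Keep the text between the /*NAME*/ comment and the following comment,
# and append the value to it
appendVariable(){
    local name=$1
    local value=$2
    sed "s:$(wrapex "$name")$REF$(wrapex):\1$value:g"
}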

# Keep only the lines marked /*!ONCE*/ and strip the marker
applyTemplateHeader(){
    grep '!ONCE' | sed "s:$(wrapex '!ONCE')$REF:\1:g"
}

# Delete lines containing /*#, #*/, or *#
removeComments(){
    sed -e "\:$CS#:d" -e "\:#$CE:d" -e '\:\s*\*#:d'
}
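
To see what the composition actually produces, the following sketch prints the pattern that wrapex builds and runs the header filter on a few lines from the excerpt. It assumes bash and GNU sed (for \s); the template function is a hypothetical stand-in for the real template file, and grep and printf are used here in place of egrep and echo.

```shell
#!/usr/bin/env bash
CS='/\*\s*'
CE='\s*\*/'
REF='\(.*\)'

# Compose a comment pattern from the shared building blocks.
wrapex() { printf '%s\n' "$CS$*.*$CE"; }

# Keep only the lines marked /*!ONCE*/ and strip the marker.
applyTemplateHeader() {
    grep '!ONCE' | sed "s:$(wrapex '!ONCE')$REF:\1:g"
}

# Hypothetical stand-in for the template file.
template() {
    cat <<'EOF'
/*!ONCE*/IMPORT 'schemas.ccl';
SELECT * FROM SimpleTick st
EOF
}

pattern=$(wrapex APPEND_NAME)
printf '%s\n' "$pattern"
# prints: /\*\s*APPEND_NAME.*\s*\*/

header=$(template | applyTemplateHeader)
printf '%s\n' "$header"
# prints: IMPORT 'schemas.ccl';
```

The escaped characters are still there, of course, but they now live in exactly one place.
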
In the end, using such sed helper functions is quite easy and the code stays readable:
replaceVariable WINDOW_FULLSIZE 100
appendVariable  APPEND_NAME     "_A"
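
The helpers read from stdin and write to stdout, so they chain in a pipeline. Below is a minimal, self-contained sketch of such a chain, assuming bash and GNU sed (for \s). The template function and the single-space layout of its lines are made up for the demo, and wrapex uses printf instead of echo to be safe with backslashes.

```shell
#!/usr/bin/env bash
CS='/\*\s*'
CE='\s*\*/'
REF='\(.*\)'

wrapex() { printf '%s\n' "$CS$*.*$CE"; }

# Replace everything from /*NAME to the last */ on the line.
replaceVariable() {
    local name=$1
    local value=$2
    sed "s:$(wrapex "$name"):$value:g"
}

# Keep the text after the /*NAME*/ comment and append the value to it.
appendVariable() {
    local name=$1
    local value=$2
    sed "s:$(wrapex "$name")$REF$(wrapex):\1$value:g"
}

# Hypothetical stand-in for the template file.
template() {
    cat <<'EOF'
CREATE WINDOW /*APPEND_NAME */TickFilter/* */ PRIMARY KEY DEDUCED
KEEP /*WINDOW_FULLSIZE */11000 /* */ MILLISECONDS AS
SELECT * FROM SimpleTick st
EOF
}

result=$(template \
    | replaceVariable WINDOW_FULLSIZE 100 \
    | appendVariable APPEND_NAME "_A")
printf '%s\n' "$result"
# prints:
# CREATE WINDOW TickFilter_A PRIMARY KEY DEDUCED
# KEEP 100 MILLISECONDS AS
# SELECT * FROM SimpleTick st
```

Note that replaceVariable relies on the greedy .* to swallow the default value and the closing /* */ marker in one go.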

Cheers,
Juve
