29 October, 2012

Making regular expressions more readable in shell scripts

Today I was wondering why using sed in shell scripts feels so awkward.
I wanted to fix that and posted a question on stackoverflow.

First improvement: You can use custom separators to avoid escaping /

With that,
sed 's/\/\*\s*APPEND_NAME.*\s*\*\/\(.*\)\/\*\s*.*\s*\*\//\1'$value'/g'
becomes
sed 's:/\*\s*APPEND_NAME.*\s*\*/\(.*\)/\*\s*.*\s*\*/:\1'$value':g'

A bit more readable, but I was still not happy and wanted to go a little further. Here is my solution for replacing several kinds of comments/annotations in a template file. The file looks like this (excerpt):
/*!ONCE*/IMPORT 'schemas.ccl';
/*!ONCE*/SCHEMA TickSchema PRIMARY KEY (t_key);

KEEP             /*WINDOW_FULLSIZE */11000     /* */ MILLISECONDS AS
SELECT * FROM SimpleTick st
I want to replace the /*COMMENTS*/ with real values, which is actually quite easy using cat, grep, and sed. But the many sed expressions needed for that task would still contain many escaped characters, such as \*, \s, and \(, \); even when using colons instead of slashes for expression separators.

Second improvement: Outsource all redundancy!
By storing common subexpressions in variables or even functions, and composing the expressions from them, you get much more readable sed statements. Here is my full sed/grep readability approach:

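    # wrapex: build the pattern for a /* NAME ... */ comment
    # ($CS and $CE match the comment start and end, $REF is a capture group)
    echo "$CS$*.*$CE"

    # replace the named comment with the plain value
    local name=$1
    local value=$2
    sed "s:$(wrapex $name):$value:g"

    # replace the named comment with the quoted value
    local name=$1
    local value=$2
    sed "s:$(wrapex $name):'$value':g"

    # keep the text between the named comment and the closing comment, append the value
    local name=$1
    local value=$2
    sed "s:$(wrapex $name)$REF$(wrapex):\1$value:g"

    # keep only the lines marked !ONCE and strip the marker
    egrep "!ONCE" | sed "s:$(wrapex "!ONCE")$REF:\1:g"

    # delete the lines of /*# ... #*/ comments
    sed '\:'$CS'#:d' | sed '\:#'$CE':d' | sed '\:\s*\*#:d'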
In the end, using such sed helper functions is quite easy and the code stays readable:
replaceVariable WINDOW_FULLSIZE 100
appendVariable  APPEND_NAME     "_A"
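The same composition idea carries over to other regex dialects. Here is a minimal sketch in JavaScript, not the original shell code: the names CS, CE, REF, wrapex, replaceVariable, and appendVariable follow the post, but their definitions are assumptions inferred from the long sed expression shown earlier, and the sample input lines are made up. It also assumes that names contain no regex metacharacters.

```javascript
// Assumed building blocks, inferred from the long sed expression above.
const CS = String.raw`/\*\s*`;  // comment start: "/*" plus optional whitespace
const CE = String.raw`\s*\*/`;  // comment end: optional whitespace plus "*/"
const REF = "(.*)";             // capture group for the text to keep

// Pattern for a comment that starts with the given name (or any comment).
const wrapex = (name = "") => `${CS}${name}.*${CE}`;

// Replace the named comment (greedy, up to the last "*/" on the line).
function replaceVariable(text, name, value) {
  return text.replace(new RegExp(wrapex(name), "g"), () => value);
}

// Keep the text between the named comment and the closing comment,
// and append the value to it.
function appendVariable(text, name, value) {
  const pattern = new RegExp(wrapex(name) + REF + wrapex(), "g");
  return text.replace(pattern, (match, kept) => kept + value);
}

console.log(replaceVariable(
  "KEEP /*WINDOW_FULLSIZE */11000     /* */ MILLISECONDS AS",
  "WINDOW_FULLSIZE", "100")); // KEEP 100 MILLISECONDS AS
console.log(appendVariable(
  "CREATE STREAM /*APPEND_NAME*/Ticks/* */;", "APPEND_NAME", "_A"));
```

The replacement is passed as a function so that a "$" inside the value is not interpreted as a group reference.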


25 October, 2012

SSH without password - it's so simple

I am currently dealing with several VMs in Linux :-) and Windows  :-(
In Windows I use the Git-Bash to do most of my command line work.

To hop between the different machines via SSH without typing a password, you can authorize each machine's public key on the others.

  1. Generate an ssh key (if not done already):
  2. Add current host's public key to remote host's authorized keys:
    cat .ssh/ | ssh user@myvm 'cat >> .ssh/authorized_keys'
  3. Done! You can now connect to the machine from the current host:
    ssh user@myvm
No password required anymore. You need to redo this for every new A -> B and B -> A pair of machines you have. Of course, I do not have to tell you that this introduces potential security issues in your infrastructure. Please be careful when dealing with a production environment.


20 July, 2012

Are you missing the ternary operator in CoffeeScript?

CoffeeScript does not have the ternary operator because the question mark is used for null checks:
if value? then doA() else doB()
Instead of using a ternary, e.g., for conditional assignments, in CoffeeScript you could write:
result = if value > 10 then getA() else getB()
A JavaScript native alternative to the ternary, which also works in CoffeeScript, is the guard syntax:
result = value > 10 && getA() || getB()
If the "guard" before an AND fails, the expression after the AND will not be executed. The whole AND expression will fail then and the OR expression will be considered. And that's it, there you have your new ternary syntax.

But there is more to these AND/OR mechanics. A consecutive OR expression is executed if the previous expression fails (returns a "falsy" value). This also allows for result-dependent sequential execution, as used when combining sort functions: a comparator returns 0 for a tie, so the next comparator decides.
sort1 = (a,b) -> byName(a,b)  || byValue(a,b) 
sort2 = (a,b) -> byValue(a,b) || byName(a,b)

data = [
   { name: "a", value: 10 }
   { name: "b", value: 5  }
   { name: "c", value: 5  }
]

data.sort sort1 # result: a,b,c
data.sort sort2 # result: b,c,a
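For comparison, here is a sketch of the same comparator chaining in plain JavaScript. The post does not show byName and byValue, so the two definitions below are assumptions chosen to match the results given above.

```javascript
// Hypothetical comparators; the post does not define them.
const byName  = (a, b) => a.name.localeCompare(b.name);
const byValue = (a, b) => a.value - b.value;

// A comparator returns 0 (falsy) for a tie, so || falls through to the next one.
const sort1 = (a, b) => byName(a, b)  || byValue(a, b);
const sort2 = (a, b) => byValue(a, b) || byName(a, b);

const data = [
  { name: "a", value: 10 },
  { name: "b", value: 5 },
  { name: "c", value: 5 },
];

// Sort a copy so the original order stays untouched.
const names = (cmp) => [...data].sort(cmp).map((d) => d.name).join(",");

console.log(names(sort1)); // a,b,c
console.log(names(sort2)); // b,c,a
```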

Please pay attention to how the guard works. In the guard example above (value > 10 && getA() || getB()), if the guard evaluates to true AND getA() returns a "falsy" value, then the result of getB() will be returned. This is different from how a real if-then-else or the JavaScript ternary operator works. Falsy values are:
false, 0, "", null, undefined, NaN

Instead of using the ternary, either use plain if-then-else in one line or use the guard syntax, but be aware of "falsy" values returned from the guarded expression.
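A small JavaScript sketch of that pitfall. getA and getB are hypothetical stand-ins, with getA deliberately returning a falsy value:

```javascript
const getA = () => 0;   // hypothetical: a valid result that happens to be falsy
const getB = () => "B";
const value = 42;

// Guard syntax: getA() runs, but its falsy result makes || fall through to getB().
const viaGuard = value > 10 && getA() || getB();

// Real ternary: the condition alone decides, so the falsy result of getA() is kept.
const viaTernary = value > 10 ? getA() : getB();

console.log(viaGuard);   // B
console.log(viaTernary); // 0
```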

Ciao, Juve

14 June, 2012

Solved the Google Blockly maze!

Yesterday I read about an MIT Scratch clone: google-blockly.
They have some nice demo programs and one involves solving a maze puzzle.

Instead of eating some healthy food during my lunch break I had a lot of fun creating my solution to the puzzle. Here it is:

I also tried to switch all "left" and "right" and it still works - yeah!
Do you know better (shorter) solutions?

-- Juve

03 March, 2012

Simple Session Management for Jetty Websockets

For tracking my clients' websockets on the server, I need to store the session id for each HTTP connection. Otherwise I can't distinguish the clients as soon as they open more than one websocket.

My current websocket backend is a very simple Jetty-based Java application. When trying to get the HTTP session id in my websocket handler, I got a No SessionManager found error. Browsing the web provided several complex solutions [1] [2] that I could reduce to this very simple session handling setup, using Jetty's reasonable defaults.
Server server     = new Server(8001);

SessionHandler sh = new SessionHandler(); // org.eclipse.jetty.server.session.SessionHandler
sh.setHandler(websocketHandler);          // wrap websocket handler in session handler

server.setHandler(sh);                    // set session handler as Jetty's default handler
By adding just two lines of code and changing the server's default handler, I got basic session handling in my Jetty app. Very nice.


26 February, 2012

CSS 3D Text-Shadow Magic with Less.js

I just found Mark Otto's CSS-based 3D-Text. His CSS class is fixed to a certain font-size, so I decided to create a Less-class for it. Here it is:
.shadow-3d (@color: #fff, @size: 20px) {
    @o: @size/80;
    @c: @color - 48;
    color: @color;
    font-size: @size;
    text-shadow:
        0 @o    0 @c - 8,
        0 @o*2  0 @c - 16,
        0 @o*3  0 @c - 24,
        0 @o*4  0 @c - 32,
        0 @o*5  0 @c - 48,
        0 @o*6  @o    rgba(0,0,0,.1),
        0 0     @o*5  rgba(0,0,0,.1),
        0 @o*1  @o*3  rgba(0,0,0,.3),
        0 @o*3  @o*5  rgba(0,0,0,.2),
        0 @o*5  @o*10 rgba(0,0,0,.25),
        0 @o*10 @o*10 rgba(0,0,0,.2),
        0 @o*20 @o*20 rgba(0,0,0,.15);
}

And here you can see it in action (via Less-compiled CSS-classes):

Big Text

Small Text

Now isn't that fancy?
Stay tuned for more magic with modern web-technologies!

-- Juve

23 February, 2012

I just can't remember bash/batch Syntax - CoffeeScript to the rescue!

I regularly need some scripts for copying files to folders or remote servers, for creating folders, checking file existence, etc. The scripts should also be able to accept parameters. Often I try to write a bash script for these tasks, but it always takes much longer than expected. On Windows, I do the same with batch/cmd-files, whose syntax is no easier to remember than bash.

I will not do this anymore, it is too cumbersome for me!

From now on, I will use a language I know very well: JavaScript! It might sound strange, but it is actually quite easy. I use node.js as "interpreter" and call the OS functions via the child_process or fs modules. Error checking in particular is much more versatile now!
Combine that with CoffeeScript, and take shell/batch-scripting tasks to the next level, including easy asynchronous task execution with callbacks! Thanks to CoffeeScript, I get rid of the bad parts of JavaScript. Here is a simple copy script as an example (replace the echo command with a copy command of your choice):
fs     = require("fs")
exec   = require("child_process").exec
files  = process.argv.slice 2
target = files.pop()     #last arg is target dir/name

stop_onerror = true
error_count  = 0

copy_next = ->
    if files.length == 0 then stop()
    else
        file = files.pop()
        exec "echo copying #{file} to #{target}", (error, out) ->
            console.log out.trim()
            if error?
                console.error "ERROR(#{error_count++})! copy of #{file} failed"
            if error? && stop_onerror then stop() else copy_next()

stop = ->
    if error_count > 0
        console.log "Files copied (with #{error_count} errors)."
    else
        console.log "All files copied successfully."

usage = -> console.log '''
    Usage (with node)
        node   script.js     file1 file2 file3 ... target

    Usage (with coffee)
        coffee file1 file2 file3 ... target
    '''

if files.length > 0 && target? then fs.stat target, (error, fstat) ->
    if error? then console.log "#{target} not found!"
    else console.log "copying #{files} to #{target}"; copy_next()
else usage()
This basic script just mimics a normal copy command and is therefore rather useless. But it shows that you can easily combine node.js' file-system functions with operating system calls via exec, and that it is really easy to add more features, such as automatic target-dir creation, file-renames, error logging, etc. Esp. such error checking and other conditional stuff is really cumbersome in plain bash/batch scripting.
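As an illustration of one such extension, here is a rough sketch in plain JavaScript of automatic target-dir creation with per-file error counting. It is not part of the original script: copyAll is a hypothetical helper, and it uses node's synchronous fs functions instead of exec and callbacks to keep the example short.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: copy each file into targetDir, creating the
// directory (including missing parents) first.
// Returns the number of files that failed to copy.
function copyAll(files, targetDir) {
  fs.mkdirSync(targetDir, { recursive: true }); // no error if it already exists
  let errorCount = 0;
  for (const file of files) {
    try {
      fs.copyFileSync(file, path.join(targetDir, path.basename(file)));
    } catch (err) {
      errorCount++;
      console.error(`ERROR! copy of ${file} failed: ${err.message}`);
    }
  }
  return errorCount;
}
```

Called as copyAll(process.argv.slice(2, -1), process.argv[process.argv.length - 1]), it would take the same "file1 file2 ... target" arguments as the script above.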

Ciao, Juve

Edit: Having used bash a lot in the past years, I am now happy with it and actually like it. I also do not have a Windows PC/Laptop anymore and can often safely rely on the *nix commands cp, mv, rmdir, and mkdir to deal with files. However, for real software development, you should handle file-related tasks in build and deployment scripts, e.g., using Grunt.

19 February, 2012

To prototype or not to prototype?

I am using jsperf a lot lately. Here is another interesting result:

Prototype-based object creation is lightning fast, compared to returning plain objects from simple functions. (see my test on jsperf)

But prototype-based code usually forces you to access properties via this, which can involve an additional lookup in the prototype chain and adds some overhead when these objects are accessed a lot. (see my other test on jsperf)

For you as a JavaScript developer, this means you have to consider the following.

  • Is my code creating a lot of (similar) objects? Then you should set up some reusable types by using prototypes.
  • Is my code accessing properties of some of my objects a lot? Then you should get rid of the access via this and add some form of property caching in local variables.
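The two points can be sketched as follows. Point and makePoint are hypothetical examples, not the code from the jsperf tests, and the sketch makes no timing claims; measure with your own benchmark and engine.

```javascript
// Prototype-based type: the method exists once and is shared by all instances.
function Point(x, y) {
  this.x = x;
  this.y = y;
}
Point.prototype.lengthSquared = function () {
  // Property caching: read this.x and this.y once into local variables.
  const x = this.x, y = this.y;
  return x * x + y * y;
};

// Plain-object factory: every call creates a new object and a new function.
function makePoint(x, y) {
  return {
    x,
    y,
    lengthSquared() { return x * x + y * y; }, // uses the closure variables, no this
  };
}

console.log(new Point(3, 4).lengthSquared()); // 25
console.log(makePoint(3, 4).lengthSquared()); // 25
```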