May 2008
[ bamboo ] 20:06, Tuesday, 20 May 2008

Yeah, it's that time again!

As usual lots and lots of improvements in this new release including:

With many many kudos to Avishay Lavie, Cédric Vivier and Marcus Griep!

Cédric deserves special mention for working on all fronts and really pushing for the release. Superb work!

What? - http://boo.codehaus.org/

Download - http://boo.codehaus.org/Download

Official irc channel - irc://irc.codehaus.org/boo

Have fun!

[ bamboo ] 16:41, Tuesday, 20 May 2008

When I joined Db4objects a few years ago my first assignment was to research and implement a decent solution for getting db4o to work on the .net platform.

There was already some investigation going on on using the Eclipse JDT API as the basis for a source to source translator which proved to be a wise choice in the long run.

Eventually sharpen was born and after a lot of love we finally reached a point where the translated c# code would look really good.

People would get really interested every time sharpen was mentioned but for several reasons it wasn't publicly available.

Until now. Hooray!

I'm really looking forward to what people will build on top of that.


[ bamboo ] 13:33, Tuesday, 20 May 2008

Another great Ted talk. I particularly love the sequence from 7:08 to 7:40.

I'm the man. The final man. You know, we've been mutating for 4 billion years but now because it's me we've stopped.

Except this time is true. :)

[ bamboo ] 01:40, Friday, 9 May 2008

Almost four years ago the first feature request was entered into the boo issue tracker.

There should be a way to extend the parser so it could recognize custom measurement unit literals such as 1kg and 2cm.

Since then boo has improved a lot but with no solution for BOO-1 in the horizon.

I think I'm getting close to solve it in a interesting way using PEGs implemented as a graph of expression objects.

PEGs are very likable. Conceptually simple and composable.

Take the PEG that recognizes integer expressions involving the + and * operators:

grammar <- spaces addition eof
addition <- term ("+" spaces term)*
term <- factor ("*" spaces factor)*
factor <- [0-9]+ spaces
spaces <- (' ' / '\t')*
eof <- !.

where:

() means grouping.

* means zero or more matches.

+ means one or more matches.

/ is the prioritized choice operator. If the first expression succeeds, the whole expression succeeds. If the first expression fails, it backtracks and evaluates the second expression.

! is the not predicate operator which succeeds if its operand fails. It never consumes any input.

. matches any input.

The grammar can be translated to boo very simply using the peg macro from Boo.Pegs:

import Boo.Pegs

peg:
    grammar = spaces, addition, eof
    addition = term, --("+", spaces, term)
    term = factor, --("*", spaces, factor)
    factor = ++[0-9], spaces
    spaces = --(' ' / '\t')
    eof = not any()
    
assert grammar.Match(PegContext("  6*6 + 6 "))

I had to be a little creative in mapping the PEG operators to valid boo expressions because as it must be clear by now boo doesn't allow the introduction of completely new syntax and that's what the fuss is all about here.

I actually like the way it looks.

Implementing something more useful such as expression evaluation on top of that requires a few semantic actions operating a stack:

import Boo.Pegs
import System.Collections.Generic

stack = Stack[of int]()
push = stack.Push
pop = stack.Pop

peg:
    grammar = spaces, addition, eof
    addition = term, --("+", spaces, term, { push(pop() + pop()) })
    term = factor, --("*", spaces, factor, { push(pop() * pop()) })
    factor = ++[0-9], { push(int.Parse($text)) }, spaces
    spaces = --(' ' / '\t')
    eof = not any()
    
assert grammar.Match(PegContext("  6*6 + 6 "))
assert 42 == pop()

Semantic actions are just closures that get executed as matching succeeds. $text returns the text matched so far by the current rule.

Beautiful.

The underlying implementation based on a graph of expression objects really shines when one considers what it takes to extend the grammar above with support for hexadecimal literals:

peg:
    // rebind
    factor.Expression = hex_number / factor.Expression
    hex_number = "0x", ++hex_digit, { push(int.Parse($text[2:], NumberStyles.HexNumber)) }, spaces
    hex_digit = [0-9, a-f, A-F]
assert grammar.Match(PegContext("  0xa*2 + 11*0x02"))
assert 42 == pop()

It's not yet clear how this extensibility mechanism will be exposed at the boo language level but the simplicity at the peg level is encouraging.

One last feature worth pointing out is the ability to match based on a previously matched rule. For instance, the closing tag of a xml element must match the name in the starting tag:

import Boo.Pegs

peg:
    element = '<', tag, '>', content, '', @tag, '>'
    tag = ++(a-z)
    content = --(element / text)
    text = not "<", any()
    
assert element.Match(PegContext("<foo><bar>Hello</bar></foo>"))

I've found the idea for the last match operator @ first described in this article. Great idea.

I've been also greatly inspired by conversations I've had with Massi who's exploring similar territory and Jb during the last Mono Meeting in Barcelona Madrid and with Cedric over a beer in Paris. I think Massi will be pleased to know that I haven't given any thoughts to performance leaving all the fun to him.

Extensible parsing. Soon in a boo compiler close to you.