external DSLs with scala

Following up on my previous post on writing an internal SQL-like DSL for Scala, I decided to bite the bullet and implement a ‘real’ parser for a subset of the SQL language that creates the object tree from a SQL-like string. Since I had no experience with parser combinators, this proved to be quite an interesting exercise.

At this point the specs for the following queries succeed and produce valid results when I render SQL from the object graph:

[code]
select name from users order by name asc
select name from users where name = "peter"
select age from users where age = 30
select name from users where name = "peter" and age = 30
select name from users where age = 20 or age = 30
select name from users where name = "peter" and age = 20 or age = 30
select name,age from users where name = "peter" and (active = true or age = 30)
[/code]

The entire parser is just under 54 lines of code!

As you can see in the source code, the parser is basically a collection of small parsers which are combined into a parser for the entire language. Let’s have a look at the more basic ones:

The order clause
The order clause is probably the easiest part of the query. The production for the parser looks like this:

[code]
def order: Parser[Direction] = {
  "order" ~> "by" ~> ident ~ ("asc" | "desc") ^^ {
    case f ~ "asc" => Asc(f)
    case f ~ "desc" => Desc(f)
  }
}
[/code]

It matches a string which starts with order by, followed by an identifier, and ends with asc or desc. The tilde (~) combines two parsers in sequence, and the greater-than sign (>) after the tilde drops the result on the left side of the operator. That leaves fewer values to bind in the pattern match that builds the objects forming the AST.

Quite straightforward and easy to read.
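
To try the order production in isolation, here is a minimal self-contained sketch. It assumes the scala-parser-combinators library is on the classpath and that the parser extends JavaTokenParsers (which provides ident); the AST classes and the parseOrder helper are hypothetical stand-ins, not the ones from the repository.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical AST nodes standing in for the ones used in the post.
sealed trait Direction
case class Asc(field: String) extends Direction
case class Desc(field: String) extends Direction

object OrderParser extends JavaTokenParsers {
  // "order" and "by" are matched but dropped by ~>, so only the
  // identifier and the direction keyword reach the pattern match.
  def order: Parser[Direction] =
    "order" ~> "by" ~> ident ~ ("asc" | "desc") ^^ {
      case f ~ "asc"  => Asc(f)
      case f ~ "desc" => Desc(f)
    }

  // Hypothetical helper: parseAll only succeeds if the whole input is consumed.
  def parseOrder(input: String): Option[Direction] =
    parseAll(order, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

Calling OrderParser.parseOrder("order by name asc") should yield Some(Asc("name")), while an input that omits by is rejected.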

Typed predicates
I had more trouble implementing the typed predicates and still feel this could be written a bit more concisely:
[code]
def predicate = (
    ident ~ "=" ~ boolean ^^ { case f ~ "=" ~ b => BooleanEquals(f, b) }
  | ident ~ "=" ~ stringLiteral ^^ { case f ~ "=" ~ v => StringEquals(f, stripQuotes(v)) }
  | ident ~ "=" ~ wholeNumber ^^ { case f ~ "=" ~ i => NumberEquals(f, i.toInt) }
)
[/code]
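
One possible way to cut the repetition, offered as a sketch rather than as the version in the repository: parse the field name and the equals sign once, then hand the name to the typed alternatives with the >> (into) combinator. The boolean and stripQuotes definitions are not shown in the post, so the ones below are assumptions, as are the AST classes and the parsePredicate helper.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical AST nodes mirroring the names used in the post.
sealed trait Clause
case class BooleanEquals(field: String, value: Boolean) extends Clause
case class StringEquals(field: String, value: String) extends Clause
case class NumberEquals(field: String, value: Int) extends Clause

object PredicateParser extends JavaTokenParsers {
  // Assumed definitions; the post does not show these two.
  def boolean: Parser[Boolean] = ("true" | "false") ^^ (_.toBoolean)
  def stripQuotes(s: String): String = s.substring(1, s.length - 1)

  // The field name is parsed once and passed to whichever alternative matches.
  def predicate: Parser[Clause] = (ident <~ "=") >> { f =>
    boolean ^^ (b => BooleanEquals(f, b)) |
    stringLiteral ^^ (s => StringEquals(f, stripQuotes(s))) |
    wholeNumber ^^ (n => NumberEquals(f, n.toInt))
  }

  // Hypothetical helper for trying the parser on a single predicate.
  def parsePredicate(input: String): Option[Clause] =
    parseAll(predicate, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

The alternatives are tried in order, and each one starts again right after the equals sign if the previous one fails, so the three value types do not interfere with each other.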

And/Or and parentheses
Implementing and/or in the where clause proved to be more difficult than I expected, mainly due to the precedence rules when using parentheses. I asked around on Stack Overflow and read how this is done in this great article by Jim McBeath.

This resulted in the following solution:
[code]
def where: Parser[Where] = "where" ~> rep(clause) ^^ (Where(_: _*))

def clause: Parser[Clause] = (predicate | parens) * (
  "and" ^^^ { (a: Clause, b: Clause) => And(a, b) } |
  "or" ^^^ { (a: Clause, b: Clause) => Or(a, b) }
)

def parens: Parser[Clause] = "(" ~> clause <~ ")"
[/code]

This is effectively a way of repeating predicates, or clauses within parentheses, interleaved with and/or. The parens production just wraps the clause in parentheses.
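
To see how the * combinator folds a chain of predicates, here is a small self-contained sketch. It assumes scala-parser-combinators and JavaTokenParsers, reduces the predicate to numeric equality for brevity, and uses hypothetical AST classes and a hypothetical parseClause helper.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical AST nodes; only numeric equality is kept to stay short.
sealed trait Clause
case class NumberEquals(field: String, value: Int) extends Clause
case class And(left: Clause, right: Clause) extends Clause
case class Or(left: Clause, right: Clause) extends Clause

object ClauseParser extends JavaTokenParsers {
  def predicate: Parser[Clause] =
    ident ~ ("=" ~> wholeNumber) ^^ { case f ~ n => NumberEquals(f, n.toInt) }

  // p * sep parses p (sep p)* and combines the results from left to right
  // with the function that sep returns.
  def clause: Parser[Clause] = (predicate | parens) * (
    "and" ^^^ { (a: Clause, b: Clause) => And(a, b) } |
    "or" ^^^ { (a: Clause, b: Clause) => Or(a, b) }
  )

  def parens: Parser[Clause] = "(" ~> clause <~ ")"

  // Hypothetical helper for trying the parser on a bare clause.
  def parseClause(input: String): Option[Clause] =
    parseAll(clause, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

With this grammar, a = 1 and b = 2 or c = 3 becomes Or(And(a, b), c), and parentheses override the grouping: a = 1 and (b = 2 or c = 3) becomes And(a, Or(b, c)). Note that and and or share a single precedence level here and are simply combined left to right, so a = 1 or b = 2 and c = 3 becomes And(Or(a, b), c), whereas standard SQL would bind and more tightly than or.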

Optional parts
The where and order clauses are optional; this is specified using the opt method:
[code]
def query: Parser[Query] = operation ~ from ~ opt(where) ~ opt(order) ^^ {
  case operation ~ from ~ where ~ order => Query(operation, from, where, order)
}
[/code]

Since the Query object already accepts optional objects for the where and order clauses, binding is straightforward.
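
A stripped-down sketch of how opt behaves: it wraps the result of a parser in an Option, giving Some when the clause is present and None when it is absent. The Query and Order classes below are simplified, hypothetical stand-ins (the where clause is left out to keep the example short), and scala-parser-combinators with JavaTokenParsers is assumed.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Simplified, hypothetical stand-ins for the post's AST.
case class Order(field: String, ascending: Boolean)
case class Query(fields: List[String], table: String, order: Option[Order])

object QueryParser extends JavaTokenParsers {
  def select: Parser[List[String]] = "select" ~> rep1sep(ident, ",")
  def from: Parser[String] = "from" ~> ident
  def order: Parser[Order] =
    "order" ~> "by" ~> ident ~ ("asc" | "desc") ^^ {
      case f ~ dir => Order(f, dir == "asc")
    }

  // opt(order) has type Parser[Option[Order]], so the bound value can be
  // passed straight into a constructor that takes an Option.
  def query: Parser[Query] = select ~ from ~ opt(order) ^^ {
    case fields ~ table ~ ord => Query(fields, table, ord)
  }

  // Hypothetical helper for trying the parser on a full query string.
  def parseQuery(input: String): Option[Query] =
    parseAll(query, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

Parsing select name from users leaves the order field as None, while appending order by name desc fills it with Some(Order("name", false)).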

Conclusion
Being fairly new to this game, I had some trouble finding out how I was supposed to approach some of the problems I faced, but I managed to overcome them quickly. There is a lot of documentation in various blogs and books. Writing a parser in Java, Groovy or Ruby would probably have taken me more time and resulted in far more code.

Now, off to look for some nails to test my new hammer on!

Full sources can be found in the Scala-SQL-DSL github repository.


3 Responses to external DSLs with scala

  1. Looks pretty good. I know parser combinators from Haskell, and Real World Haskell has a chapter on using Parsec, although I have never used that library myself: http://book.realworldhaskell.org/read/using-parsec.html

  2. Very nice.

    The link in your second paragraph to ‘specs’ has a trailing %22 which gives a 404.

    Channing

  3. peter says:

    @Channing thanks for pointing that out, corrected the link.
