Parsing permutations in Scala

A recent thread on the Scala list raises the question of parsing permutations using parser combinators. This is a situation where parser combinators offer a substantial improvement over EBNF. David Pollak proposes a solution, but it suffers from two problems:

  • we are only able to parse permutations where all elements have the same type.
  • the results are returned in input order, rather than declaration order, leaving us to figure out how to distinguish them.

Neither of these is disaster, but I hope it's possible to do better.

We have two goals:

  • allow permutations of parsers of any type, while preserving type safety without pattern matching or casts.
  • return results in a predictable order (the order specified in the program text), regardless of how they appear in the input.

A solution in Haskell is given by Baars, Löh and Swierstra in this paper, so a first try is to translate this into Scala. This is possible (although there are some points of interest), which results in a solution satisfying the requirements, but with one serious shortcoming. I don't know whether it's possible to perfect the solution, but I do hope that it is.

First let's see an example client session:

[~/test]% scalac PermParsers.scala 
[~/test]% scala CharParsers '(abc)'
[1.6] parsed: (a,b,c)
[~/test]% scala CharParsers '(bca)'
[1.6] parsed: (a,b,c)

With our solution, this is implemented by the code:

def phrase = '(' ~> permute(tup3 ~~ 'a' ~~ 'b' ~~ 'c') <~ ')'

with a suitable definition of tup3:

def tup3[A,B,C] = (a:A) => (b:B) => (c:C) => (a,b,c)

This definition is the major shortcoming. The Haskell solution relies crucially on fully curried functions. The requirement to provide an explicit handler function is present in the Haskell version, but it's considerably less onerous since Haskell's tuple constructors are already fully curried:

pPerms (,,) <<$>> pInt <<*>> pChar <<*>> pBool

As it's pretty unnatural in Scala to define fully curried functions, the rest of the parser combinator library uses the tuple-like case class ~ and several convenient definitions to apply multi-argument functions to flattened chains of ~ (see here). The ideal and more idiomatic solution would be to write:

def phrase = '(' ~> permute('a' ~~ 'b' ~~ 'c') <~ ')'

and assign the type Parser[~[~[Char,Char],Char]]. I hope this is possible, but I have not yet been able to accomplish it. In fact, my first few attempts ran into very fundamental problems. If I have more time, I'll give it another try, but meanwhile, you can examine the fully commented implementation. These combinators are implemented as a trait that can be used with Scala's standard Parsers trait:

object Test extends Parsers with PermParsers { ... }

The only difficulties in the implementation had to do with limitations of Scala's type inference (requiring some guesswork about where to put type annotations), and some difficulties with existential types. Existentials are poorly documented, and my first few attempts failed for reasons that I'm still not sure I understand. Scala's existential types desperately need a clear and precise explication.