import scala.util.parsing.combinator1.Parsers import scala.util.parsing.input.CharArrayReader /* The goal is to provide combinators to parse any permutation of a given set of parsers, in a type-safe way and providing results in a consistent order. These two requirements are of course related, since type safety will require that results are in a predictable order. This is achieved. This implementation eliminates the shortcomings of my previous attempt: http://matt.immute.net/src/scala/PermParsers.scala The shortcomings of that approach are mentioned here: http://matt.immute.net/content/parsing-permutations-scala The previous implementation is a reasonably direct translation of the Haskell version in this paper: http://citeseer.ist.psu.edu/451549.html This version makes fundamental changes to make this more idiomatic in Scala. As a pleasant benefit, we also rely less on implicit conversions. I'm satisfied with this version. The only downside is that the presence of the ~ class in the combinator library clutters the implementation. It would be clearer if the parsers just used Tuple2. I'll show the sample code first, which matches a pattern and counts the number of occurrences of the numeral 3. Here is an example usage: [~/test]% scalac PermParsers2.scala [~/test]% scala Test '(1122233333)' [1.6] parsed: 5 [~/test]% scala Test '(23311111)' [1.6] parsed: 2 */ object Test extends Parsers with PermParsers { type Elem = Char def ones = (elem('1')+) def twos = (elem('2')+) def threes = (elem('3')+) // This is the important definition. Note that we don't need to provide // any explicit tupling function. Chains of ~ are constructed, just as // with the regular parser combinators. def countThrees = '(' ~> permute(ones ~~ twos ~~ threes) <~ ')' ^^ { case _ ~ _ ~ ts => ts.length } def parse(s: String) = println(countThrees(new CharArrayReader(s.toCharArray))) def main(args: Array[String]) = parse(args(0)) } /* Now the implementation. We again provide these combinators as an option that can be mixed into the standard Parsers trait. Note that efficiency is still not my goal. */ trait PermParsers { this: Parsers => /* * We'll need two base tuple injectors and a combinator. Left and right are * obvious. The third one is not so obvious. If we get rid of ~, its type is: * ((a,b) -> x) -> ((a,y),b) -> (x,y) * It's a simple function, so I feel like it must be known and named, but * I can't find a name for it. I just call it "roll" for lack of anything * clearer. If you know this function's name, please tell me! */ private def left[A,B](a: A, b: B) = new ~(a,b) private def right[B,A](b: B, a: A) = new ~(a,b) private def roll[A,B,C,D](f: (A,C) => D)(ab: ~[A,B], c: C) = new ~(f(ab._1, c), ab._2) /* Now for the permutation tree. Unlike the Haskell version, and the previous Scala version, the add operation (~~) is defined for all Perms. Therefore we just define it in OO style rather than as a separate function with pattern matching. */ sealed abstract class Perms[A] { /* When we add a parser B to a permutation tree, its type and location is reflected in the result type: Perms[A] becomes Perms[~[A,B]] */ def ~~[B](p: Parser[B]): Perms[~[A,B]] } /* Any parser can act as a one-element permutation. To add another element to a one-element permutation, we can choose to place it before or after the existing element. Either way, we must choose the appropriate tupling function. The type annotations are necessary. */ case class Leaf[A](a: Parser[A]) extends Perms[A] { def ~~[B](p: Parser[B]): Perms[~[A,B]] = Choice(List( Branch(this, p, left[A,B]), Branch(Leaf(p), a, right[B,A]))) } /* Next, define the type of branches in our tree. (Interior nodes will be lists of such branches.) A branch is composed of a subtree, the current parser, and a function which knows how to combine the results of these into the result. Note that the result type is opaque here. One might expect that it should be something like ~[A,B], but remember that these branches represent the actual permutations, so in general the result type is an arbitrary combination of components of A with B. */ private case class Branch[A,B,C](sub: Perms[A], p: Parser[B], f: (A,B) => C) /* Now the other case of Perms. A Choice is a list of branches where only the result type is known. This hides the actual structure of each branch (which will of course permute components of X and Y). The result type encodes all the types of the elements of the permutation. To add an element to a permutation, it's first added at the end and the resulting tuple is constructed as usual (via left). Next it's inserted recursively into each position in each subtree. For each element that we pass, we must roll the relevant tuple constructor. The type annotations are again necessary. */ private case class Choice[A,B](chs: List[Branch[X,Y,~[A,B]] forSome {type X; type Y}]) extends Perms[~[A,B]] { def ~~[C](p: Parser[C]): Perms[~[~[A,B],C]] = { val first = Branch(this, p, left[~[A,B],C]) def insert[X,Y](br: Branch[X,Y,~[A,B]]) = br match { case Branch(sub, py, f) => Branch(sub ~~ p, py, roll[X,C,Y,~[A,B]](f)) } val others = chs.map(insert(_)) Choice(first :: others) } } /* Finally convert a Perms to a Parser. This is pleasingly simple. */ def permute[A](ps: Perms[A]): Parser[A] = ps match { case Leaf(a) => a case Choice(chs) => val newchs = chs map { case Branch(sub, p, f) => (permute(sub) ~ p) ^^ {case a ~ b => f(a,b)} } (newchs :\ (failure("no valid permutation"):Parser[A])) (_|_) } /* The following definition provides a bit of syntactic sugar. Parsers will be implicitly promoted to one-element permutations. */ implicit def parserIsLeaf[A](p: Parser[A]): Perms[A] = Leaf(p) }