import scala.util.parsing.combinator1.Parsers
import scala.util.parsing.input.CharArrayReader
/*
The goal is to provide combinators to parse any permutation of a given
set of parsers, in a type-safe way and providing results in a consistent
order. These two requirements are of course related, since type safety
will require that results are in a predictable order. This is achieved.
This implementation eliminates the shortcomings of my previous attempt:
http://matt.immute.net/src/scala/PermParsers.scala
The shortcomings of that approach are mentioned here:
http://matt.immute.net/content/parsing-permutations-scala
The previous implementation is a reasonably direct translation of the Haskell
version in this paper:
http://citeseer.ist.psu.edu/451549.html
This version makes fundamental changes to make this more idiomatic in Scala.
As a pleasant benefit, we also rely less on implicit conversions.
I'm satisfied with this version. The only downside is that the presence of
the ~ class in the combinator library clutters the implementation. It would
be clearer if the parsers just used Tuple2.
I'll show the sample code first, which matches a pattern and counts the
number of occurrences of the numeral 3. Here is an example usage:
[~/test]% scalac PermParsers2.scala
[~/test]% scala Test '(1122233333)'
[1.6] parsed: 5
[~/test]% scala Test '(23311111)'
[1.6] parsed: 2
*/
object Test extends Parsers with PermParsers {
type Elem = Char
def ones = (elem('1')+)
def twos = (elem('2')+)
def threes = (elem('3')+)
// This is the important definition. Note that we don't need to provide
// any explicit tupling function. Chains of ~ are constructed, just as
// with the regular parser combinators.
def countThrees = '(' ~> permute(ones ~~ twos ~~ threes) <~ ')' ^^
{ case _ ~ _ ~ ts => ts.length }
def parse(s: String) =
println(countThrees(new CharArrayReader(s.toCharArray)))
def main(args: Array[String]) = parse(args(0))
}
/*
Now the implementation. We again provide these combinators as an option that
can be mixed into the standard Parsers trait.
Note that efficiency is still not my goal.
*/
trait PermParsers { this: Parsers =>
/*
* We'll need two base tuple injectors and a combinator. Left and right are
* obvious. The third one is not so obvious. If we get rid of ~, its type is:
* ((a,b) -> x) -> ((a,y),b) -> (x,y)
* It's a simple function, so I feel like it must be known and named, but
* I can't find a name for it. I just call it "roll" for lack of anything
* clearer. If you know this function's name, please tell me!
*/
private def left[A,B](a: A, b: B) = new ~(a,b)
private def right[B,A](b: B, a: A) = new ~(a,b)
private def roll[A,B,C,D](f: (A,C) => D)(ab: ~[A,B], c: C) =
new ~(f(ab._1, c), ab._2)
/*
Now for the permutation tree. Unlike the Haskell version, and the
previous Scala version, the add operation (~~) is defined for all
Perms. Therefore we just define it in OO style rather than as a
separate function with pattern matching.
*/
sealed abstract class Perms[A] {
/*
When we add a parser B to a permutation tree, its type and location
is reflected in the result type: Perms[A] becomes Perms[~[A,B]]
*/
def ~~[B](p: Parser[B]): Perms[~[A,B]]
}
/*
Any parser can act as a one-element permutation. To add another element
to a one-element permutation, we can choose to place it before or after
the existing element. Either way, we must choose the appropriate tupling
function. The type annotations are necessary.
*/
case class Leaf[A](a: Parser[A]) extends Perms[A] {
def ~~[B](p: Parser[B]): Perms[~[A,B]] =
Choice(List(
Branch(this, p, left[A,B]),
Branch(Leaf(p), a, right[B,A])))
}
/*
Next, define the type of branches in our tree. (Interior nodes will
be lists of such branches.) A branch is composed of a subtree, the current
parser, and a function which knows how to combine the results of
these into the result. Note that the result type is opaque here. One
might expect that it should be something like ~[A,B], but remember
that these branches represent the actual permutations, so in general
the result type is an arbitrary combination of components of A with B.
*/
private case class Branch[A,B,C](sub: Perms[A], p: Parser[B], f: (A,B) => C)
/*
Now the other case of Perms. A Choice is a list of branches where only
the result type is known. This hides the actual structure of each branch
(which will of course permute components of X and Y). The result type
encodes all the types of the elements of the permutation. To add an element
to a permutation, it's first added at the end and the resulting tuple is
constructed as usual (via left). Next it's inserted recursively into each
position in each subtree. For each element that we pass, we must roll the
relevant tuple constructor. The type annotations are again necessary.
*/
private case class Choice[A,B](chs:
List[Branch[X,Y,~[A,B]] forSome {type X; type Y}])
extends Perms[~[A,B]] {
def ~~[C](p: Parser[C]): Perms[~[~[A,B],C]] = {
val first = Branch(this, p, left[~[A,B],C])
def insert[X,Y](br: Branch[X,Y,~[A,B]]) = br match {
case Branch(sub, py, f) =>
Branch(sub ~~ p, py, roll[X,C,Y,~[A,B]](f))
}
val others = chs.map(insert(_))
Choice(first :: others)
}
}
/*
Finally convert a Perms to a Parser. This is pleasingly simple.
*/
def permute[A](ps: Perms[A]): Parser[A] = ps match {
case Leaf(a) => a
case Choice(chs) =>
val newchs = chs map {
case Branch(sub, p, f) => (permute(sub) ~ p) ^^ {case a ~ b => f(a,b)}
}
(newchs :\ (failure("no valid permutation"):Parser[A])) (_|_)
}
/*
The following definition provides a bit of syntactic sugar. Parsers will
be implicitly promoted to one-element permutations.
*/
implicit def parserIsLeaf[A](p: Parser[A]): Perms[A] = Leaf(p)
}