noodling towards a functional brain

Thursday, July 29, 2010

A monad for failure-tolerant computations.

I was working on a problem today where some of the computations could fail, but degrade gracefully while providing information about how exactly they failed so that clients could choose whether or not to use the result. This is what I came up with:
/**
 * A monad used to entrain computations on a value that might fail.
 * This is distinct from Either in that it allows computations to continue
 * in the presence of errors, with the possibility of a degraded result.
 * The user of the result can then decide its suitability for use based upon
 * any errors that might be returned. Errors must form a semigroup.
 */
sealed trait Attempt[E, +A] {
  def value: A
  def map[B](f: A => B): Attempt[E, B]
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B]

  def error: Option[E]
  def either: Either[E, A]
}

case class Success[E, +A](value: A) extends Attempt[E, A] {
  def map[B](f: A => B): Attempt[E, B] = Success(f(value))
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B] = f(value)
  def error = None
  def either = Right(value)
}

case class Failure[E, +A](err: E, value: A)(implicit append: (E, E) => E) extends Attempt[E, A] {
  def map[B](f: A => B): Attempt[E, B] = Failure(err, f(value))
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B] = f(value) match {
    case Failure(e, b) => Failure(append(e, err), b)
    case Success(b)   => Failure(err, b)
  }

  def error = Some(err)
  def either = Left(err)
}
Pretty trivial, but maybe occasionally useful. Here's how it looks in my code:
  def mergeComponents(f: String => JSONReport): Attempt[List[String], JSONReport] = {
    components.map(f).map(_.mergeComponents(f)).foldLeft[Attempt[List[String], JSONReport]](Success(this)) {
      (result, comp) => for {
        cur <- result
        jr  <- comp
        properties     <- mergeProperties(cur, jr)
        queries        <- mergeQueries(cur, jr)
        dataAliases    <- mergeAliases(cur, jr, queries.values)
        dataTransforms <- mergeTransforms(cur, jr, dataAliases.values)
        dataMailers    <- mergeMailers(cur, jr, dataTransforms.values)
      } yield {
        JSONReport(
          cur.reportId, cur.active, cur.version, properties,
          queries, dataAliases, dataTransforms, dataMailers,
          cur.dataRange, Nil
        )
      }
    }

Here each of the various mergeX methods return an Attempt[List[String], X] where X is something needed to build the ultimate report. Here I'm just aggregating lists of strings describing the errors, but of course any type for which (E, E) => E is defined would work.

Attempt. For all those times where you want a monad that says, "Hey, I maybe couldn't get exactly what you asked for. Maybe it's little broken, maybe it won't work right, but this the best I could do. You decide."

EDIT:

After a bit of thinking, I realized that this monad is really more general than being simply related to success or failure - it simply models a function that may or may not produce some additional metadata about its result. Then a lightbulb went off, and quick google search confirmed... yup, I just reinvented the writer monad. It's not *exactly* like Writer, because it just requires a semigroup for E instead of a monoid, and the presence of a "log" is optional, so maybe it's better suited than Writer for a few instances.

The one really nice thing about rediscovering well-known concepts is that doing the derivation for yourself, in the context of some real problem, gives you a much more complete understanding of where the thing you reintevented is applicable!

Inspired by michid's example in the comments below, here's a simpler example that demonstrates some utility his doesn't quite capture.

implicit val append = (l1: List[String], l2: List[String]) => l1 ::: l2

def succ(s: String) = Success[List[String], String]("result: success")
val fail: String => Attempt[List[String], String] = { 
  var count = 0 
  (s: String) => {
    count += 1
    Failure(List("failure " + count), s + " then failure")
  }
}

val r = for {
  x <- succ("here we go!")
  y <- fail(x)
  z <- fail(y)
} yield z

println(r)
This results in:
Failure(List(failure 2, failure 1),result: success then failure then failure)

Thursday, July 15, 2010

An unfortunate situation.

I've been working a bit with Hadoop lately. As many will know, there are essentially two versions of Hadoop that coexist within the Hadoop API; a deprecated legacy API and a newer one that is not nearly so feature-complete. It's a source of confusion to say the least.

Earlier today, I asked in the #hadoop channel on freenode, why were the breaking changes not simply made on a separate branch, the major version number incremented, and so forth such that everyone could move forward? The answers that I got back illuminated a troublesome trend that I think is worth considering for those projects (like my beloved Scala) that are moving towards professional support models of funding.

During the discussion, I commented that there is an unfortunate reality that the largest companies, who may provide the greatest support any given open-source project are also those who will likely be most conservative with respect to change. So as soon as open-source projects grow large enough that companies selling commercial support for those projects survive, there is an immediate disincentive for the support companies to make breaking changes. And stagnation ensues.

The response was startling. People who had been involved with numerous prominent businesses and projects immediately replied with stories of their own projects suffering such a fate, sometimes to the point that the project died entirely.

I think that this "success effect" can be traced to a single, simple cause, with a single, simple solution. Companies (and the individuals that make them up) need to stop thinking of mission-critical software as something that is ever complete. Marketing is never complete; sales is never done with their job, purchasing and HR are never done, nor is accounting. Software is no different - as engineers, we do not (and should not purport to) produce solutions to the problems that companies have; instead, we automate and support the operation of businesses so that businesses can do more, serve more customers, and in the end create more wealth. A project is finished when nobody is using it; never before then.

Friday, February 05, 2010

Trick of the day

James Iry just mentioned something I'd never noticed before in #scala:
scala> def f(i: Int, x: Int) = i match { case `x` => println("x!"); case _ => println("nope") }
f: (i: Int,x: Int)Unit

scala> f(1, 1)
x!

scala> f(1, 2)
nope
That's one of those things that I'd idly wondered how to do, but hadn't had a use case so hadn't really gone looking. Good to know it's possible and simple... and more or less makes sense. More or less. Actually, I guess I'm kind of glad that the following doesn't compile:
scala> def f(i: Int) = i match { case `yield` => println(`yield`) }                           
<console>:4: error: not found: value yield
       def f(i: Int) = i match { case `yield` => println(`yield`) }
I can live with the restriction as not being able to use a reserved word for a variable on the lhs of a pattern match, though it does make for a weird inconsistency:
scala> val (`yield`, x) = (1, 1)
<console>:4: error: not found: value yield
       val (`yield`, x) = (1, 1)
            ^

scala> val `yield` = 1
yield: Int = 1
Chalk up one more to the box of "pain points caused by using pattern matching for parallel assignment."

Wednesday, February 03, 2010

On Extraction

Scala case classes are great; they give you a tremendous leg up when it comes to implementing pattern matching. However, there's a drawback: the unapply method of case class companion objects take arguments of the type of the case class, and return Option[TupleN[...]] where N is the number of variables to the case class constructor. What's more, it's not possible to overload or override the unapply method to deconstruct other types of by creating your own companion object on the side, as shown in this REPL session:
scala> case class  Foo(val i: Int, val s: String)
defined class Foo

scala> object Foo {
     |   def unapply(s: String): Option[Foo] = Some(new Foo(s.substring(0, 2).toInt, s.substring(2)))
     | }
defined module Foo

scala> "12abcd" match {
     |   case Foo(i, s) => println(s, i)
     | }
<console>:10: error: wrong number of arguments for object Foo
         case Foo(i, s) => println(s, i)
              ^
<console>:10: error: wrong number of arguments for method println: (Any)Unit
         case Foo(i, s) => println(s, i)
Sadly, it doesn't even seem to be possible to move this extractor to another class and get the desired result:
scala> object SFoo {                                                                                 
     |   def unapply(s: String): Option[Foo] = Some(new Foo(s.substring(0, 2).toInt, s.substring(2)))
     | }
defined module SFoo

scala> "12abcd" match {                 
     |   case SFoo(i, s) => println(s, i)
     | }
<console>:9: error: wrong number of arguments for object SFoo
         case SFoo(i, s) => println(s, i)
              ^
<console>:9: error: wrong number of arguments for method println: (Any)Unit
         case SFoo(i, s) => println(s, i)

Fortunately, there is a way around this issue: simply define your own class that extends ProductN! Then, you can define as many extractors as you want in different pseudo-companion objects and use them idiomatically:
scala> class Foo(val i: Int, val s: String) extends Product2[Int, String] {
     |   val (_1, _2) = (i, s)
     | }
defined class Foo

scala> object Foo {
     |   def unapply(f: Foo): Option[(Int, String)] = Some((f.i, f.s))
     | }
defined module Foo

scala> object SFoo {
     |   def unapply(s: String): Option[Foo] = Some(new Foo(s.substring(0, 2).toInt, s.substring(2)))
     | }
defined module SFoo

scala> "12abcd" match {
     |   case SFoo(i, s) => println(s, i)
     | }
(abcd,12)

scala> new Foo(1, "hi") match {
     |   case Foo(i, s) => println(s, i)
     | }
(hi,1) 

Also, naturally, the trivial extractor (Foo => Foo) can supplant the Foo => Tuple2 extractor above:
scala> object Foo {
     |   def unapply(f: Foo): Option[Foo] = Some(f)
     | }

scala> new Foo(1, "hi") match {                  
     |   case Foo(i, s) => println(s, i)         
     | }                                
(hi,1)

About Me

My photo
aspiring to elegant simplicity