Either Cat or Scary Cat

The cats project for scala is wonderful. I love how it has opened up the more advanced functional programming concepts in scala, and I really love the effort they have gone to with the documentation. The whole thing is a joy to use.

It’s easy to get carried away with abstractions in scala, especially if, like me, you come from java or C# and you one day come to the realisation that the scala compiler can do so much more. There’s a type here and a type of types there and you start seeing these patterns and mapping and flatMapping over things which your past self would never have dreamed of mapping over. It’s great.

Still, we get carried away sometimes. I get the feeling there’s some tension in the air over scala becoming too haskelly, or not haskelly enough, and we and our poor colleagues never get a chance to breathe and settle in to the world of scala.

Anyway. EitherT is a data type offered by cats, and despite the great documentation, I don’t think the EitherT docs do a great job explaining why something like this exists. I’ve tried to come up with some examples to get the point across.

To sum up (spoiler alert), EitherT is really great - but I’m not 100% convinced. There are boilerplate reductions to be made, but there is a hidden cost to a lot of these abstract FP concepts: they take so much brain power to get your head around. There’s a balance to strike somewhere between the dirty java style simplicity and elegant FP simplicity, and scala sits right in the middle. But if none of that bothers you, then by all means, abstract away.

Future of Either

Imagine a function. This function calls a service. The service is maintained by another team so it’s really slow, and it fails all the time, sometimes because we supply invalid input, but mostly because the other team doesn’t look after their servers properly.

We return a Future to help with the slowness, wrapping an Either to deal with errors.

I’ll come up with a contrived example, but this is the sort of code you’re likely to come across sooner or later in your scala career.

// We need to import the Future class,
// and bring an execution context into implicit scope.
import scala.concurrent.Future
implicit val ec: scala.concurrent.ExecutionContext = scala.concurrent.ExecutionContext.global

// We will be modelling a bunch of functions which return something of type
// Future[Either[Error, A]]. I don't want to keep writing it so I'll use a 
// type alias with a very broad Error trait, leaving A generic.
trait Error
type Result[A] = Future[Either[Error, A]]

def init: Result[String] = slowAndUncertain("init")

// For convenience, a method to wait for our futures to finish.
// I don't care how long it takes.
import scala.concurrent.Await
import scala.concurrent.duration.Duration
def await[A](f: Future[A]): A = Await.result(f, Duration.Inf)

println(await(init))

I won’t bother to implement slowAndUncertain here, but I’ll provide the source code for a fully functioning, self contained SBT demonstration along with the source code for this blog here. We just need to know that it takes a String and returns a Result[String].
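For a rough idea of what such a function could look like, here is a minimal sketch. It is not the implementation from the demo project: the InvalidInput error case, the 100ms delay, and the rule that empty input fails are all assumptions made up for illustration.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object SlowAndUncertainSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  trait Error
  // Hypothetical error case, not part of the original post.
  case object InvalidInput extends Error
  type Result[A] = Future[Either[Error, A]]

  // Hypothetical stand-in: sleeps to simulate latency, then rejects empty input.
  def slowAndUncertain(in: String): Result[String] = Future {
    Thread.sleep(100) // pretend the remote call is slow
    if (in.isEmpty) Left(InvalidInput) else Right(s"$in: ok")
  }

  def await[A](f: Future[A]): A = Await.result(f, Duration.Inf)

  def main(args: Array[String]): Unit = {
    println(await(slowAndUncertain("init"))) // Right(init: ok)
    println(await(slowAndUncertain("")))     // Left(InvalidInput)
  }
}
```

The only thing that matters for the rest of the post is the signature: a String goes in, and a Result[String] comes out some time later.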

Nothing about this looks particularly scary, but what happens when we have more calls to make, with dependencies between them?

I’ll try to make this example less contrived. Let’s say we’re writing an API which finds all the pubs close to a user. First we ask for the user’s ID. Then we authenticate the user by calling an authentication service. Then we get the user’s location, and finally get the list of nearby pubs.

If we get an error somewhere along the chain, we shouldn’t make any more calls, and instead just return the error. This is a really great mindset for functional error handling; check out ScottW’s amazing Railway Oriented Programming for more details.

// Define data types
case class UserId(s: String)
case class AuthenticatedUser(id: UserId)
case class UserLocation(user: AuthenticatedUser)
case class Pub()

//  Mock up service calls
def askForId:                             Result[UserId] = ???
def authenticate(userId: UserId):         Result[AuthenticatedUser] = ???
def getLocation(user: AuthenticatedUser): Result[UserLocation] = ???
def findPubs(userLocation: UserLocation): Result[List[Pub]] = ???

// Get the list of pubs, by flatMapping each result in succession.
def pubs: Result[List[Pub]] = askForId.flatMap {
  case Left(e) => Future.successful(Left(e))
  case Right(r) => authenticate(r).flatMap {
    case Left(e) => Future.successful(Left(e))
    case Right(r) => getLocation(r).flatMap {
      case Left(e) => Future.successful(Left(e))
      case Right(r) => findPubs(r)
    }
  }
}

println(await(pubs))

Apart from the arrows all lining up, this code is a bit ugly. It’s hard to look at it and know exactly what is happening. We have a lot of nested anonymous partial functions and repeated code. I’ve taken a few liberties here: as you are no doubt aware, production code is not going to be as simple as this. It will be full of log entries and bits and pieces sticking out all over the place, but you get the picture (the arrows never line up in production).

I’ll assume you’re familiar with what the flatMap calls are doing, but to address some scala notation which bothers me slightly, the syntax x.flatMap { case y => ... } is shorthand for x.flatMap(r => r match { case y => ... }). I’m not a huge fan of this shortcut because I think it encourages code like the above, rather than breaking things down into functions. It only annoys me when it’s stacked a few layers deep like that.
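To make the equivalence concrete, here is a small sketch. It uses Option instead of Future (an assumption chosen only so the two results can be compared directly), and the names shorthand and longhand are made up for the example.

```scala
object CaseShorthand {
  val input: Option[Either[String, Int]] = Some(Right(20))

  // Shorthand: a block of case clauses passed straight to flatMap.
  val shorthand: Option[Int] = input.flatMap {
    case Left(_)  => None
    case Right(n) => Some(n + 1)
  }

  // Longhand: an explicit lambda whose body is a match expression.
  val longhand: Option[Int] = input.flatMap(r => r match {
    case Left(_)  => None
    case Right(n) => Some(n + 1)
  })

  def main(args: Array[String]): Unit = {
    println(shorthand == longhand) // true
  }
}
```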

Let’s see what we can do to refactor this into something more manageable. We have a common pattern which we use on each result - we check if the result is a success or a failure. On failure, we return the error untouched, bundled back up in a future of either. On success we pass our result to the next function. We can write a function to describe this behaviour.

// Curried function: given an onSuccess function, it returns a function
// which we can use inside our flatMap calls.
def handle[A,B](onSuccess: A => Result[B])
  (in: Either[Error, A]): Result[B] = in match {
  case Left(e) => Future.successful(Left(e))
  case Right(x) => onSuccess(x)
}

// This allows us to write the entire chain as nested flatMap calls
def betterPubs: Result[List[Pub]] = askForId
  .flatMap(handle(authenticate(_)
    .flatMap(handle(getLocation(_)
      .flatMap(handle(findPubs))))))

println(await(betterPubs))

That’s a bit better, I think…?

As a quick aside, I don’t know what it is about scala’s for-comprehensions which gives me so much trouble, but I rarely use them, preferring explicit function calls (I’m the same about haskell’s do notation). Maybe I just need more practice. I love this example because the for-comprehension is vastly easier to understand. Have a look, it’s a stark contrast:

def evenBetterPubs: Result[List[Pub]] = for {
  id       <- askForId
  user     <- handle(authenticate)(id)
  location <- handle(getLocation)(user)
  pubs     <- handle(findPubs)(location)
} yield pubs

println(await(evenBetterPubs))

Much better! I’ve broken the problem up into functions, the functions all have names and we have a flat structure. The compiler is doing all the hard work for us.
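To check that the chain really does stop at the first error, here is a self-contained sketch with stubbed services. The stub bodies, the AuthFailed error, and the call counter are assumptions invented for this demonstration; only handle and the shape of the for-comprehension come from the code above.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ShortCircuitDemo {
  implicit val ec: ExecutionContext = ExecutionContext.global

  trait Error
  case object AuthFailed extends Error // hypothetical error case
  type Result[A] = Future[Either[Error, A]]

  case class UserId(s: String)
  case class AuthenticatedUser(id: UserId)
  case class UserLocation(user: AuthenticatedUser)
  case class Pub()

  // Counts how many service calls were actually made.
  val calls = new AtomicInteger(0)

  // Stubbed services: authentication always fails in this sketch.
  def askForId: Result[UserId] = {
    calls.incrementAndGet(); Future.successful(Right(UserId("alice")))
  }
  def authenticate(id: UserId): Result[AuthenticatedUser] = {
    calls.incrementAndGet(); Future.successful(Left(AuthFailed))
  }
  def getLocation(user: AuthenticatedUser): Result[UserLocation] = {
    calls.incrementAndGet(); Future.successful(Right(UserLocation(user)))
  }
  def findPubs(loc: UserLocation): Result[List[Pub]] = {
    calls.incrementAndGet(); Future.successful(Right(List(Pub())))
  }

  // Same helper as above: pass errors through, continue on success.
  def handle[A, B](onSuccess: A => Result[B])(in: Either[Error, A]): Result[B] =
    in match {
      case Left(e)  => Future.successful(Left(e))
      case Right(x) => onSuccess(x)
    }

  def pubs: Result[List[Pub]] = for {
    id       <- askForId
    user     <- handle(authenticate)(id)
    location <- handle(getLocation)(user)
    found    <- handle(findPubs)(location)
  } yield found

  def main(args: Array[String]): Unit = {
    println(Await.result(pubs, Duration.Inf)) // Left(AuthFailed)
    println(calls.get()) // 2: getLocation and findPubs were never called
  }
}
```

One thing the sketch makes obvious is that id, user and location are each still an Either inside the for-comprehension; it is handle that unpacks them before calling the next service.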

Does it get any better than this?

EitherT of Future

EitherT inverts the order of our Future[Either[A,B]] into EitherT[Future,A,B]. I’ll be working with another type alias to save space - just remember that whenever you see a ResulT[A], it can be substituted with EitherT[Future, Error, A]. Our existing methods can be wrapped in EitherT.apply to transform the returned Result into a ResulT - it’s really no effort at all.

There’s just one more thing - I made it all this way without mentioning the m word, but there’s one extra line needed to let cats do its magic. We have to import a typeclass instance for Future. I won’t go into great detail, just know that the instance we take from cats.instances.future._ allows cats to do work on the futures. This one line encompasses all of my whining in the intro about brain power. Don’t get me wrong, typeclasses are great and super powerful, but using them in scala has never felt natural to me, and I can’t pinpoint exactly why yet. Anyway, I’ll leave it for now.

import cats.data.EitherT

// I'll use another type alias to save space. Sorry about the name.
type ResulT[A] = EitherT[Future, Error, A]

// It's trivial to transform our old methods to return EitherT[Future].
def askForIdT: ResulT[UserId]
  = EitherT(askForId)
def authenticateT(userId: UserId): ResulT[AuthenticatedUser]
  = EitherT(authenticate(userId))
def getLocationT(user: AuthenticatedUser): ResulT[UserLocation]
  = EitherT(getLocation(user))
def findPubsT(userLocation: UserLocation): ResulT[List[Pub]]
  = EitherT(findPubs(userLocation))

// We need an instance of the Monad typeclass for Future.
// (From cats 2.2.0 onwards the instance is found without this import.)
import cats.instances.future._

def bestPubs: ResulT[List[Pub]] = for {
  id       <- askForIdT
  user     <- authenticateT(id)
  location <- getLocationT(user)
  pubs     <- findPubsT(location)
} yield pubs

println(await(bestPubs.value))

And that’s really all there is to it. No added boilerplate! I don’t have to write my own custom handler, and I get an easy flat structure over my methods, right out of the box.
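Not every step in a chain will already be a Future of Either, so it helps to know a few other ways of getting a value into EitherT. The sketch below assumes cats is on the classpath and uses the constructors as I understand them from the cats API (EitherT.apply, rightT, leftT, fromEither and liftF); the NotFound error and the numbers are made up for illustration.

```scala
import cats.data.EitherT
import cats.instances.future._ // only required before cats 2.2.0
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object EitherTConstructors {
  implicit val ec: ExecutionContext = ExecutionContext.global

  trait Error
  case object NotFound extends Error // hypothetical error case
  type ResulT[A] = EitherT[Future, Error, A]

  // Wrap an existing Future[Either[Error, A]].
  def wrapped: ResulT[Int] =
    EitherT(Future.successful(Right(1): Either[Error, Int]))
  // Lift a plain value into the success side.
  def pure: ResulT[Int] = EitherT.rightT[Future, Error](2)
  // Lift a plain error into the failure side.
  def failed: ResulT[Int] = EitherT.leftT[Future, Int](NotFound: Error)
  // Lift an Either which has already been computed.
  def fromEither: ResulT[Int] =
    EitherT.fromEither[Future](Right(3): Either[Error, Int])
  // Lift a Future which has no Either inside it.
  def lifted: ResulT[Int] =
    EitherT.liftF[Future, Error, Int](Future.successful(4))

  // The chain stops at `failed`, so `lifted` never contributes.
  def sum: ResulT[Int] = for {
    a <- wrapped
    b <- pure
    c <- fromEither
    d <- failed
    e <- lifted
  } yield a + b + c + d + e

  def await[A](f: Future[A]): A = Await.result(f, Duration.Inf)

  def main(args: Array[String]): Unit = {
    println(await(sum.value)) // Left(NotFound)
  }
}
```

Calling .value at the end unwraps the EitherT back into the plain Future of Either we started with.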

There’s less complexity here, and therefore less bug-generating capacity, which is great. The overhead comes when you want to understand how it works - for that, I highly recommend that you head to the cats documentation and give it a try.
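As a rough picture of what is going on underneath, here is a simplified, hand-rolled wrapper specialised to Future. This is a sketch and not the cats implementation: the real EitherT is generic over any F with a suitable typeclass instance, whereas the FutureEither name and everything inside it are made up for illustration. It assumes scala 2.12 or later, where Either is right-biased.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical, simplified stand-in for EitherT[Future, E, A].
final case class FutureEither[E, A](value: Future[Either[E, A]]) {

  // Transform the success value, leaving errors untouched.
  def map[B](f: A => B)(implicit ec: ExecutionContext): FutureEither[E, B] =
    FutureEither(value.map(_.map(f)))

  // Run the next step only on success: this is `handle` from earlier, built in.
  def flatMap[B](f: A => FutureEither[E, B])(
      implicit ec: ExecutionContext): FutureEither[E, B] =
    FutureEither(value.flatMap {
      case Left(e)  => Future.successful(Left(e))
      case Right(a) => f(a).value
    })
}

object FutureEitherDemo {
  implicit val ec: ExecutionContext = ExecutionContext.global

  def ok(n: Int): FutureEither[String, Int] =
    FutureEither(Future.successful(Right(n)))
  def fail(msg: String): FutureEither[String, Int] =
    FutureEither(Future.successful(Left(msg)))

  def good: FutureEither[String, Int] =
    for { a <- ok(1); b <- ok(2) } yield a + b
  def bad: FutureEither[String, Int] =
    for { a <- ok(1); b <- fail("boom"); c <- ok(3) } yield a + b + c

  def main(args: Array[String]): Unit = {
    println(Await.result(good.value, Duration.Inf)) // Right(3)
    println(Await.result(bad.value, Duration.Inf))  // Left(boom)
  }
}
```

Because the wrapper has its own map and flatMap, the for-comprehension binds the unwrapped success values directly, which is why the EitherT version needs no handle calls.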

I bet there are a bunch of other benefits to do with composability and reusability with other types, but I haven’t delved deep enough yet. Thanks for reading.