3/n - How do I in FP... trying several input for the same function
In the previous entry of this series, I wrote about error recovery in a functional way. But sometimes, we want to try several inputs until we retrieve (optionally) a result. This is the case if you have a function which compute a location from a string (think geo-search), and which optionally returns a result, if a place matched. We may try this function on several input strings and only care in the first result. In this article, we will see how to do this in a functional way.
Problem setup
Given a function def computeLocation(str: String): Option[Location]
which tries to infer a location from a string and a list of input strings sorted by order of relevance (meaning that the first element may result in a more interesting location than the second and so on), let's retrieve the location (optionally). To sum it up:
// The computeLocation function:
def computeLocation(s: String): Option[Location] = // the implementation
// this call may generate a long List
val inputs: List[String] = createInputsFromForm(aForm)
How implement val result: Option[Location] = ???
?
From scratch
The first solution is to call the computeLocation
function for every string in the input list:
inputs.flatMap(computeLocation).headOption
This solution is the most simple ever. Let's break it into pieces to analyze it.
Let's start from the end. As we care only about the first meaningful result (which may not exist), we call headOption
.
Now, back to the beginning of the line, we are calling flatMap
on a List
. The proper signature of flatMap
in Scala standard library is slightly different from the one we are used to:
def flatMap[B](f: A => IterableOnce[B]): List[B] = //...
In the standard library, flatMap
expects a function producing an IterableOnce
. Fortunately Option
implements this trait. The definition commonly used for flatMap
is more like this one:
// Given that F is a type with one parameter
def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
// For List this results in:
def flatMap[A, B](fa: List[A])(f: A => List[B]): List[B]
In the standard library, the trick with IterableOnce
spares us from writing something like:
inputs
.flatMap{ s =>
computeLocation(s).map(l => List(l)).getOrElse(Nil)
}
.headOption
In this code, we are manually transforming an Option
into a List
.
What's wrong with this implementation ? If you provide a dummy implementation for both computeLocation
and createInputsFromForm
, you will see that computeLocation
is called for every single item in the list. If we consider that computing the location is a costly operation, it is sad to waste resource computing, while we only care about the first meaningful answer. How to make the flatMap
sequence stop when we hit a result or the end of the list ?
Recursion by hand
If we break our problem in terms of imperative programming, we will easily come up with a while
statement (in pseudo code):
while no result and list has elements left
result = computeLocation with next element in the list
done
Let's encode this with recursion:
@scala.annotation.tailrec
def recur(next: List[String]): Option[Location] = {
next match {
case Nil => None
case head :: tail =>
val r = computeLocation(head)
if (r.isDefined) r else recur(tail)
}
}
recur(inputs)
What we do is just manually traverse the list recursively until computeLocation
gives a meaningful result. This is strictly the same as the imperative while
version, but in FP, we like to look further. We could for instance abstract over the computeLocation
and take as an extra parameter the function returning an Option
:
@scala.annotation.tailrec
def recur(next: List[String],
f: String => Option[Location]): Option[Location] = {
next match {
case Nil => None
case head :: tail =>
val r = f(head)
if (r.isDefined) r else recur(tail, f)
}
}
recur(inputs, computeLocation)
Looking at this code, do we really have to deal with String
and Location
? We do not use any of the properties of these types in recur
. Let's make them parameters.
@scala.annotation.tailrec
def recur[A, B](next: List[A], f: A => Option[B]): Option[B] = {
next match {
case Nil => None
case head :: tail =>
val r = f(head)
if (r.isDefined) r else recur(tail, f)
}
}
recur(inputs, computeLocation)
Following this reasoning we could go even further and add some mechanism to also deal with something else than Option
and List
. In fact, cats has such functions to hide recursion from you. It is called tailRecM
but as it deserves an entire post for itself, I won't talk about it here.
Back with the Scala standard library...
In the standard library, we can also find the collectFirst
function which, well do pretty much what we need !
inputs.collectFirst { str =>
computeLocation(str) match {
case Some(r) => r
}
}
Well, the careful reader would have noticed that the function we pass to collectFirst
is not total: it does not produce an output for every input.
As computeLocation
is total, then it is a bit sad to write a non-total version of it... The standard library also provides a function to transform a total function returning Option
to a partial one: def unlift[T, R](f: T => Option[R]): PartialFunction[T, R]
. So we can now write:
inputs.collectFirst(Function.unlift(computeLocation))
If you are using cats, you will find collectFirstSome
which will make this possible:
inputs.collectFistSome(computeLocation)
What if computeLocation
perform other effects ?
Let's change the problem a bit now. What if the computeLocation
function performs some network calls, or read a database to find a result ? These things are known to fail at one point or another. How to deal with that ?
Without adding too much complexity, let's consider that computeLocation
has a more expressive output: Either[Throwable, Option[Location]]
. This type means that computeLocation
may fail with a Throwable
or give a result if it was able to compute one (the absence of result being completely normal). The complete signature is now:
def computeLocation(rawStr: String): Either[Throwable, Option[Location]]
As you can see, the Option
result is now boxed into Either
. This makes both collectFirst
and collectFirstSome
impossible to use, as the type do not match now. In FP, this kind of boxing appears frequently, so that authors of functional libraries like cats already wrote functions to deal with this. In cats, you will often see functions whose name ends with a capital M
. In our case, it is easy to find collectFirstSomeM
which has this signature:
def collectFirstSomeM[G[_], B](f: A => G[Option[B]])
(implicit F: Foldable[F], G: Monad[G]): G[Option[B]]
As you can see, this definition is abstract. Note that this function is part of a class parametrized on F
, where F
is itself a type with a "hole" : F[_]
. In our example, this F
will be Either[Throwable,_]
(ie Either
with only one type parameter applied). To make it easier, here is a version (pseudo scala) where abstract types are replaced with the one we use in our example (without the implicit part):
def collectFirstSomeM[Either[Throwable,_], Location](f: String => IO[Option[Location]]): Either[Throwable,Option[Location]]
Look how computeLocation
has exactly the expected signature for the f
parameter ! What about the implicits parameters now ? In this case, we can read the definition as: "the collectFirstSomeM
method is available if and only if for the F
type we can implement the trait Foldable
and for the G
type the Monad
trait". In our case, with List
and Either
it is already done for us in cats so we can do:
import cats.implicits._
val inputs: List[String] = computeInputs(param)
val result: Either[Throwable, Option[Location]] = inputs.collectFirstSomeM(computeLocation)
Yet, if you write a simple sample for this, like :
def computeLocation(rawStr: String): Either[Throwable, Option[Location]] = {
println(s"Computing with $rawStr") // to see something
// Fake implementation
if (rawStr.length == 1) Left(new IllegalStateException("wrong state"))
else if (rawStr.length % 4 == 0) Right(Some(Location(2.0, 4.0, "Plop")))
else Right(None)
}
val inputs = List("a", "impair?", "plop", "other")
import cats.implicits._
println(
inputs.collectFirstSomeM(computeLocation)
)
You will notice that the output will be:
Computing with a
Left(java.lang.IllegalStateException: wrong state)
The execution is stopping as soon as a Left
is returned ! This is perfectly normal as Either
is a fail fast structure. And the requirement for Monad
should have caught our attention: Monad
express sequence, the next operation using the result of the previous one as input parameter.
So, as the caller of computeLocation
in our situation, we have two options:
- consider a
Left
as a blocker, and indeed, keep it, - try to recover from it by composing
computeLocation
with a function handling the situation with care.
Here is an example of the second choice, being ignoring errors (you probably don't want to do that, and perform a selective recovery, composing with things we saw earlier):
inputs.collectFirstSomeM(
s => computeLocation(s).recoverWith(_ => Either.right(None))
)
Conclusion
In this article, we saw how to deal with something simple: calling a function for elements in a list and keeping the first meaningful result (if any). Even if it was simple, what I would like to highlight is how this solution is composable. Every type in the design models an intent:
Option
models that the result may be irrelevant or presentEither
models that a computation may failFoldable
express that we want to build a single value from an initial structure (in our case, we fold the list to a single result)Monad
express that we compute in sequence: indeed, there is no parallelism involved in our solution, we try the input strings one after the other.
Yet, the work here is not over. Next time, we will see how we can adapt this example to make it less "sequential".