Decoding rows as arbitrary types

Other tutorials covered decoding rows as collections, tuples and case classes. While those are the most common scenarios, it is sometimes necessary to decode rows into types that are none of these.

Let’s reuse the example data from the previous tutorials:

val rawData: java.net.URL = getClass.getResource("/wikipedia.csv")

This is what this data looks like:

scala.io.Source.fromURL(rawData).mkString
// res0: String = """Year,Make,Model,Description,Price
// 1997,Ford,E350,"ac, abs, moon",3000.00
// 1999,Chevy,"Venture ""Extended Edition""","",4900.00
// 1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00
// 1996,Jeep,Grand Cherokee,"MUST SELL!
// air, moon roof, loaded",4799.00"""

Now, let’s imagine we’d like to decode each row into a class that is not a case class:

class Car(val year: Int, val make: String, val model: String, val desc: Option[String], val price: Float) {
  override def toString = s"Car($year, $make, $model, $desc, $price)"
}

Providing decoding support for our Car class is as simple as implementing an instance of RowDecoder for it and marking it as implicit.

There are various ways to implement an instance of RowDecoder, but by far the most idiomatic is to use one of the helper methods defined in its companion object. For our current task, we need to decode a row into 5 values, in column order, and pass them to Car’s constructor: this is what RowDecoder.ordered is for.

import kantan.csv._
import kantan.csv.ops._

implicit val carDecoder: RowDecoder[Car] = RowDecoder.ordered { (i: Int, ma: String, mo: String, d: Option[String], p: Float) =>
  new Car(i, ma, mo, d, p)
}

And we can now decode our data as usual:

rawData.asCsvReader[Car](rfc.withHeader).foreach(println _)
// Right(Car(1997, Ford, E350, Some(ac, abs, moon), 3000.0))
// Right(Car(1999, Chevy, Venture "Extended Edition", None, 4900.0))
// Right(Car(1999, Chevy, Venture "Extended Edition, Very Large", None, 5000.0))
// Right(Car(1996, Jeep, Grand Cherokee, Some(MUST SELL!
// air, moon roof, loaded), 4799.0))

The main reason this is the preferred solution is that we never have to think about individual cells in a row and how to decode them: we just describe the types we’re expecting and let kantan.csv deal with decoding for us.

Note that this case was fairly simple: the columns and the constructor parameters were in the same order. For more complex scenarios, where columns might be in a different order for example, RowDecoder.decoder, which takes explicit column indices, would be a better fit.
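
As a rough illustration of the reordered case, here is a minimal sketch using RowDecoder.decoder, which takes the index of the column to read for each parameter. The shuffled data and the shuffledCarDecoder name are made up for this example and are not part of the original data set; Car is repeated so the snippet stands on its own.

```scala
import kantan.csv._
import kantan.csv.ops._

// Same class as in the tutorial, repeated to keep the snippet self-contained.
class Car(val year: Int, val make: String, val model: String, val desc: Option[String], val price: Float) {
  override def toString = s"Car($year, $make, $model, $desc, $price)"
}

// Hypothetical data: same fields as before, but with the columns reordered.
val shuffled: String =
  """Make,Model,Price,Description,Year
    |Ford,E350,3000.00,"ac, abs, moon",1997
    |Chevy,Venture,4900.00,,1999""".stripMargin

// Each index says which column feeds the corresponding parameter:
// year <- 4, make <- 0, model <- 1, desc <- 3, price <- 2.
implicit val shuffledCarDecoder: RowDecoder[Car] =
  RowDecoder.decoder(4, 0, 1, 3, 2) { (y: Int, ma: String, mo: String, d: Option[String], p: Float) =>
    new Car(y, ma, mo, d, p)
  }

// A java.io.Reader is used as the source here to keep the example free of files.
new java.io.StringReader(shuffled).asCsvReader[Car](rfc.withHeader).foreach(println _)
```

As with ordered, the cell types are taken from the function's parameter types, so the only extra work is listing the indices.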

What to read next

If you want to learn more about: