Decoding cells as arbitrary types

We’ve seen in a previous post how to decode CSV rows as collections. Exactly how that happened, and how individual cells were turned into useful types, was glossed over, though. In this tutorial, we’ll take a closer look at the underlying mechanism.

General mechanism

Cell decoding is type class based: kantan.csv knows how to turn a CSV cell into a type A, provided that there is an implicit instance of CellDecoder[A] in scope. All sane primitive types have a default instance in scope. Int, for example:

implicitly[kantan.csv.CellDecoder[Int]]
// res0: kantan.csv.package.CellDecoder[Int] = kantan.codecs.Codec$$anon$1@90ec82a

A more complete list of default instances can be found here.

And so, when asCsvReader or readCsv is asked to turn a row into a List of elements of type A, it looks for a corresponding implicit CellDecoder[A] and relies on it to decode each cell:

import kantan.csv._
import kantan.csv.ops._

"1,2,3\n4,5,6".readCsv[List, List[Int]](rfc)
// res1: List[ReadResult[List[Int]]] = List(
//   Right(value = List(1, 2, 3)),
//   Right(value = List(4, 5, 6))
// )
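
To make the mechanism more concrete, here is a minimal, library-free sketch of the same idea. Everything in it (SimpleCellDecoder, decodeRow, the naive comma split) is hypothetical and only meant to illustrate how an implicit instance gets picked up and applied to each cell; it is not how kantan.csv is actually implemented.

```scala
import scala.util.Try

object TypeClassSketch {
  // Hypothetical, simplified stand-in for CellDecoder: one cell in, either an
  // error message or a decoded value out.
  trait SimpleCellDecoder[A] {
    def decode(cell: String): Either[String, A]
  }

  object SimpleCellDecoder {
    // Default instance for Int. Living in the companion object means it is
    // found by implicit resolution without any import.
    implicit val intDecoder: SimpleCellDecoder[Int] = new SimpleCellDecoder[Int] {
      def decode(cell: String): Either[String, Int] =
        Try(cell.toInt).toEither.left.map(_ => s"'$cell' is not a valid Int")
    }
  }

  // Decodes every cell of a row with whatever SimpleCellDecoder[A] is in scope.
  // Assumption: cells are split on commas, with no support for quoting.
  def decodeRow[A](row: String)(implicit da: SimpleCellDecoder[A]): Either[String, List[A]] =
    row.split(",", -1).toList.foldRight[Either[String, List[A]]](Right(Nil)) { (cell, acc) =>
      for {
        a  <- da.decode(cell)
        as <- acc
      } yield a :: as
    }
}
```

With this in place, TypeClassSketch.decodeRow[Int]("1,2,3") resolves the Int instance implicitly and decodes all three cells, while a row containing a non-numeric cell yields a Left instead of throwing.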

Adding support for new types

In order to add support for non-standard types, all you need to do is implement an implicit CellDecoder instance for that type. Let’s do so, for example, for Joda DateTime:

import kantan.csv._
import org.joda.time.DateTime
import org.joda.time.format.ISODateTimeFormat

implicit val jodaDateTime: CellDecoder[DateTime] = {
  val format = ISODateTimeFormat.date()
  CellDecoder.from(s => DecodeResult(format.parseDateTime(s)))
}

And we can now decode CSV data composed of dates:

"2009-01-06,2009-01-07\n2009-01-08,2009-01-09".asCsvReader[List[DateTime]](rfc).foreach(println _)
// Right(List(2009-01-06T00:00:00.000+01:00, 2009-01-07T00:00:00.000+01:00))
// Right(List(2009-01-08T00:00:00.000+01:00, 2009-01-09T00:00:00.000+01:00))
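
It can also be useful to check what a custom decoder does with a malformed cell. The sketch below follows the same CellDecoder.from / DecodeResult pattern, but swaps Joda DateTime for java.time.LocalDate to avoid the extra dependency (that substitution, and the names LocalDateDecoding and localDateDecoder, are ours). It assumes that DecodeResult(...) captures the exception thrown by the parser and turns it into a Left, which is what lets us write the parsing code without a try / catch.

```scala
import java.time.LocalDate
import kantan.csv._

object LocalDateDecoding {
  // Assumption: java.time.LocalDate stands in for Joda DateTime here.
  // LocalDate.parse expects ISO dates (yyyy-MM-dd) and throws on anything else.
  implicit val localDateDecoder: CellDecoder[LocalDate] =
    CellDecoder.from(s => DecodeResult(LocalDate.parse(s)))

  // A well-formed cell is expected to decode to a Right...
  val ok = implicitly[CellDecoder[LocalDate]].decode("2009-01-06")

  // ...while a malformed one is expected to come back as a Left, not an exception.
  val bad = implicitly[CellDecoder[LocalDate]].decode("not a date")
}
```

Decoding a single cell directly like this is a cheap way to test a new instance before wiring it into asCsvReader or readCsv.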

What to read next

If you want to learn more about: