External CSV libraries

kantan.csv comes with a default implementation of CSV parsing and serializing. This implementation is relatively fast and robust, but might not satisfy all use cases: some of the more outlandish CSV mutations are not implemented (yet), for instance. In such cases, it’s possible to use other CSV libraries under the hood.

Supported libraries

Jackson CSV

The Jackson CSV parser and serializer can be used by adding the following dependency to your build.sbt:

libraryDependencies += "com.nrinaudo" %% "kantan.csv-jackson" % "0.7.0"

You then need to bring the right implicits in scope through:

import kantan.csv.engine.jackson._

You can tweak the behaviour of the underlying parsers and serializers by creating them through readerEngineFrom and writerEngineFrom.
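
As a rough sketch of what this looks like in practice, assuming the kantan.csv-jackson dependency above is on the classpath: once the import is in scope, the usual kantan.csv entry points should pick up the Jackson engine implicitly. The object name, value names and sample data below are made up for illustration.

```scala
import kantan.csv._
import kantan.csv.ops._
// Brings the Jackson-backed engines into implicit scope.
import kantan.csv.engine.jackson._

object JacksonEngineExample {
  // Made-up sample data, for illustration only.
  val rawData: String = "1,2\n3,4"

  // No engine is mentioned here: asCsvReader resolves it implicitly.
  val rows: List[ReadResult[List[Int]]] =
    rawData.asCsvReader[List[Int]](rfc).toList
}
```

Swapping the last import for kantan.csv.engine.commons._ should select the Commons CSV engine in the same way, provided the matching dependency is present.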

Apache Commons CSV

The Apache Commons CSV parser and serializer can be used by adding the following dependency to your build.sbt:

libraryDependencies += "com.nrinaudo" %% "kantan.csv-commons" % "0.7.0"

You then need to bring the right implicits in scope through:

import kantan.csv.engine.commons._

You can tweak the behaviour of the underlying parsers and serializers by creating them through readerEngineFrom and writerEngineFrom.

Supporting a new library

For the purpose of this tutorial, let’s make up a hypothetical CSV library, EasyCSV, that provides the following:

import java.io._

object EasyCSV {
  trait EasyWriter {
    def write(row: Array[String]): Unit
    def close(): Unit
  }

  def read(reader: Reader, sep: Char): java.util.Iterator[Array[String]] with Closeable = ???
  def write(writer: Writer, sep: Char): EasyWriter = ???
}
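
The method bodies above are deliberately left as ???. To experiment with the rest of this tutorial, one possible way to fill them in is sketched below. EasyCSVSketch is a made-up stand-in that mirrors the signatures above: it splits lines naively on the separator and does not handle quoting, escaping or embedded line breaks, so it is only suitable for toy input.

```scala
import java.io._
import java.util.regex.Pattern

// Hypothetical stand-in for EasyCSV: naive splitting, no quoting or escaping.
object EasyCSVSketch {
  trait EasyWriter {
    def write(row: Array[String]): Unit
    def close(): Unit
  }

  def read(reader: Reader, sep: Char): java.util.Iterator[Array[String]] with Closeable = {
    val in = new BufferedReader(reader)

    new java.util.Iterator[Array[String]] with Closeable {
      // Look-ahead line; null once the input is exhausted.
      private var nextLine: String = in.readLine()

      override def hasNext(): Boolean = nextLine != null

      override def next(): Array[String] = {
        if (nextLine == null) throw new java.util.NoSuchElementException("no more rows")
        // Split on the literal separator; a limit of -1 keeps trailing empty cells.
        val row = nextLine.split(Pattern.quote(sep.toString), -1)
        nextLine = in.readLine()
        row
      }

      override def remove(): Unit = throw new UnsupportedOperationException("remove")

      override def close(): Unit = in.close()
    }
  }

  def write(writer: Writer, sep: Char): EasyWriter = new EasyWriter {
    // Writes the cells joined by the separator, followed by a line feed.
    override def write(row: Array[String]): Unit = {
      writer.write(row.mkString(sep.toString))
      writer.write("\n")
    }

    override def close(): Unit = writer.close()
  }
}
```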

Parsing

Support for parsing with external libraries is handled through the ReaderEngine trait: all functions that need to create an instance of CsvReader rely on an implicit instance of ReaderEngine to do so.

Creating a new instance of ReaderEngine is meant to be fairly straightforward: the helper method ReaderEngine.from takes care of it. You still need to provide a (Reader, CsvConfiguration) => CsvReader[Seq[String]], but the most common scenario is already covered: if you can write a (Reader, CsvConfiguration) => Iterator[Seq[String]], you can simply wrap the result with ResourceIterator.fromIterator:

import kantan.csv.engine._
import kantan.csv._

implicit val readerEngine: ReaderEngine = ReaderEngine.from { (in: Reader, conf: CsvConfiguration) =>
  kantan.codecs.resource.ResourceIterator.fromIterator(EasyCSV.read(in, conf.cellSeparator))
}
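
Should the iterator returned by the underlying library need adapting to a Scala Iterator[Seq[String]] before it is wrapped (a Java iterator of arrays, for example), a small conversion step can help. The sketch below assumes Scala 2.13 or later for scala.jdk.CollectionConverters; RowAdapters and toScalaRows are hypothetical names, not part of kantan.csv, and the helper does not take care of closing the underlying resource.

```scala
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of kantan.csv.
object RowAdapters {
  // Turns a Java iterator of arrays into a Scala iterator of sequences.
  // Rows are converted lazily, one at a time, as the result is consumed.
  def toScalaRows(rows: java.util.Iterator[Array[String]]): Iterator[Seq[String]] =
    rows.asScala.map(_.toSeq)
}
```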

Serializing

Serializing is very similar to parsing, except that you need to provide a WriterEngine instead of a ReaderEngine. This is done through WriterEngine.from, whose argument you will most likely want to build with CsvWriter.apply, passing it the underlying writer, a function that writes a single row, and a function that closes the writer:

implicit val writerEngine: WriterEngine = WriterEngine.from { (writer: Writer, conf: CsvConfiguration) =>
  CsvWriter(EasyCSV.write(writer, conf.cellSeparator))(_ write _.toArray)(_.close())
}
