This is meant as guidelines for implementing codec libraries based on kantan.codecs. Since it’s essentially meant as a quick reminder to myself, it’s probably too terse and not terribly understandable to anyone else. I’m fairly certain no one but me will ever use kantan.codecs directly, but should I be wrong, feel free to create an issue asking for more detailed guidelines.
Errors should be represented as sum types, and usually provide one alternative that serves as a wrapper for
Throwable
. By convention, the sum type should be called DecodeError
.
For example:
sealed abstract class DecodeError extends Product with Serializable
case class TypeError(cause: Throwable) extends DecodeError
case class OutOfBounds(index: Int) extends DecodeError
// Declares creation methods - TypeError(exception) is of type TypeError, DecodeError.typeError(exception) is of
// type DecodeError.
object DecodeError {
def typeError(cause: Throwable): DecodeError = TypeError(cause)
def outOfBounds(index: Int): DecodeError = OutOfBounds(index)
}
Decoder
instances will be specialised on that error type: they will return instances of Either[DecodeError, A]
.
It’s good form to “hide” Either
as much as possible though, and a type alias should be declared:
import kantan.codecs._
type DecodeResult[A] = Either[DecodeError, A]
Additionally, a singleton object for the specialised result type should be created to provide instance creation methods:
object DecodeResult {
def apply[A](a: => A): DecodeResult[A] = ResultCompanion.nonFatal(TypeError.apply)(a)
def success[A](a: A): DecodeResult[A] = Right(a)
def outOfBounds(index: Int): DecodeResult[Nothing] = Left(OutOfBounds(index))
}
Encoder
, Decoder
, Codec
Encoder
, Decoder
and Codec
require a tag type - a phantom type use to disambiguate between implementations
that work with the same encoded type. This is traditionally a singleton object called codecs
.
// We'll need to redefine that later to add default instances.
object codecs
Specialised types should be declared as type aliases:
type CellDecoder[A] = Decoder[String, A, DecodeError, codecs.type]
type CellEncoder[A] = Encoder[String, A, codecs.type]
type CellCodec[A] = Codec[String, A, DecodeError, codecs.type]
Each specialised type should have a singleton object, used to declare creation and summoning methods:
object CellDecoder {
def apply[A](implicit da: CellDecoder[A]): CellDecoder[A] = da
def from[A](f: String => DecodeResult[A]): CellDecoder[A] = Decoder.from(f)
}
object CellEncoder {
def apply[A](implicit ea: CellEncoder[A]): CellEncoder[A] = ea
def from[A](f: A => String): CellEncoder[A] = Encoder.from(f)
}
object CellCodec {
def apply[A](implicit ca: CellCodec[A]): CellCodec[A] = ca
def from[A](f: String => DecodeResult[A])(g: A => String): CellCodec[A] = Codec.from(f)(g)
}
Decoder
and Encoder
implementations should have accompanying “instances” trait containing all default instances.
trait CellDecoderInstances {
val stringDecoder: CellDecoder[String] = CellDecoder.from(DecodeResult.success)
// ...
}
trait CellEncoderInstances {
val stringEncoder: CellEncoder[String] = CellEncoder.from(identity)
// ...
}
Codec
implementations should have an accompanying “instances” trait that extends the previously defined ones:
trait CellCodecInstances extends CellDecoderInstances with CellEncoderInstances
Finally, in order for all these instances to be available in the implicit scope, codecs
should be modified to extend
CellCodecInstances
.
object codecs extends CellCodecInstances
Decoding from strings is a fairly common requirement, regardless of the underlying format. Default codecs are provided for these, and can be adapted trivially:
import kantan.codecs.strings.{StringEncoder, StringDecoder}
def fromStringDecoder[A](implicit da: StringDecoder[A]): CellDecoder[A] =
da.leftMap(DecodeError.typeError).tag[codecs.type]
def fromStringEncoder[A](implicit ea: StringEncoder[A]): CellEncoder[A] = ea.tag[codecs.type]
There’s a critical distinction to be made between default instances for primitive types and for first-order types:
it’s safe to provide a Codec
for the former, but usually not for the later.
In order to write default instances for Option
, one might be tempted to write:
def optionCodec[A](implicit ca: CellCodec[A]): CellCodec[Option[A]] = CellCodec.from { s =>
if(s.isEmpty) DecodeResult.success(Option.empty[A])
else ca.decode(s).map(Some.apply)
} { _.map(ca.encode).getOrElse("") }
This is bad, for two reasons. The first, most obvious one is that we should never require a Codec
instance.
When a function needs to both encode and decode a type, it should require an instance of Encoder
and one of
Decoder
instead.
The second reason this is a bad idea, even if one where to “split” the codec into its encoder and decoder halves, is
that any type A
that has, say, an Encoder
but not a Decoder
should still get a free Encoder[Option[A]]
, but
this implementation prevents it.
Since encoder and decoder instances will eventually find themselves part of the same trait (CellCodecInstances
in our
examples), it’s important to make sure their names don’t clash: prefer stringDecoder
and stringEncoder
to the
simpler but unsafe string
.