Decoding nodes as primitive types

The simplest possible use of kantan.xpath is to extract primitive types from XML documents.

In order to show how that works, we’ll first need some sample XML data, which we’ll get from this project’s resources:

val rawData: java.net.URL = getClass.getResource("/simple.xml")

This is what we’re working with:

scala.io.Source.fromURL(rawData).mkString
// res0: String = """<root>
//     <element id="1" enabled="true"/>
//     <element id="2" enabled="false"/>
//     <element id="3" enabled="true"/>
//     <element id="4" enabled="false"/>
// </root>"""

We’ll then need to import kantan.xpath’s syntax, which will let us evaluate XPath expressions directly on something that can be turned into an XML document:

import kantan.xpath.implicits._

This allows us to write the following code, which will attempt to extract the id attribute of an element node as an Int:

rawData.evalXPath[Int](xp"//element/@id")
// res1: kantan.xpath.package.XPathResult[Int] = Right(value = 1)

There are a few things worth pointing out here. First, the return type: you might expect an Int, since this is what you requested from evalXPath, but we got an XPathResult[Int] instead. An XPathResult is either a failure, if something went wrong (for example, the XPath expression is not valid, or the id attribute is not a valid Int), or a success otherwise. This mechanism ensures that evalXPath is safe: no exception is thrown to break the flow of your code. For example, requesting a type that the attribute cannot be decoded as yields a failure rather than an exception:

rawData.evalXPath[java.net.URL](xp"//element/@id")
// res2: kantan.xpath.package.XPathResult[java.net.URL] = Left(
//   value = TypeError(message = "'1' is not a valid URL")
// )
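
Since the results above print as Right and Left, an XPathResult can be handled like a standard Scala Either. The following is a minimal sketch of doing so with pattern matching. Note that describe is a hypothetical helper, not part of kantan.xpath, and the two sample values are stand-ins for res1 and res2 (with the error simplified to a String) so that the snippet runs without the library:

```scala
// Hypothetical helper, not part of kantan.xpath: turns any Either-shaped
// result into a readable message without throwing.
def describe[E, A](result: Either[E, A]): String = result match {
  case Right(value) => s"decoded: $value"
  case Left(error)  => s"failed: $error"
}

// Stand-ins for res1 and res2 above. The real failure carries a
// kantan.xpath error value rather than a plain String.
val ok: Either[String, Int]  = Right(1)
val bad: Either[String, Int] = Left("'1' is not a valid URL")

println(describe(ok))  // prints "decoded: 1"
println(describe(bad)) // prints "failed: '1' is not a valid URL"
```

Because describe is generic in the error type, the same function should accept a value returned by evalXPath directly.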

In some cases, however, we don’t really care about runtime safety and are fine with our program crashing at the first error. This is what the unsafeEvalXPath method was designed for:

rawData.unsafeEvalXPath[Int](xp"//element/@id")
// res3: Int = 1

Another point of interest is that the sample XML file contained multiple element nodes, but we only got the id attribute of the first one. This is due to the type parameter we passed to evalXPath: by requesting a non-collection type, we told kantan.xpath that we only wanted the first result. We could get them all by requesting a List[Int], for example:

rawData.evalXPath[List[Int]](xp"//element/@id")
// res4: kantan.xpath.package.XPathResult[List[Int]] = Right(
//   value = List(1, 2, 3, 4)
// )
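
Because the whole list is wrapped in a single result, it can be transformed without unwrapping it first. This is a small sketch, assuming Scala 2.12 or later (where Either is right-biased). Here totalOf is a hypothetical helper and allIds is a stand-in for res4:

```scala
// Hypothetical helper: sums the decoded ids and leaves any failure untouched.
def totalOf[E](ids: Either[E, List[Int]]): Either[E, Int] = ids.map(_.sum)

// Stand-in for res4 above, with the error type simplified to String.
val allIds: Either[String, List[Int]] = Right(List(1, 2, 3, 4))

println(totalOf(allIds)) // prints "Right(10)"
```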

Any type constructor that has a CanBuildFrom instance could have been used instead of List, which covers essentially all collections. By the same token, any primitive type could have been used instead of Int. For example:

rawData.evalXPath[Vector[Boolean]](xp"//element/@enabled")
// res5: kantan.xpath.package.XPathResult[Vector[Boolean]] = Right(
//   value = Vector(true, false, true, false)
// )
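
Results decoded separately can also be combined. The sketch below pairs each id with its enabled flag and keeps only the enabled ids. It assumes Scala 2.12 or later, and that every element carries both attributes so that the two collections line up by position. Here enabledIds is a hypothetical helper, and the sample values are stand-ins for res4 and res5:

```scala
// Hypothetical helper: zips ids with flags and keeps the ids whose flag is
// true. If either input is a failure, the first failure is returned as-is.
def enabledIds[E](
  ids: Either[E, List[Int]],
  flags: Either[E, Vector[Boolean]]
): Either[E, List[Int]] =
  for {
    is <- ids
    fs <- flags
  } yield is.zip(fs).collect { case (id, true) => id }

// Stand-ins for res4 and res5 above, with the error type simplified to String.
val sampleIds: Either[String, List[Int]]         = Right(List(1, 2, 3, 4))
val sampleFlags: Either[String, Vector[Boolean]] = Right(Vector(true, false, true, false))

println(enabledIds(sampleIds, sampleFlags)) // prints "Right(List(1, 3))"
```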
