Package vectorpipe

package vectorpipe

VectorPipe is a library for mass conversion of Vector data into Mapbox VectorTiles. It is powered by GeoTrellis and Apache Spark.

Outline

GeoTrellis and Spark do most of our work for us. A main function that uses VectorPipe need not contain much more than:

import geotrellis.proj4.WebMercator
import geotrellis.spark._  /* SpatialKey */
import geotrellis.spark.tiling.{LayoutDefinition, ZoomedLayoutScheme}
import geotrellis.vectortile.VectorTile
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import scala.util.{Failure, Success}
import vectorpipe._
import vectorpipe.osm._  /* OSMFeature */

implicit val ss: SparkSession = ...

val layout: LayoutDefinition =
  ZoomedLayoutScheme.layoutForZoom(15, WebMercator.worldExtent, 512)

/* An ORC file containing OSM data. */
val path: String = "s3://path/to/data.orc"

osm.fromORC(path) match {
  case Failure(_) => { /* Handle the error. Was your path correct? */ }
  case Success((nodes, ways, relations)) => {

    val features: RDD[OSMFeature] =
      osm.features(nodes, ways, relations).geometries

    /* `grid` also expects an error-logging strategy; `logToLog4j` is one
     * of the defaults provided by this package. */
    val featGrid: RDD[(SpatialKey, Iterable[OSMFeature])] =
      grid(Clip.byHybrid, logToLog4j, layout, features)

    val tiles: RDD[(SpatialKey, VectorTile)] =
      vectortiles(Collate.byAnalytics, layout, featGrid)

    /* ... further processing / output ... */
  }
}

/* Nicely stop Spark */
ss.stop()

Writing Portable Tiles

This method outputs VectorTiles to a directory structure appropriate for serving by a Tile Map Server. The VectorTiles themselves are saved in the usual .mvt format, and so can be read by any other tool. The following example writes the tiles from above to an S3 bucket:

import geotrellis.spark._        /* SpatialKey, LayerId */
import geotrellis.spark.io.s3._  // requires the `geotrellis-s3` library

/* How should a `SpatialKey` map to a filepath on S3? */
val s3PathFromKey: SpatialKey => String = SaveToS3.spatialKeyToPath(
  LayerId("sample", 1),  // Whatever zoom level it is
  "s3://some-bucket/catalog/{name}/{z}/{x}/{y}.mvt"
)

tiles.saveToS3(s3PathFromKey)

Writing a GeoTrellis Layer of VectorTiles

The disadvantage of the "Portable Tiles" approach is that there is no way to read the tiles back into an RDD[(SpatialKey, VectorTile)] for further Spark-based manipulation. To allow that, the tiles have to be written as a "GeoTrellis Layer" from the get-go. The output of such a write is a set of split, compressed files that aren't readable by other tools, but the compression shrinks VectorTiles to about half the size of a normal .mvt.

import geotrellis.spark._
import geotrellis.spark.io._
import geotrellis.spark.io.file._    /* When writing to your local computer */
import geotrellis.spark.io.index.ZCurveKeyIndexMethod
import org.apache.spark.storage.StorageLevel

/* IO classes */
val catalog: String = "/home/you/tiles/"  /* This must exist ahead of time! */
val store = FileAttributeStore(catalog)
val writer = FileLayerWriter(store)

/* Almost certainly necessary, to save Spark from repeating effort */
val persisted = tiles.persist(StorageLevel.MEMORY_AND_DISK_SER)

/* Dynamically determine the KeyBounds */
val bounds: KeyBounds[SpatialKey] =
  persisted.map({ case (key, _) => KeyBounds(key, key) }).reduce(_ combine _)

/* Construct metadata for the Layer */
val meta = LayerMetadata(layout, bounds)

/* Write the Tile Layer */
writer.write(LayerId("north-van", 15), ContextRDD(persisted, meta), ZCurveKeyIndexMethod)
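
The payoff of a GeoTrellis Layer is that it can later be read back for more Spark-side work. A minimal sketch of that read, assuming an implicit SparkContext is in scope (e.g. ss.sparkContext) and that the implicit instances FileLayerReader asks for (a JsonFormat for LayerMetadata, among others) are available:

/* Read the Layer back; the implicit `vectorTileCodec` below decodes the tiles. */
val reader = FileLayerReader(store)

val layer = reader.read[SpatialKey, VectorTile, LayerMetadata[SpatialKey]](
  LayerId("north-van", 15)
)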

Type Members

  1. case class LayerMetadata[K](layout: LayoutDefinition, bounds: KeyBounds[K])(implicit evidence$1: JsonFormat[K]) extends Product with Serializable

    Minimalist Layer-level metadata. Necessary for writing layers of VectorTiles.

Value Members

  1. object Clip

    Clipping Strategies.

  2. object Collate

    "Collator" or "Schema" functions which form VectorTiles from collections of GeoTrellis Features. Any function can be considered a valid "collator" if it satisfies the type:

    collate: (Extent, Iterable[Feature[G,D]]) => VectorTile

    Usage

    Create a VectorTile from some collection of GeoTrellis Geometries:

    val tileExtent: Extent = ... // Extent of _this_ Tile
    val geoms: Iterable[Feature[Geometry, Map[String, String]]] = ...  // Some collection of Geometries
    
    val tile: VectorTile = Collate.withStringMetadata(tileExtent, geoms)

    Create a VectorTile via some custom collation scheme:

    def partition(f: Feature[G,D]): String = ...
    def metadata(d: D): Map[String, Value] = ...
    
    val tileExtent: Extent = ... // Extent of _this_ Tile
    val geoms: Iterable[Feature[G, D]] = ...  // Some collection of Geometries
    
    val tile: VectorTile = Collate.generically(tileExtent, geoms, partition, metadata)

    Writing your own Collator Function

    We provide a few defaults here, but any collation scheme is possible. Collation just refers to the process of organizing some Iterable collection of Geometries into various VectorTile Layers. The easiest way to write your own collator is with generically, which expects a partition function to guide Geometries into separate Layers, and a metadata transformation function.

    Partition Functions

    A valid partition function must be of the type:

    partition: Feature[G,D] => String

    The output String is the name of the Layer you'd like a given Feature assigned to. Notice that the entire Feature is available (i.e. both its Geometry and its metadata), so your partitioner can make fine-grained choices.
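
    For instance, a hypothetical partitioner, assuming D = Map[String, String] (as in withStringMetadata above), might route Features by an OSM tag:

    import geotrellis.vector.{Feature, Geometry}

    /* Features carrying a "highway" tag go to a "roads" Layer;
     * everything else lands in a catch-all Layer. */
    def byHighway(f: Feature[Geometry, Map[String, String]]): String =
      if (f.data.contains("highway")) "roads" else "other"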

    Metadata Transformation Functions

    One of these takes your D type and transforms it into what VectorTiles expect:

    metadata: D => Map[String, Value]

    You're encouraged to review the Value sum-type in geotrellis.vectortile.
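
    As a minimal sketch, assuming D = Map[String, String], each raw value can simply be wrapped in the VString variant:

    import geotrellis.vectortile.{VString, Value}

    /* Wrap every raw String in the VString variant of the Value sum-type. */
    def stringMeta(d: Map[String, String]): Map[String, Value] =
      d.map { case (k, v) => k -> VString(v) }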

    On Winding Order

    VectorTiles require that Polygon exteriors have clockwise winding order, and that interior holes have counter-clockwise winding order, assuming the origin (0,0) is in the top-left corner.

    Any custom collator which does not call generically must correct for Polygon winding order manually. This can be done via the vectorpipe.winding function.

    But why correct for winding order at all? OSM data makes no guarantee about what winding order its derived Polygons will have. We could correct the winding order when our first RDD[OSMFeature] is created, except that it's unlikely the clipping process afterward would maintain it for all Polygons.
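
    For illustration, a custom collator might normalize each Polygon Feature just before writing it into a Layer. A sketch, assuming Polygon-only Features:

    import geotrellis.vector.{Feature, Polygon}

    /* Correct the winding order of a Polygon Feature via vectorpipe.winding. */
    def fixWinding[D](f: Feature[Polygon, D]): Feature[Polygon, D] =
      Feature(winding(f.geom), f.data)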

  3. object LayerMetadata extends Serializable

  4. def grid[D](clip: (Extent, Feature[Geometry, D], Predicates) ⇒ Option[Feature[Geometry, D]], logError: (((Extent, Feature[Geometry, D])) ⇒ String) ⇒ ((Extent, Feature[Geometry, D])) ⇒ Unit, ld: LayoutDefinition, rdd: RDD[Feature[Geometry, D]]): RDD[(SpatialKey, Iterable[Feature[Geometry, D]])]

    Given a particular Layout (tile grid), split a collection of Features into a grid of them indexed by SpatialKey.

    Clipping Strategies

    A clipping strategy defines how Geometries which stretch outside their associated bounding box should be reduced to better fit it. This is beneficial, as it saves on storage for large, complex Geometries which only partially intersect some bounding box. The excess points will be cut out, but the "how" is a matter of weighing pros and cons in the context of the user's use-case. Several strategies come to mind:

    • Clip directly on the bounding box
    • Clip just outside the bounding box
    • Keep the nearest Point outside the bounding box, wherever it is
    • Custom clipping for each OSM Element type (building, etc)
    • Don't clip

    These clipping strategies are defined in vectorpipe.Clip, where you can find further explanation.

    clip

    A function which represents a "clipping strategy".

    logError

    An IO function that will log any clipping failures.

    ld

    The LayoutDefinition defining the area to gridify.
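
    Putting the parameters together, a sketch of a typical call, where rawFeatures is assumed to already exist, and Clip.byHybrid and logToLog4j are defaults from this package:

    val gridded: RDD[(SpatialKey, Iterable[Feature[Geometry, Map[String, String]]])] =
      grid(Clip.byHybrid, logToLog4j, layout, rawFeatures)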

  5. def logNothing[A](f: (A) ⇒ String): (A) ⇒ Unit

    Silently skip over a failure.

  6. def logToLog4j[A](f: (A) ⇒ String): (A) ⇒ Unit

    Log an error as an ERROR through Spark's default log4j.

  7. def logToStdout[A](f: (A) ⇒ String): (A) ⇒ Unit

    Log an error to STDOUT.
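
    Any function of this shape will do. For instance, a hypothetical variant that writes to STDERR instead:

    /* Render the failed input with `f`, then print the result to STDERR. */
    def logToStderr[A](f: A => String): A => Unit =
      { a => System.err.println(f(a)) }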

  8. package osm

    Types and functions unique to working with OpenStreetMap data.

  9. implicit val vectorTileCodec: AvroRecordCodec[VectorTile]

    Encode a VectorTile via Avro. This is the glue for Layer IO.

  10. def vectortiles[G <: Geometry, D](collate: (Extent, Iterable[Feature[G, D]]) ⇒ VectorTile, ld: LayoutDefinition, rdd: RDD[(SpatialKey, Iterable[Feature[G, D]])]): RDD[(SpatialKey, VectorTile)]

    Given a collection of GeoTrellis Features which have been associated with some SpatialKey and a "collation" function, form those Features into a VectorTile.

    See also

    vectorpipe.Collate

  11. def winding(p: Polygon): Polygon

    Ensure a geotrellis.vector.Polygon has the correct winding order to be used in a VectorTile.

Actions

Functions to transform RDDs of Features along the pipeline.

Error Logging

Useful defaults for functions like vectorpipe.grid, where we wish to log small failures and skip them, instead of crashing the entire Spark job.
