Object

geotrellis.spark.io.s3

S3GeoTiffRDD

object S3GeoTiffRDD extends LazyLogging

The S3GeoTiffRDD object allows for the creation of whole or windowed RDD[(K, V)]s from files on S3.

Linear Supertypes
LazyLogging, AnyRef, Any

Type Members

  1. case class Options(tiffExtensions: Seq[String] = ..., crs: Option[CRS] = None, timeTag: String = GEOTIFF_TIME_TAG_DEFAULT, timeFormat: String = GEOTIFF_TIME_FORMAT_DEFAULT, maxTileSize: Option[Int] = Some(DefaultMaxTileSize), numPartitions: Option[Int] = None, partitionBytes: Option[Long] = Some(DefaultPartitionBytes), chunkSize: Option[Int] = None, delimiter: Option[String] = None, getS3Client: () ⇒ S3Client = () => S3Client.DEFAULT) extends RasterReader.Options with Product with Serializable

    This case class contains the various parameters one can set when reading RDDs from S3 using Spark.

    TODO: Add persistLevel option

    tiffExtensions

    Read all files whose extension is contained in the given list.

    crs

    Override the CRS of the input files. If None, the reader will use each file's original CRS.

    timeTag

    Name of tiff tag containing the timestamp for the tile.

    timeFormat

    Pattern for java.time.format.DateTimeFormatter to parse timeTag.

    maxTileSize

    Maximum allowed size of each tile in the output RDD. A single input GeoTiff may be split among multiple records if it exceeds this size. If no maximum tile size is specified, each file is read fully. 1024 by default.

    numPartitions

    How many partitions Spark should create when it repartitions the data.

    partitionBytes

    Desired partition size in bytes; at least one item will be assigned to each partition. This option is incompatible with the maxTileSize option. 128 MB by default.

    chunkSize

    How many bytes should be read in at a time.

    delimiter

    Delimiter to use for S3 object listings.

    getS3Client

    A function to instantiate an S3Client. Must be serializable.
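
    As an illustrative sketch (not from the source), these options might be combined as follows; all concrete values shown are hypothetical, and unspecified fields keep their documented defaults:

    ```scala
    import geotrellis.spark.io.s3.S3GeoTiffRDD

    // Hypothetical configuration: window large GeoTiffs into tiles of at
    // most 512x512 pixels and stream 1 MB at a time from S3.
    val options = S3GeoTiffRDD.Options(
      tiffExtensions = Seq(".tif", ".tiff"),
      crs = None,                  // keep each file's original CRS
      maxTileSize = Some(512),     // split inputs that exceed 512x512
      chunkSize = Some(1 << 20)    // read 1 MB per request
    )
    ```

    Because getS3Client must be serializable, a replacement function should avoid capturing non-serializable state from the driver.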

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final val GEOTIFF_TIME_FORMAT_DEFAULT: String("yyyy:MM:dd HH:mm:ss")

  5. final val GEOTIFF_TIME_TAG_DEFAULT: String("TIFFTAG_DATETIME")

  6. object Options extends Serializable
  7. def apply[I, K, V](objectRequestsToDimensions: RDD[(GetObjectRequest, (Int, Int))], uriToKey: (URI, I) ⇒ K, options: Options, sourceGeoTiffInfo: ⇒ GeoTiffInfoReader)(implicit rr: RasterReader[Options, (I, V)]): RDD[(K, V)]

    Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.

    objectRequestsToDimensions

    An RDD of the GetObjectRequest for each GeoTiff, together with its column and row counts as an (Int, Int).

    uriToKey

    Function to transform the input key based on the URI information.

    options

    An instance of Options that contains any user defined or default settings.

  8. def apply[K, V](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, V)]): RDD[(K, V)]

    Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    options

    An instance of Options that contains any user defined or default settings.
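
    A hedged usage sketch of this overload (the bucket and prefix names are placeholders; the concrete K and V are fixed by the implicit RasterReader in scope):

    ```scala
    import geotrellis.raster.Tile
    import geotrellis.spark.io.s3.S3GeoTiffRDD
    import geotrellis.vector.ProjectedExtent
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def readSingleband(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)] =
      // "my-bucket" and "imagery/2017/" are hypothetical names
      S3GeoTiffRDD.apply[ProjectedExtent, Tile](
        "my-bucket",
        "imagery/2017/",
        S3GeoTiffRDD.Options(maxTileSize = Some(256)))
    ```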

  9. def apply[I, K, V](bucket: String, prefix: String, uriToKey: (URI, I) ⇒ K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)]

    Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    An instance of Options that contains any user defined or default settings.

  10. def apply[I, K, V](bucket: String, prefix: String, uriToKey: (URI, I) ⇒ K, options: Options, geometry: Option[Geometry])(implicit sc: SparkContext, rr: RasterReader[Options, (I, V)]): RDD[(K, V)]

    Creates an RDD[(K, V)] whose K and V depend on the type of the GeoTiff that is going to be read in.

    This function has two modes of operation: when options.maxTileSize is set, windows are read from the GeoTiffs, and their size and count are balanced among partitions using the partitionBytes option. The resulting partitions are grouped in relation to the GeoTiff segment layout.

    When maxTileSize is None, the GeoTiffs are read fully and balanced among partitions using either the numPartitions or the partitionBytes option.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    An instance of Options that contains any user defined or default settings.

    geometry

    An optional geometry to filter by. If this is provided, it is assumed that all GeoTiffs are in the same CRS, and that this geometry is in that CRS.
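
    The windowed mode and the geometry filter might be exercised as in this sketch (all names and values are hypothetical; the extent is assumed to be in the same CRS as the GeoTiffs):

    ```scala
    import geotrellis.raster.Tile
    import geotrellis.spark.io.s3.S3GeoTiffRDD
    import geotrellis.vector.{Extent, ProjectedExtent}
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def readWindowed(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)] = {
      // Windowed mode: maxTileSize is set, so partitionBytes balances
      // window size and count across partitions.
      val options = S3GeoTiffRDD.Options(
        maxTileSize = Some(256),
        partitionBytes = Some(64L * 1024 * 1024))
      val queryRegion = Extent(-100.0, 30.0, -90.0, 40.0).toPolygon
      S3GeoTiffRDD.apply[ProjectedExtent, ProjectedExtent, Tile](
        "my-bucket", "imagery/",
        uriToKey = (_, key) => key,     // leave the key unchanged
        options = options,
        geometry = Some(queryRegion))
    }
    ```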

  11. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  15. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  16. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  17. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. lazy val logger: Logger

    Attributes
    protected
    Definition Classes
    LazyLogging
  20. def multiband[K](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, MultibandTile)]): RDD[(K, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband GeoTiffs. All of each GeoTiff's bands will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.
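
    For instance, a hedged sketch of reading multiband imagery (the bucket and prefix are placeholders, and the usual implicit RasterReader for ProjectedExtent keys is assumed to be in scope):

    ```scala
    import geotrellis.raster.MultibandTile
    import geotrellis.spark.io.s3.S3GeoTiffRDD
    import geotrellis.vector.ProjectedExtent
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def readMultiband(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)] =
      S3GeoTiffRDD.multiband[ProjectedExtent](
        "my-bucket", "imagery/",
        S3GeoTiffRDD.Options(maxTileSize = Some(256)))
    ```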

  21. def multiband[I, K](bucket: String, prefix: String, uriToKey: (URI, I) ⇒ K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, MultibandTile)]): RDD[(K, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband GeoTiffs. All of each GeoTiff's bands will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. def singleband[K](bucket: String, prefix: String, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (K, Tile)]): RDD[(K, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

  26. def singleband[I, K](bucket: String, prefix: String, uriToKey: (URI, I) ⇒ K, options: Options)(implicit sc: SparkContext, rr: RasterReader[Options, (I, Tile)]): RDD[(K, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

  27. def spatial(bucket: String, prefix: String, uriToKey: (URI, ProjectedExtent) ⇒ ProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. If a GeoTiff contains multiple bands, only the first will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    An instance of Options that contains any user defined or default settings.

  28. def spatial(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. If a GeoTiff contains multiple bands, only the first will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    options

    An instance of Options that contains any user defined or default settings.

  29. def spatial(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband GeoTiffs. If a GeoTiff contains multiple bands, only the first will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.
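
    This no-options overload is the most compact entry point; a sketch with placeholder names:

    ```scala
    import geotrellis.raster.Tile
    import geotrellis.spark.io.s3.S3GeoTiffRDD
    import geotrellis.vector.ProjectedExtent
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def readSpatial(implicit sc: SparkContext): RDD[(ProjectedExtent, Tile)] =
      // "my-bucket" and "imagery/" are hypothetical
      S3GeoTiffRDD.spatial("my-bucket", "imagery/")
    ```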

  30. def spatialMultiband(bucket: String, prefix: String, uriToKey: (URI, ProjectedExtent) ⇒ ProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. All of each GeoTiff's bands will be read.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    An instance of Options that contains any user defined or default settings.

  31. def spatialMultiband(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    options

    An instance of Options that contains any user defined or default settings.

  32. def spatialMultiband(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(ProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

  33. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  34. def temporal(bucket: String, prefix: String, uriToKey: (URI, TemporalProjectedExtent) ⇒ TemporalProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    Options for the reading process, including the timestamp tiff tag and its pattern.
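
    A sketch of supplying a custom time tag and format (the tag name and pattern are hypothetical; timeFormat follows java.time.format.DateTimeFormatter patterns):

    ```scala
    import geotrellis.raster.Tile
    import geotrellis.spark.TemporalProjectedExtent
    import geotrellis.spark.io.s3.S3GeoTiffRDD
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    def readTemporal(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)] = {
      val options = S3GeoTiffRDD.Options(
        timeTag = "ISO_TIME",                  // hypothetical custom tiff tag
        timeFormat = "yyyy-MM-dd'T'HH:mm:ss")  // DateTimeFormatter pattern
      S3GeoTiffRDD.temporal(
        "my-bucket", "imagery/",
        uriToKey = (_: java.net.URI, key: TemporalProjectedExtent) => key,
        options)
    }
    ```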

  35. def temporal(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    options

    Options for the reading process, including the timestamp tiff tag and its pattern.

  36. def temporal(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, Tile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as singleband tiles. Will parse a timestamp from the default tiff tag to associate with each file.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

  37. def temporalMultiband(bucket: String, prefix: String, uriToKey: (URI, TemporalProjectedExtent) ⇒ TemporalProjectedExtent, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    uriToKey

    Function to transform the input key based on the URI information.

    options

    Options for the reading process, including the timestamp tiff tag and its pattern.

  38. def temporalMultiband(bucket: String, prefix: String, options: Options)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

    options

    Options for the reading process, including the timestamp tiff tag and its pattern.

  39. def temporalMultiband(bucket: String, prefix: String)(implicit sc: SparkContext): RDD[(TemporalProjectedExtent, MultibandTile)]

    Creates an RDD that will read all GeoTiffs in the given bucket and prefix as multiband tiles. Will parse a timestamp from the tiff tag specified in options to associate with each tile.

    bucket

    Name of the bucket on S3 where the files are kept.

    prefix

    Prefix of all of the keys on S3 that are to be read in.

  40. def toString(): String

    Definition Classes
    AnyRef → Any
  41. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  43. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
