accumulo

package accumulo

Type Members

class AccumuloLayerCopier extends LayerCopier[LayerId]
class AccumuloLayerManager extends LayerManager[LayerId]
class AccumuloLayerReader extends FilteringLayerReader[LayerId]
class AccumuloLayerReindexer extends LayerReindexer[LayerId]
class AccumuloLayerWriter extends LayerWriter[LayerId]
class AccumuloSparkLayerProvider extends AccumuloCollectionLayerProvider with LayerReaderProvider with LayerWriterProvider
Provides an AccumuloAttributeStore instance for a URI with the accumulo scheme, for example: accumulo://[user[:password]@]zookeeper/instance-name[?attributes=table1[&layers=table2]]
The attributes table name is optional; if it is not provided, a default value is used. The layers table name is required to instantiate a LayerWriter.
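To make the URI layout above concrete, the following minimal sketch splits such a URI into its parts using only java.net.URI. The AccumuloUriParts type and its parse method are hypothetical names introduced for illustration; they are not part of this package, and the provider's own parsing may differ.

```scala
import java.net.URI

// Hypothetical holder for the pieces of an accumulo:// URI; not a GeoTrellis type.
final case class AccumuloUriParts(
  user: Option[String],
  password: Option[String],
  zookeeper: String,
  instanceName: String,
  attributesTable: Option[String],
  layersTable: Option[String]
)

object AccumuloUriParts {
  def parse(s: String): AccumuloUriParts = {
    val uri = new URI(s)
    require(uri.getScheme == "accumulo", s"expected accumulo scheme, got: ${uri.getScheme}")

    // getUserInfo returns "user" or "user:password", or null when absent.
    val (user, password): (Option[String], Option[String]) =
      Option(uri.getUserInfo) match {
        case None => (None, None)
        case Some(info) =>
          info.split(":", 2) match {
            case Array(u, p) => (Some(u), Some(p))
            case Array(u)    => (Some(u), None)
          }
      }

    // getQuery returns "k1=v1&k2=v2", or null when absent.
    val params: Map[String, String] =
      Option(uri.getQuery).toList
        .flatMap(_.split("&").toList)
        .map(_.split("=", 2))
        .collect { case Array(k, v) => k -> v }
        .toMap

    AccumuloUriParts(
      user,
      password,
      zookeeper = uri.getHost,
      instanceName = uri.getPath.stripPrefix("/"),
      attributesTable = params.get("attributes"),
      layersTable = params.get("layers")
    )
  }
}
```

A URI without a layers parameter parses with layersTable empty, which corresponds to the case where a LayerWriter cannot be instantiated.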
sealed trait AccumuloWriteStrategy extends Serializable
case class HdfsWriteStrategy(ingestPath: Path) extends AccumuloWriteStrategy with Product with Serializable
This strategy performs an Accumulo bulk ingest. Bulk ingest requires that sorted records be written to a filesystem, preferably HDFS, before Accumulo can ingest them. After the ingest finishes, the nodes will likely go through a period of high load as they perform major compactions.
Note: Giving a relative URL causes HDFS to resolve it against the fs.defaultFS property in core-site.xml. If that property is not set, it defaults to the local filesystem ('file:/'), which is undesirable.
ingestPath
Path where Spark will write RDD records for ingest
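One way to guard against the relative-URL pitfall noted above is to check that the ingest path names its filesystem explicitly before it is handed to HdfsWriteStrategy. The sketch below uses only java.net.URI; IngestPathCheck and the namenode host and port in the usage note are hypothetical and exist only for illustration.

```scala
import java.net.URI

// Hypothetical helper; not part of this package.
object IngestPathCheck {
  // True when the path string carries an explicit scheme such as "hdfs".
  def hasExplicitScheme(path: String): Boolean =
    Option(new URI(path).getScheme).isDefined

  // Returns the path unchanged, or fails fast if it would fall back to fs.defaultFS.
  def requireQualified(path: String): String = {
    require(
      hasExplicitScheme(path),
      s"ingest path '$path' has no scheme and would be resolved against fs.defaultFS"
    )
    path
  }
}
```

For example, IngestPathCheck.requireQualified("hdfs://namenode:8020/geotrellis-ingest") returns the string unchanged, while "/geotrellis-ingest" is rejected with an IllegalArgumentException.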
class SocketWriteStrategy extends AccumuloWriteStrategy
This strategy creates one BatchWriter per partition and attempts to stream the records to the target tablets. In order to gain some parallelism, this strategy will create a number of splits in the target table equal to the number of tservers in the cluster. This is suitable for smaller ingests, or where HdfsWriteStrategy is otherwise not possible.
This strategy will not create splits before starting to write. If you wish to do that, use AccumuloUtils.getSplits first.
There is a problem in Accumulo 1.6 (fixed in 1.7) where split creation does not wait for the resulting empty tablets to be distributed through the cluster before returning. This creates a warm-up period during which the pressure of the ingest writers on that node delays tablet re-balancing.
The speed of the ingest can be improved by setting tserver.wal.sync.method=hflush in the Accumulo shell. Note: this introduces a higher chance of data loss due to sudden node failure.
BatchWriter is notified of tablet migrations and will follow them around the cluster.

Value Members

object AccumuloLayerCopier
object AccumuloLayerMover
object AccumuloLayerReader
object AccumuloLayerReindexer
object AccumuloLayerWriter
object AccumuloRDDReader
object AccumuloRDDWriter
object AccumuloWriteStrategy extends Serializable
object HdfsWriteStrategy extends Serializable
object SocketWriteStrategy extends Serializable
