case class HdfsWriteStrategy(ingestPath: Path) extends AccumuloWriteStrategy with Product with Serializable
This strategy performs an Accumulo bulk ingest. Bulk ingest requires that sorted records be written to the filesystem, preferably HDFS, before Accumulo can ingest them. After the ingest is finished, the nodes will likely go through a period of high load as they perform major compactions.
Note: Giving a relative URL will cause HDFS to resolve it against the fs.defaultFS property in core-site.xml. If that property is not specified, the path defaults to the local filesystem ('file:/'), which is undesirable; prefer an absolute HDFS URI, as in the sketch below.
- ingestPath
Path where Spark will write RDD records for ingest
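A minimal construction sketch (the geotrellis.spark.io.accumulo package path and the NameNode address are assumptions for illustration):

  import org.apache.hadoop.fs.Path
  import geotrellis.spark.io.accumulo.HdfsWriteStrategy

  // Use an absolute HDFS URI so path resolution never falls back to the
  // local filesystem through fs.defaultFS; the address is hypothetical.
  val strategy = HdfsWriteStrategy(new Path("hdfs://namenode:8020/tmp/accumulo-ingest"))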
Linear Supertypes
Product, Equals, AccumuloWriteStrategy, Serializable, Serializable, AnyRef, Any
Instance Constructors
- new HdfsWriteStrategy(ingestPath: Path)
- ingestPath
Path where Spark will write RDD records for ingest
Value Members
- val ingestPath: Path
- def write(kvPairs: RDD[(Key, Value)], instance: AccumuloInstance, table: String): Unit
Requires that the RDD be pre-sorted
- Definition Classes
- HdfsWriteStrategy → AccumuloWriteStrategy
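A usage sketch, assuming an already-configured AccumuloInstance and a hypothetical table name; the essential step is sorting the RDD before calling write:

  import org.apache.accumulo.core.data.{Key, Value}
  import org.apache.spark.rdd.RDD
  import geotrellis.spark.io.accumulo.AccumuloInstance

  // write requires a pre-sorted RDD; Key implements Comparable[Key],
  // so Spark's sortByKey can supply the required ordering.
  def ingest(strategy: HdfsWriteStrategy,
             kvPairs: RDD[(Key, Value)],
             instance: AccumuloInstance): Unit =
    strategy.write(kvPairs.sortByKey(), instance, "tiles") // table name is hypothetical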