class SocketWriteStrategy extends AccumuloWriteStrategy
This strategy will create one BatchWriter per partition and attempt to stream the records to the target tablets. In order to gain some parallelism, this strategy will create a number of splits in the target table equal to the number of tservers in the cluster. This is suitable for smaller ingests, or where HdfsWriteStrategy is otherwise not possible.
This strategy will not create splits before starting to write. If you wish to do that use AccumuloUtils.getSplits first.
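The pre-splitting step mentioned above can be sketched with the plain Accumulo client API. This is only an illustration under stated assumptions: the split points are taken as already computed (for example by AccumuloUtils.getSplits, whose signature is not shown on this page), and `PreSplitSketch`, `toSplitSet` and `preSplit` are hypothetical names, not part of this library.

```scala
import java.util.TreeSet
import org.apache.accumulo.core.client.Connector
import org.apache.hadoop.io.Text

// Hypothetical helper object, not part of the documented API.
object PreSplitSketch {
  /** Builds the sorted set of row keys that TableOperations.addSplits expects. */
  def toSplitSet(splitPoints: Seq[String]): TreeSet[Text] = {
    val set = new TreeSet[Text]()
    splitPoints.foreach(p => set.add(new Text(p)))
    set
  }

  /** Adds the given split points to an existing table before any records are written. */
  def preSplit(connector: Connector, table: String, splitPoints: Seq[String]): Unit =
    connector.tableOperations().addSplits(table, toSplitSet(splitPoints))
}
```

Calling `preSplit` before `write` gives the BatchWriters several tablets to target from the start, subject to the Accumulo 1.6 re-balancing caveat described in this section.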
There is a problem in Accumulo 1.6 (fixed in 1.7) where split creation returns without waiting for the resulting empty tablets to be distributed through the cluster. This creates a warm-up period during which the pressure from the ingest writers on that node delays tablet re-balancing.
The speed of the ingest can be improved by setting tserver.wal.sync.method=hflush in the Accumulo shell.
Note: this introduces a higher chance of data loss due to sudden node failure.
The BatchWriter is notified of tablet migrations and will follow the tablets around the cluster.
Instance Constructors
- new SocketWriteStrategy(config: BatchWriterConfig = ..., runtime: ⇒ IORuntime = IORuntimeTransient.IORuntime)
- config
Configuration for the BatchWriters
Value Members
- val kwConfig: KryoWrapper[BatchWriterConfig]
- def write(kvPairs: RDD[(Key, Value)], instance: AccumuloInstance, table: String): Unit
- Definition Classes
- SocketWriteStrategy → AccumuloWriteStrategy
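A minimal usage sketch of the constructor and write member listed above. The GeoTrellis package paths in the imports are assumptions and may differ between versions; `SocketIngestSketch`, `writerConfig` and `ingest` are hypothetical names, and the tuning values are illustrative rather than recommended settings. The default runtime argument is left untouched.

```scala
import java.util.concurrent.TimeUnit
import org.apache.accumulo.core.client.BatchWriterConfig
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.spark.rdd.RDD
// Assumed package paths; adjust to the GeoTrellis version in use.
import geotrellis.store.accumulo.AccumuloInstance
import geotrellis.spark.store.accumulo.SocketWriteStrategy

// Hypothetical wrapper object, not part of the documented API.
object SocketIngestSketch {
  /** Illustrative BatchWriter tuning; the values are placeholders. */
  def writerConfig(): BatchWriterConfig =
    new BatchWriterConfig()
      .setMaxMemory(64L * 1024 * 1024)      // client-side buffer, in bytes
      .setMaxWriteThreads(4)                // threads sending to tservers
      .setMaxLatency(30, TimeUnit.SECONDS)  // flush at least this often

  /** Streams the pairs to the table, one BatchWriter per RDD partition. */
  def ingest(kvPairs: RDD[(Key, Value)], instance: AccumuloInstance, table: String): Unit = {
    val strategy = new SocketWriteStrategy(writerConfig())
    strategy.write(kvPairs, instance, table)
  }
}
```

Passing an explicit BatchWriterConfig is optional, since the constructor supplies a default for config.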