@InterfaceAudience.Public @InterfaceStability.Unstable public final class AsyncKuduScanner extends Object
This class is not synchronized as it's expected to be
used from a single thread at a time. It's rarely (if ever?) useful to
scan concurrently from a shared scanner using multiple threads. If you
want to optimize large table scans using extra parallelism, create a few
scanners through the KuduScanToken
API. Or use MapReduce.
There's no method in this class to explicitly open the scanner. It will open
itself automatically when you start scanning by calling nextRows().
Also, the scanner will automatically call close() when it reaches the
end key. If, however, you would like to stop scanning before reaching the
end key, you must call close() before disposing of the scanner.
Note that it's always safe to call close() on a scanner.
An AsyncKuduScanner is not reusable. Should you want to scan the same rows
or the same table again, you must create a new one.
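The lifecycle rules above (implicit open on the first nextRows(), automatic close at the end of the scan, explicit close() on an early stop, no reuse) can be sketched with a small stand-in. Everything in this block is an assumption made for illustration: FakeScanner and ScanLifecycleDemo are hypothetical classes, not part of Kudu, and java.util.concurrent.CompletableFuture stands in for com.stumbleupon.async.Deferred so that the sketch runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a scanner; the "table" is an in-memory list of batches. */
final class FakeScanner {
  private final Iterator<List<String>> batches;
  private boolean closed = false;

  FakeScanner(List<List<String>> data) {
    this.batches = data.iterator();
  }

  boolean hasMoreRows() {
    return !closed && batches.hasNext();
  }

  boolean isClosed() {
    return closed;
  }

  /** There is no open(): the first call to nextRows() starts the scan. */
  CompletableFuture<List<String>> nextRows() {
    if (closed) {
      // The real class documents further calls as undefined; this stand-in fails fast.
      throw new IllegalStateException("scanner is closed; create a new one");
    }
    List<String> batch =
        batches.hasNext() ? batches.next() : Collections.<String>emptyList();
    if (!batches.hasNext()) {
      closed = true; // end of scan reached: the scanner closes itself
    }
    return CompletableFuture.completedFuture(batch);
  }

  /** Always safe to call, including on an already closed scanner. */
  CompletableFuture<Void> close() {
    closed = true;
    return CompletableFuture.completedFuture(null);
  }
}

final class ScanLifecycleDemo {
  /** Collects rows until stopRow is seen (exclusive) or the scan ends. */
  static List<String> scanUntil(FakeScanner scanner, String stopRow) {
    List<String> out = new ArrayList<>();
    try {
      while (scanner.hasMoreRows()) {
        for (String row : scanner.nextRows().join()) {
          if (row.equals(stopRow)) {
            return out; // early stop: the finally block closes the scanner
          }
          out.add(row);
        }
      }
      return out;
    } finally {
      scanner.close().join(); // harmless if the scanner already closed itself
    }
  }

  public static void main(String[] args) {
    List<List<String>> data =
        Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c", "d"));
    // A new scanner is built for every scan because scanners are not reusable.
    System.out.println(scanUntil(new FakeScanner(data), "b"));
    System.out.println(scanUntil(new FakeScanner(data), "no-such-row"));
  }
}
```

Blocking with join() keeps the sketch short; asynchronous code would chain callbacks on the returned deferred instead.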
A note about passing byte arrays in argument
None of the methods that receive a byte[] in argument will copy it.
For more info, please refer to the documentation of KuduRpc.
A note about passing Strings in argument

Modifier and Type | Class and Description |
---|---|
static class | AsyncKuduScanner.AsyncKuduScannerBuilder: A Builder class to build AsyncKuduScanner. |
static class | AsyncKuduScanner.ReadMode: The possible read modes for scanners. |
static class | AsyncKuduScanner.RowDataFormat: Expected row data format in scanner result set. |

Modifier and Type | Method and Description |
---|---|
com.stumbleupon.async.Deferred<RowResultIterator> | close(): Closes this scanner (don't forget to call this when you're done with it!). |
long | getBatchSizeBytes(): Returns the maximum number of bytes returned by the scanner, on each batch. |
boolean | getCacheBlocks(): Returns if this scanner was configured to cache data blocks or not. |
long | getKeepAlivePeriodMs() |
long | getLimit(): Returns the maximum number of rows that this scanner was configured to return. |
Schema | getProjectionSchema(): Returns the projection schema of this scanner. |
AsyncKuduScanner.ReadMode | getReadMode(): Returns the ReadMode for this scanner. |
ResourceMetrics | getResourceMetrics(): Returns the ResourceMetrics for this scanner. |
long | getScanRequestTimeout(): Returns the scan request timeout for this scanner. |
boolean | hasMoreRows(): Tells if the last rpc returned that there might be more rows to scan. |
boolean | isClosed() |
com.stumbleupon.async.Deferred<Void> | keepAlive(): Keep the current remote scanner alive. |
com.stumbleupon.async.Deferred<RowResultIterator> | nextRows(): Scans a number of rows. |
void | setReuseRowResult(boolean reuseRowResult): If set to true, the RowResult object returned by the RowResultIterator will be reused with each call to Iterator.next(). |
void | setRowDataFormat(AsyncKuduScanner.RowDataFormat rowDataFormat): Optionally set expected row data format. |
String | toString() |

public long getLimit()
public boolean hasMoreRows()
public boolean getCacheBlocks()
public long getBatchSizeBytes()
public AsyncKuduScanner.ReadMode getReadMode()
public long getScanRequestTimeout()
public Schema getProjectionSchema()
public long getKeepAlivePeriodMs()
public ResourceMetrics getResourceMetrics()
Returns the ResourceMetrics for this scanner. These metrics are updated with each batch of rows returned from the server.
public void setReuseRowResult(boolean reuseRowResult)
If set to true, the RowResult object returned by the RowResultIterator will be reused with each call to Iterator.next().
This can be a useful optimization to reduce the number of objects created.
Note: DO NOT use this if the RowResult is stored between calls to next().
Enabling this optimization means that a call to next() mutates the previously returned RowResult. Accessing a previously returned RowResult after a call to next(), for example by storing all RowResults in a collection and reading them later, will show every stored RowResult holding the data of the last RowResult returned.
public void setRowDataFormat(AsyncKuduScanner.RowDataFormat rowDataFormat)
Optionally set expected row data format.
rowDataFormat - Row data format to be expected.
public com.stumbleupon.async.Deferred<RowResultIterator> nextRows()
Scans a number of rows.
Once this method returns null (which indicates that this scanner is done scanning), calling it again leads to undefined behavior.
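The setReuseRowResult(boolean) warning is easiest to see in a small reproduction. The sketch below does not use Kudu at all: FakeRow, FakeRowIterator, and ReuseDemo are hypothetical stand-ins, written only to show how one shared, mutated row object behaves when references to it are stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical mutable row, standing in for RowResult. */
final class FakeRow {
  private String value;

  FakeRow(String value) {
    this.value = value;
  }

  void set(String newValue) {
    this.value = newValue;
  }

  String get() {
    return value;
  }
}

/** Hypothetical iterator, standing in for RowResultIterator. */
final class FakeRowIterator implements Iterator<FakeRow> {
  private final Iterator<String> source;
  private final boolean reuse;
  private final FakeRow shared = new FakeRow(null);

  FakeRowIterator(List<String> values, boolean reuse) {
    this.source = values.iterator();
    this.reuse = reuse;
  }

  @Override
  public boolean hasNext() {
    return source.hasNext();
  }

  @Override
  public FakeRow next() {
    String value = source.next();
    if (reuse) {
      shared.set(value); // mutates the row handed out by the previous call
      return shared;
    }
    return new FakeRow(value); // one fresh object per row
  }
}

final class ReuseDemo {
  /** Unsafe with reuse enabled: keeps references to the single shared row. */
  static List<String> storeThenRead(Iterator<FakeRow> rows) {
    List<FakeRow> kept = new ArrayList<>();
    while (rows.hasNext()) {
      kept.add(rows.next());
    }
    List<String> out = new ArrayList<>();
    for (FakeRow row : kept) {
      out.add(row.get());
    }
    return out;
  }

  /** Safe with reuse enabled: copies the value out before the next call to next(). */
  static List<String> copyWhileReading(Iterator<FakeRow> rows) {
    List<String> out = new ArrayList<>();
    while (rows.hasNext()) {
      out.add(rows.next().get());
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> data = Arrays.asList("a", "b", "c");
    System.out.println(storeThenRead(new FakeRowIterator(data, true)));    // prints [c, c, c]
    System.out.println(copyWhileReading(new FakeRowIterator(data, true))); // prints [a, b, c]
  }
}
```

The practical rule that follows: with reuse enabled, extract the column values you need inside the loop body and store those, never the row object itself.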
public boolean isClosed()
public com.stumbleupon.async.Deferred<RowResultIterator> close()
Closes this scanner (don't forget to call this when you're done with it!).
Closing an already closed scanner has no effect; the returned deferred is called back immediately.
The Object passed to the deferred's callback can be null, a RowResultIterator if there was data left in the scanner, or an Exception.
public com.stumbleupon.async.Deferred<Void> keepAlive()
Keep the current remote scanner alive on the Tablet server for an additional time-to-live. This is useful if the interval in between nextRows() calls is big enough that the remote scanner might be garbage collected. The scanner time-to-live can be configured on the tablet server via the --scanner_ttl_ms configuration flag and has a default of 60 seconds.
This does not invalidate any previously fetched results.
Note that an error returned by this method should not be taken as indication that the scan has failed. Subsequent calls to nextRows() might still be successful, particularly if the scanner is configured to be fault tolerant.
IllegalStateException - if the scanner is already closed.
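When processing between nextRows() calls is slow, one way to apply keepAlive() is to check, from the scanning thread itself, how long it has been since the last request and send a keep-alive once a chosen period has elapsed. The sketch below shows only that pacing decision. KeepAlivePacer is a hypothetical helper, not part of Kudu; the Runnable stands in for whatever code issues keepAlive() and waits for its result, and the 15-second period is an assumed value chosen to sit well below the 60-second default time-to-live mentioned above.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper that decides when a keep-alive is due. */
final class KeepAlivePacer {
  private final long periodMs;
  private final LongSupplier clockMs;
  private long lastContactMs;

  KeepAlivePacer(long periodMs, LongSupplier clockMs) {
    if (periodMs <= 0) {
      throw new IllegalArgumentException("periodMs must be positive");
    }
    this.periodMs = periodMs;
    this.clockMs = clockMs;
    this.lastContactMs = clockMs.getAsLong();
  }

  /** Call after every request sent for the scanner, such as a nextRows() call. */
  void markContact() {
    lastContactMs = clockMs.getAsLong();
  }

  /**
   * Runs the keep-alive action if at least periodMs has passed since the last
   * contact. Returns true if the action ran.
   */
  boolean maybeKeepAlive(Runnable keepAliveAction) {
    if (clockMs.getAsLong() - lastContactMs < periodMs) {
      return false;
    }
    keepAliveAction.run();
    markContact();
    return true;
  }

  public static void main(String[] args) {
    // Assumed period: 15 s, well under the 60 s default scanner TTL.
    KeepAlivePacer pacer = new KeepAlivePacer(15_000L, System::currentTimeMillis);
    boolean sent = pacer.maybeKeepAlive(() -> System.out.println("keep-alive sent"));
    System.out.println("sent immediately after construction: " + sent); // prints false
  }
}
```

Calling maybeKeepAlive from the same thread that drives the scan keeps the scanner single-threaded, in line with the class-level note that it is not synchronized. Because a keep-alive error does not mean the scan has failed, the action passed in would typically log such an error and let the next nextRows() call proceed.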