Package org.apache.spark.sql.vectorized
Class ColumnarBatch
Object
org.apache.spark.sql.vectorized.ColumnarBatch
- All Implemented Interfaces:
- AutoCloseable
This class wraps multiple ColumnVectors as a row-wise table. It provides a row view of this
 batch so that Spark can access the data row by row. An instance is meant to be reused during
 the entire data loading process. A data source may extend this class with customized logic.
- 
Constructor Summary
Constructors
- ColumnarBatch(ColumnVector[] columns)
- ColumnarBatch(ColumnVector[] columns, int numRows): Create a new batch from existing column vectors.
- 
Method Summary
Modifier and Type, Method: Description
- void close(): Called to close all the columns in this batch.
- void closeIfFreeable(): Called to close all the columns if their resources are freeable between batches.
- ColumnVector column(int ordinal): Returns the column at `ordinal`.
- org.apache.spark.sql.catalyst.InternalRow getRow(int rowId): Returns the row in this batch at `rowId`.
- int numCols(): Returns the number of columns that make up this batch.
- int numRows(): Returns the number of rows for read, including filtered rows.
- Iterator<org.apache.spark.sql.catalyst.InternalRow> rowIterator(): Returns an iterator over the rows in this batch.
- void setNumRows(int numRows): Sets the number of rows in this batch.
- 
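Constructor Details
ColumnarBatch
public ColumnarBatch(ColumnVector[] columns)
ColumnarBatch
public ColumnarBatch(ColumnVector[] columns, int numRows)
Create a new batch from existing column vectors.
Parameters:
- columns - The columns of this batch
- numRows - The number of rows in this batch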
- 
Method Details
close
public void close()
Called to close all the columns in this batch. It is not valid to access the data after calling this. This must be called at the end to clean up memory allocations.
Specified by:
- close in interface AutoCloseable
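Because the class implements AutoCloseable, a try-with-resources statement is one way to make sure close() runs at the end. The sketch below is not Spark code: `TrackedBatch` and `CloseSketch` are hypothetical stand-ins that model only the close() contract, and the exception thrown on access after close is an assumption of the stand-in (the documentation above only states that such access is not valid).

```java
// Hypothetical stand-in, not the Spark class: models only the close() contract.
class TrackedBatch implements AutoCloseable {
    private final int[] values;
    private boolean closed = false;

    TrackedBatch(int[] values) { this.values = values; }

    int get(int rowId) {
        // Assumption: the stand-in fails fast on access after close.
        if (closed) throw new IllegalStateException("batch is closed");
        return values[rowId];
    }

    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; }
}

class CloseSketch {
    // Sums the data into sumOut[0] and returns the batch so callers can check it was closed.
    static TrackedBatch sumAndClose(int[] data, long[] sumOut) {
        TrackedBatch created = new TrackedBatch(data);
        try (TrackedBatch batch = created) {
            for (int i = 0; i < data.length; i++) {
                sumOut[0] += batch.get(i);
            }
        } // close() runs here, even if the loop throws
        return created;
    }

    public static void main(String[] args) {
        long[] sum = new long[1];
        TrackedBatch batch = sumAndClose(new int[] {1, 2, 3}, sum);
        System.out.println(sum[0] + " closed=" + batch.isClosed()); // 6 closed=true
    }
}
```

Since an instance is meant to be reused across the whole loading process, a pattern like this fits best around the full lifetime of the batch rather than around each read.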
 
- 
closeIfFreeable
public void closeIfFreeable()
Called to close all the columns if their resources are freeable between batches. This is used to clean up memory allocated during columnar processing.
- 
rowIterator
public Iterator<org.apache.spark.sql.catalyst.InternalRow> rowIterator()
Returns an iterator over the rows in this batch.
- 
setNumRows
public void setNumRows(int numRows)
Sets the number of rows in this batch.
- 
numCols
public int numCols()
Returns the number of columns that make up this batch.
- 
numRows
public int numRows()
Returns the number of rows for read, including filtered rows.
- 
column
public ColumnVector column(int ordinal)
Returns the column at `ordinal`.
- 
getRow
public org.apache.spark.sql.catalyst.InternalRow getRow(int rowId)
Returns the row in this batch at `rowId`. The returned row is reused across calls.
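The reuse of the returned row has a practical consequence: holding on to a row reference across calls to getRow does not preserve that row's values. The sketch below illustrates this with hypothetical stand-in classes (`RowView`, `ReusingBatch`, `RowReuseSketch`), not the Spark classes; it assumes a single mutable row object that is repositioned on each call, which is one plausible way such reuse can work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, not Spark classes: they model only the reuse behavior.
class RowView {
    private final int[] column;
    int rowId; // repositioned on every getRow call

    RowView(int[] column) { this.column = column; }

    int getInt() { return column[rowId]; }
}

class ReusingBatch {
    private final RowView row; // the single shared row object
    private final int numRows;

    ReusingBatch(int[] column) {
        this.row = new RowView(column);
        this.numRows = column.length;
    }

    int numRows() { return numRows; }

    RowView getRow(int rowId) {
        row.rowId = rowId;
        return row; // same object every time
    }
}

class RowReuseSketch {
    // Pitfall: every retained reference is the same object, left at the last row.
    static List<Integer> readViaRetainedRows(ReusingBatch batch) {
        List<RowView> kept = new ArrayList<>();
        for (int i = 0; i < batch.numRows(); i++) {
            kept.add(batch.getRow(i));
        }
        List<Integer> out = new ArrayList<>();
        for (RowView r : kept) {
            out.add(r.getInt());
        }
        return out;
    }

    // Safe: extract the value before the next getRow call.
    static List<Integer> readEagerly(ReusingBatch batch) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < batch.numRows(); i++) {
            out.add(batch.getRow(i).getInt());
        }
        return out;
    }

    public static void main(String[] args) {
        ReusingBatch batch = new ReusingBatch(new int[] {10, 20, 30});
        System.out.println(readViaRetainedRows(batch)); // [30, 30, 30]
        System.out.println(readEagerly(batch));         // [10, 20, 30]
    }
}
```

With the real class, the equivalent precaution is to read the needed values, or take a copy of the row, before requesting the next one.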
 
-