Class VectorizedPageIterator
java.lang.Object
org.apache.iceberg.parquet.BasePageIterator
org.apache.iceberg.arrow.vectorized.parquet.VectorizedPageIterator
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.iceberg.parquet.BasePageIterator
BasePageIterator.IntIterator
-
Field Summary
Fields inherited from class org.apache.iceberg.parquet.BasePageIterator
currentDL, currentRL, definitionLevels, desc, dictionary, hasNext, page, repetitionLevels, triplesCount, triplesRead, valueEncoding, values, writerVersion
-
Constructor Summary
Constructors
Constructor    Description
VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, String writerVersion, boolean setValidityVector)
-
Method Summary
Modifier and Type    Method    Description
protected void    initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
protected void    initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount)
protected void    initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc)
int    nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
    Method for reading a batch of dictionary ids from the dictionary encoded data pages.
boolean    producesDictionaryEncodedVector()
protected void    reset()
void    setAllPagesDictEncoded(boolean allDictEncoded)
Methods inherited from class org.apache.iceberg.parquet.BasePageIterator
currentPageCount, hasNext, initFromPage, initFromPage, initRepetitionLevelsReader, initRepetitionLevelsReader, setDictionary, setPage
-
Constructor Details
-
VectorizedPageIterator
public VectorizedPageIterator(org.apache.parquet.column.ColumnDescriptor desc, String writerVersion, boolean setValidityVector)
-
-
Method Details
-
setAllPagesDictEncoded
public void setAllPagesDictEncoded(boolean allDictEncoded)
-
reset
protected void reset()
Overrides:
reset in class BasePageIterator
-
initDataReader
protected void initDataReader(org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.ByteBufferInputStream in, int valueCount)
Specified by:
initDataReader in class BasePageIterator
-
producesDictionaryEncodedVector
public boolean producesDictionaryEncodedVector()
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV1 dataPageV1, org.apache.parquet.column.ColumnDescriptor desc, org.apache.parquet.bytes.ByteBufferInputStream in, int triplesCount) throws IOException
Specified by:
initDefinitionLevelsReader in class BasePageIterator
Throws:
IOException
-
initDefinitionLevelsReader
protected void initDefinitionLevelsReader(org.apache.parquet.column.page.DataPageV2 dataPageV2, org.apache.parquet.column.ColumnDescriptor desc) throws IOException
Specified by:
initDefinitionLevelsReader in class BasePageIterator
Throws:
IOException
-
nextBatchDictionaryIds
public int nextBatchDictionaryIds(org.apache.arrow.vector.IntVector vector, int expectedBatchSize, int numValsInVector, NullabilityHolder holder)
Reads a batch of dictionary ids from dictionary-encoded data pages. Like definition levels, dictionary ids in Parquet are encoded with the RLE/bit-packing hybrid encoding.
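To make the encoding mentioned above concrete, the following is a minimal, self-contained sketch of how dictionary ids laid out in Parquet's RLE/bit-packing hybrid format could be decoded. It is not Iceberg's implementation and does not use this class: the name DictionaryIdDecoderSketch, its methods, and the hand-built sample bytes are hypothetical and exist only for illustration. It assumes the layout described in the Parquet format specification: one leading byte with the bit width, then runs that each start with a ULEB128 header whose lowest bit selects an RLE run (0) or a bit-packed run (1).

```java
import java.util.Arrays;

/** Illustrative only: decodes dictionary ids from an RLE/bit-packing hybrid byte sequence. */
final class DictionaryIdDecoderSketch {
  private DictionaryIdDecoderSketch() {}

  /** Assumes the first byte is the bit width and the runs follow it. */
  static int[] readDictionaryIds(byte[] page, int count) {
    int bitWidth = page[0] & 0xFF;
    return decodeHybrid(page, 1, bitWidth, count);
  }

  /** Decodes exactly {@code count} values starting at {@code offset}. */
  static int[] decodeHybrid(byte[] data, int offset, int bitWidth, int count) {
    int[] out = new int[count];
    int pos = offset;
    int filled = 0;
    int valueBytes = (bitWidth + 7) / 8; // bytes used by one RLE value

    while (filled < count) {
      // Run header: unsigned LEB128 varint.
      int header = 0;
      int shift = 0;
      int b;
      do {
        b = data[pos++] & 0xFF;
        header |= (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);

      if ((header & 1) == 0) {
        // RLE run: (header >>> 1) copies of one little-endian value.
        int runLength = header >>> 1;
        int value = 0;
        for (int i = 0; i < valueBytes; i++) {
          value |= (data[pos++] & 0xFF) << (8 * i);
        }
        int n = Math.min(runLength, count - filled);
        Arrays.fill(out, filled, filled + n, value);
        filled += n;
      } else {
        // Bit-packed run: (header >>> 1) groups of 8 values, packed LSB first.
        int groups = header >>> 1;
        int runStart = pos;
        for (int i = 0; i < groups * 8 && filled < count; i++) {
          int value = 0;
          for (int j = 0; j < bitWidth; j++) {
            int k = i * bitWidth + j; // bit index within this run
            int bit = ((data[runStart + (k >>> 3)] & 0xFF) >>> (k & 7)) & 1;
            value |= bit << j;
          }
          out[filled++] = value;
        }
        // Each group of 8 values occupies exactly bitWidth bytes.
        pos = runStart + groups * bitWidth;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Bit width 3; an RLE run of five 2s; one bit-packed group holding 0..7.
    byte[] page = {3, 0x0A, 0x02, 0x03, (byte) 0x88, (byte) 0xC6, (byte) 0xFA};
    // prints [2, 2, 2, 2, 2, 0, 1, 2, 3, 4, 5, 6, 7]
    System.out.println(Arrays.toString(readDictionaryIds(page, 13)));
  }
}
```

The sketch decodes value by value to keep the bit arithmetic visible. A vectorized reader would presumably process whole runs at a time and write them into an Arrow vector, but that part is outside what this page documents.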
-