We often need to read a Parquet file, its metadata, or its footer. The parquet-tools utility, which ships with the parquet-hadoop library, can help with that. The steps below build parquet-tools and demonstrate its use.

Prerequisites: Maven 3, Git, JDK 7/8
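
Before building, it can help to confirm that the prerequisites are on the PATH. The following is a minimal sketch, not part of parquet-tools: the function name missing_tools is hypothetical, and it only checks that each command exists, not which version is installed.

```shell
# missing_tools NAME...: print each command name that is not found on PATH.
# Hypothetical helper for illustration; it does not check versions.
# Returns 0 when every command is available, 1 otherwise.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}
```

After defining the function, `missing_tools mvn git java` prints a line for each prerequisite that is not installed and stays silent when all three are present.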

// Build parquet-tools

git clone https://github.com/Parquet/parquet-mr.git 
cd parquet-mr/parquet-tools/ 
mvn clean package -Plocal 
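
The commands in the following steps refer to parquet-tools-1.6.0.jar, but the version in the jar name depends on the revision that was checked out, and Maven by default writes build output to a target/ subdirectory. A small sketch for locating the built jar is shown here; the function name find_tools_jar is hypothetical, and the assumption is that the jar is named parquet-tools-<version>.jar.

```shell
# find_tools_jar DIR: print the path of the first parquet-tools jar in DIR/target.
# Hypothetical helper; assumes Maven's default target/ output directory and a
# jar named parquet-tools-<version>.jar. Returns 1 when no jar is found.
find_tools_jar() {
  dir="${1:-.}"
  for jar in "$dir"/target/parquet-tools-*.jar; do
    # When the glob matches nothing it stays literal, so test for a real file.
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}
```

From the parquet-mr/parquet-tools/ directory, `jar=$(find_tools_jar .)` stores the path, which can then be passed to `java -jar "$jar"` in place of the hard-coded file name. If the build also produces additional jars with the same prefix, the first match may not be the runnable one, so check the printed path.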

// Show the schema of the Parquet file

java -jar parquet-tools-1.6.0.jar schema sample.parquet 

// Read the contents of the Parquet file

java -jar parquet-tools-1.6.0.jar cat sample.parquet 

// Read the first few records of the Parquet file

java -jar parquet-tools-1.6.0.jar head -n5 sample.parquet 

// Show the metadata of the Parquet file

java -jar parquet-tools-1.6.0.jar meta sample.parquet