
We often need to read a Parquet file, its metadata, or its footer. parquet-tools, which ships with the parquet-hadoop library, can help with this. The steps below build parquet-tools and demonstrate its use.

Prerequisites: Maven 3, Git, JDK 7/8
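
Before building, it can help to confirm the prerequisites are on the PATH. The following is a minimal sketch, not part of parquet-tools: `missing_tools` is a hypothetical helper name, and it only checks that the `mvn`, `git`, and `java` commands exist, not their versions.

```shell
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools listed in the prerequisites (versions are not verified here).
missing=$(missing_tools mvn git java)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```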

// Build parquet-tools

git clone https://github.com/Parquet/parquet-mr.git 
cd parquet-mr/parquet-tools/ 
mvn clean package -Plocal 

// Show the schema of the Parquet file

java -jar parquet-tools-1.6.0.jar schema sample.parquet 

// Read the contents of the Parquet file

java -jar parquet-tools-1.6.0.jar cat sample.parquet 

// Read the first few records of the Parquet file

java -jar parquet-tools-1.6.0.jar head -n5 sample.parquet 

// Show the metadata of the Parquet file

java -jar parquet-tools-1.6.0.jar meta sample.parquet
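
The commands above all share the same `java -jar` prefix, so a small shell function can save typing. This is only a sketch under stated assumptions: `pq`, `PARQUET_TOOLS_JAR`, and `DRY_RUN` are names invented for this example, and the jar is assumed to be in the current directory unless `PARQUET_TOOLS_JAR` points elsewhere.

```shell
# Hypothetical wrapper around the parquet-tools commands shown above.
# PARQUET_TOOLS_JAR and DRY_RUN are names chosen for this sketch only.
PARQUET_TOOLS_JAR="${PARQUET_TOOLS_JAR:-parquet-tools-1.6.0.jar}"

pq() {
    # Expect at least a subcommand and a file name.
    if [ "$#" -lt 2 ]; then
        echo "usage: pq <schema|cat|head|meta> [options] <file>" >&2
        return 2
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        echo java -jar "$PARQUET_TOOLS_JAR" "$@"
    else
        java -jar "$PARQUET_TOOLS_JAR" "$@"
    fi
}
```

With the function loaded, `pq schema sample.parquet` or `pq head -n5 sample.parquet` runs the corresponding command; setting `DRY_RUN=1` first prints the full command line without executing it.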
Last update: 12-23-2016 06:18 AM