
Converting CSV Files to Apache Hive Tables with Apache ORC Files

I received some CSV files of data to load into Apache Hive. There are many ways to do this, but I wanted to see how easy it was to do in Apache NiFi with zero code.


I read the CSV files from a directory, then convert the CSV to Apache Avro directly with ConvertRecord.


I will need a schema, so I use the settings below for InferAvroSchema. If every file is different, you will need to infer the schema again for each file.
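To make the idea of schema inference concrete, here is a minimal Python sketch of what inferring an Avro schema from a CSV header and its rows can look like. It is not how InferAvroSchema is implemented; the function names, the record name `csv_record`, and the sample data are all hypothetical, and the type guessing is deliberately simple (long, then double, then string).

```python
import csv
import io
import json


def infer_type(values):
    """Pick the narrowest Avro primitive type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the candidate types from narrowest to widest.
    for avro_type, cast in (("long", int), ("double", float)):
        try:
            for v in non_empty:
                cast(v)
            return avro_type
        except ValueError:
            continue
    return "string"


def infer_avro_schema(csv_text, record_name="csv_record"):
    """Build an Avro record schema (as a dict) from CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    fields = []
    for i, name in enumerate(header):
        column = [row[i] for row in data if i < len(row)]
        # Use a nullable union so that empty cells remain valid.
        fields.append(
            {"name": name.strip(), "type": ["null", infer_type(column)], "default": None}
        )
    return {"type": "record", "name": record_name, "fields": fields}


# Hypothetical sample data: the first line is the header.
sample = "id,name,price\n1,apple,0.5\n2,pear,\n"
schema = infer_avro_schema(sample)
print(json.dumps(schema, indent=2))
```

Because the types are guessed from the data, two files with the same header can still produce different schemas, which is why the inference has to be repeated when the files differ.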


CSV Reader


I use the Jackson CSV parser, which works very well. The first line of the CSV is a header, so the reader can work out the field names from it.

Once I have an Apache Avro file, it's easy to convert it to Apache ORC and then store it in HDFS.
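Once the ORC files are in HDFS, Hive still needs a table definition that points at them. As a rough sketch, the Python below turns an Avro-style schema into a Hive `CREATE EXTERNAL TABLE ... STORED AS ORC` statement. The schema, the table name `csv_data`, and the HDFS path `/tmp/csv_data_orc` are hypothetical placeholders, and the type mapping only covers Avro primitive types.

```python
# Mapping from Avro primitive types to Hive column types.
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def hive_type(avro_type):
    """Map an Avro type (or a nullable union such as ["null", "long"]) to a Hive type."""
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        avro_type = non_null[0] if non_null else "string"
    # Fall back to STRING for anything this sketch does not handle.
    return AVRO_TO_HIVE.get(avro_type, "STRING")


def hive_ddl(schema, table, location):
    """Build a CREATE EXTERNAL TABLE statement for ORC files stored at `location`."""
    columns = ",\n".join(
        f"  `{field['name']}` {hive_type(field['type'])}" for field in schema["fields"]
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n"
        f"{columns}\n"
        f")\n"
        f"STORED AS ORC\n"
        f"LOCATION '{location}'"
    )


# Hypothetical schema, table name, and HDFS directory.
example_schema = {
    "type": "record",
    "name": "csv_record",
    "fields": [
        {"name": "id", "type": ["null", "long"]},
        {"name": "name", "type": ["null", "string"]},
        {"name": "price", "type": ["null", "double"]},
    ],
}
ddl = hive_ddl(example_schema, "csv_data", "/tmp/csv_data_orc")
print(ddl)
```

An external table is used here so that dropping the table in Hive leaves the ORC files in HDFS untouched.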



Version history
Last update:
‎08-17-2019 07:38 AM