How can I convert any file into Parquet format using NiFi?

For example, I want to convert a CSV file to a Parquet file using Apache NiFi.

1 ACCEPTED SOLUTION

@Mohamed Ashraf

Following your question, I wrote an article on how to use PutParquet to convert data. Check it out to get a better understanding of the process.

https://community.hortonworks.com/articles/140422/convert-data-from-jsoncsvavro-to-parquet-with-nifi...

I hope this helps

8 REPLIES

@Mohamed Ashraf

Have you tried PutParquet? https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-parquet-nar/1.4.0/org.apache....

It's a record-based processor: you can use it to read a CSV file with a CSVReader and write the records out as Parquet.
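
To illustrate what "record-based" means here, the following is a minimal plain-Python sketch, not NiFi code. It shows the role a reader such as CSVReader plays: turning CSV text into typed records according to a schema, which a writer can then serialize in another format such as Parquet. The schema, field names, and sample data are made up for illustration.

```python
import csv
import io

# Hypothetical schema: field name -> Python type used to parse that CSV column.
SCHEMA = {"id": int, "name": str, "price": float}


def read_records(csv_text, schema):
    """Parse CSV text (with a header row) into a list of typed records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # Cast each column to the type the schema declares for it.
        records.append({field: cast(row[field]) for field, cast in schema.items()})
    return records


# Made-up sample input, standing in for the content of an incoming CSV file.
sample = "id,name,price\n1,apple,0.5\n2,pear,0.75\n"
records = read_records(sample, SCHEMA)
print(records)
```

In NiFi the same idea is configured rather than coded: the processor is given a record reader and a schema, and the conversion happens record by record.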

I hope this helps

@Abdelkrim Hadjidj

I tried it, but no conversion happens.

Master Guru

What do you mean by "no convert happens"? PutParquet should write the Parquet file(s) to the HDFS directory you configured in the processor. I believe the flow file transferred to the "success" relationship, once the converted file has been successfully written to HDFS, is the original incoming flow file, not the converted one. To get the converted content as a flow file, I imagine there would have to be a ParquetRecordSetWriter, and you'd use ConvertRecord instead of PutParquet.
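
If that understanding is right, it would explain why a downstream processor still sees CSV. The following toy model in plain Python sketches the behavior described in this reply; it is not NiFi's implementation. The `FlowFile` class, both functions, the directory names, and the `<parquet>` marker are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowFile:
    filename: str
    content: bytes


def put_parquet_model(flow_file, directory, storage):
    """Toy stand-in: store a 'converted' copy, pass the original on to success."""
    # Placeholder only: real Parquet encoding is not performed here.
    converted = b"<parquet>" + flow_file.content
    storage[directory + "/" + flow_file.filename] = converted
    return flow_file  # the ORIGINAL flow file continues downstream


def put_hdfs_model(flow_file, directory, storage):
    """Toy stand-in: store whatever content the incoming flow file carries."""
    storage[directory + "/" + flow_file.filename] = flow_file.content


storage = {}  # stands in for HDFS: path -> bytes
original = FlowFile("data.csv", b"id,name\n1,apple\n")

passed_on = put_parquet_model(original, "/parquet_dir", storage)
put_hdfs_model(passed_on, "/hdfs_dir", storage)

# The second directory receives the unconverted CSV bytes.
print(storage["/hdfs_dir/data.csv"] == original.content)
```

Under this model, the converted data is the copy in the first processor's own target directory, and anything chained after it simply re-stores the original content.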

I have a CSV file and I want to convert it to Parquet.

These are my steps:

1- GetFile

2- PutParquet

3- PutHDFS

After these steps, the file is put into HDFS but is not converted.

@Mohamed Ashraf

What configuration are you using? What error do you get?

@Abdelkrim Hadjidj

I have a CSV file and I want to convert it to Parquet.

I followed these steps, and the operation did not work.

1- GetFile

[Screenshots: 40768-get-file.png, 40769-get-file-pro.png]

2- PutParquet

[Screenshots: 40770-put-parquet.png, 40771-put-parquet-pro.png]

3- PutHDFS

[Screenshot: 40772-put-hdfs.png]

After these steps, the file is put into HDFS but is not converted.

@Mohamed Ashraf I can't test your scenario right now, but PutParquet should write the Parquet file directly to HDFS, so there is no need for PutHDFS.

What do you have in /user/nifi? Which directory have you configured in PutHDFS?
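
One way to answer that for a given file is to copy it to the local machine (for example with `hdfs dfs -get`) and look at its first and last bytes: Parquet files begin and end with the 4-byte magic `PAR1`. The sketch below is a quick sanity check in plain Python, not a full validation, and the file names and contents in the demo are made up.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    This is a necessary condition only; it does not validate the footer.
    """
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# Demo with made-up local files.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    with open(csv_path, "wb") as f:
        f.write(b"id,name\n1,apple\n")
    csv_is_parquet = looks_like_parquet(csv_path)

    # Not a real Parquet file: it only carries the magic bytes at both ends.
    fake_path = os.path.join(tmp, "fake.parquet")
    with open(fake_path, "wb") as f:
        f.write(PARQUET_MAGIC + b"\x00" * 16 + PARQUET_MAGIC)
    fake_passes = looks_like_parquet(fake_path)

print(csv_is_parquet, fake_passes)
```

A file that fails this check is certainly not Parquet; a file that passes should still be opened with a real Parquet reader to confirm it.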
