How to export parquet file to csv (without Hive)

Explorer

Hi,

I am developing a NiFi web service to export data lake content (stored as .parquet) as .csv.

I managed to do it using HiveQL Processor but I want to do it without Hive.

What I imagined was:

- get the .parquet file with WebHDFS (InvokeHTTP call from NiFi)

- use a NiFi processor to convert the .parquet file to .csv
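
For reference, the first step (fetching the file over WebHDFS) boils down to an HTTP GET with `op=OPEN` against the NameNode, which then redirects to a DataNode. Below is a minimal sketch in plain Python, outside NiFi, of the request that InvokeHTTP would need to issue. The host name, port, user, and file path are placeholders, not values from this thread, and the helper names `webhdfs_open_url` and `fetch_file` are made up for illustration.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build a WebHDFS OPEN URL for an absolute HDFS path.

    host, port and user are placeholders for your own cluster settings.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute")
    params = {"op": "OPEN"}
    if user:
        # Only meaningful on clusters using simple (non-Kerberos) authentication.
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


def fetch_file(url, dest):
    """Stream the response body to a local file.

    urlopen follows the redirect that the NameNode sends to a DataNode.
    """
    with urlopen(url) as resp, open(dest, "wb") as out:
        while chunk := resp.read(1 << 20):
            out.write(chunk)
```

Calling `webhdfs_open_url("namenode.example.com", 9870, "/data/lake/events.parquet")` yields the URL to put in InvokeHTTP's Remote URL property; the response is still raw Parquet, so the conversion step remains a separate problem.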

Is there a NiFi processor that does this? The only option I have found so far is a Spark job, which seems overly complicated for this purpose.

Thanks.

1 ACCEPTED SOLUTION

Master Guru

Currently there is nothing OOTB in NiFi that will parse Parquet files, but I have written NIFI-5455 to cover the addition of a ParquetReader, so that incoming Parquet files could be operated on like the other supported formats. As a workaround, there is a ScriptedReader, for which you could write your own reader in Groovy, JavaScript, Jython, etc.

10 REPLIES

Rising Star

Thank you very much @Bryan Bende. I need to insert each message into SQL Server. The database table has a clientno and a jsonmessage field. I think I have to use SplitRecord to get the clientno from the JSON and insert the whole JSON record into the jsonmessage field.
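
To make the intended mapping concrete, here is a rough sketch of the per-message logic in plain Python, not a NiFi flow. It assumes `clientno` is a top-level key in each JSON message. The table name `messages` and the sample payloads are invented for illustration, and an in-memory SQLite database stands in for SQL Server so the example can run on its own.

```python
import json
import sqlite3


def to_row(message):
    """Return (clientno, original JSON text) for one message.

    Assumes clientno is a top-level key; raises KeyError otherwise.
    """
    record = json.loads(message)
    return (record["clientno"], message)


# Hypothetical sample messages, one JSON document per message.
messages = [
    '{"clientno": "C001", "amount": 12.5}',
    '{"clientno": "C002", "amount": 7.0}',
]

# SQLite is only a stand-in here; the real target would be SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (clientno TEXT, jsonmessage TEXT)")
conn.executemany(
    "INSERT INTO messages (clientno, jsonmessage) VALUES (?, ?)",
    [to_row(m) for m in messages],
)
conn.commit()
```

Each row ends up holding the extracted `clientno` next to the untouched JSON text, which is the shape described above for the `jsonmessage` field.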