How to export parquet file to csv (without Hive)

Hi,

I am developing a NiFi web service to export data lake content (stored as .parquet) as .csv.

I managed to do it using a HiveQL processor, but I want to do it without Hive.

What I had in mind was:

- get the .parquet file with WebHDFS (InvokeHTTP call from NiFi)

- use a NiFi processor to convert the .parquet file to .csv

Is there a NiFi processor that does that? The only option I have found so far is to use a Spark job, which seems overly complicated for this purpose.
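
For illustration only, the two steps above can be sketched outside NiFi in Python. This is a rough sketch under stated assumptions, not a NiFi solution: the host name, port, and file path are placeholders, and the conversion step assumes the third-party pyarrow package is installed. The URL follows the WebHDFS REST convention of `/webhdfs/v1/<path>?op=OPEN`, which is the address an InvokeHTTP call would target (the NameNode normally answers with a redirect to a DataNode).

```python
import io
from urllib.parse import quote, urlencode


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build the WebHDFS URL that reads a file (op=OPEN).

    host, port and user are placeholders for your own cluster settings.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be an absolute HDFS path")
    params = {"op": "OPEN"}
    if user:
        # Only meaningful on clusters using simple (non-Kerberos) authentication.
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


def parquet_bytes_to_csv(parquet_bytes):
    """Convert the raw bytes of a Parquet file into CSV text.

    Assumes pyarrow is installed; it is imported lazily so the URL helper
    above still works without it.
    """
    import pyarrow.csv as pacsv
    import pyarrow.parquet as pq

    table = pq.read_table(io.BytesIO(parquet_bytes))
    out = io.BytesIO()
    pacsv.write_csv(table, out)  # writes a header row followed by the data rows
    return out.getvalue().decode("utf-8")
```

The bytes returned by the HTTP call would be passed to `parquet_bytes_to_csv`. Inside NiFi, the same conversion would still need a component that can read Parquet.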

Thanks.

1 ACCEPTED SOLUTION

Currently there is nothing out of the box in NiFi that will parse Parquet files, but I have written NIFI-5455 to cover the addition of a ParquetReader, so that incoming Parquet files can be processed like the other supported formats. As a workaround, there is a ScriptedReader, for which you could write your own reader in Groovy, JavaScript, Jython, etc.

10 REPLIES

Thank you very much, @Bryan Bende. I need to insert each message into SQL Server. The database table has a clientno field and a jsonmessage field. I think I have to use SplitRecord to get the clientno from the JSON and insert the whole JSON record into the jsonmessage field.