Member since
05-04-2019
9
Posts
0
Kudos Received
0
Solutions
06-04-2019
09:52 AM
What if I wanted to put my Parquet file into S3 instead of HDFS?
06-04-2019
09:50 AM
Hi, I am trying to convert my JSON to a Parquet file and then put it into S3. My JSON is 2 GB. I am converting it to Avro first and then to Parquet. However, my ConvertRecord processor stays pending and does not show any activity. Do you know why? Here is my ConvertRecord config:
Labels:
- Apache NiFi
05-08-2019
07:19 PM
@Matt Clarke There are 20 initial files, each around 2 GB. For MergeContent I left all the default values, such as 'bins number' = 5 and so on. Which MergeContent parameters do you recommend tuning? I ask because I noticed that it takes time for the queue in front of the merge to fill up.
05-08-2019
09:37 AM
I tried exactly what you did. The merge processors do not seem to wait for all FlowFiles (FF) to be queued; they merge as the FF arrive, which confuses the next merge. I verified this by starting one merge at a time and waiting for all FF to be queued first. It works that way, but it is not deployable.
05-07-2019
02:39 PM
@Matt Clarke Hi, I tried what you suggested, but with 3 splits. However, the last merge tells me that it 'cannot merge because the fragment.index is not integer value'. Is there anything wrong in my flow? (last merge, Update attr)
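To make the error above more concrete, here is a rough Python simulation of a defragment-style merge. It is not NiFi code: the `defragment` function and the dict-based FlowFile model are hypothetical, and only the attribute names `fragment.identifier`, `fragment.index` and `fragment.count` come from NiFi. It shows why an index that does not parse as an integer (for example an unresolved expression) would stop the merge.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Join fragments per fragment.identifier, ordered by fragment.index.

    Hypothetical model: each FlowFile is a dict with 'attributes'
    (string keys and string values) and 'content' (a string).
    """
    bins = defaultdict(dict)
    for ff in flowfiles:
        attrs = ff["attributes"]
        raw_index = attrs.get("fragment.index", "")
        # Attributes are plain strings, so the index has to parse as an integer.
        try:
            index = int(raw_index)
        except ValueError:
            raise ValueError(
                f"fragment.index is not an integer value: {raw_index!r}"
            )
        count = int(attrs["fragment.count"])
        bins[attrs["fragment.identifier"]][index] = (count, ff["content"])

    merged = {}
    for identifier, fragments in bins.items():
        expected = next(iter(fragments.values()))[0]
        if len(fragments) != expected:
            # Incomplete bin: in this sketch it is simply skipped.
            continue
        merged[identifier] = "".join(
            content for _idx, (_cnt, content) in sorted(fragments.items())
        )
    return merged
```

In this model, every fragment of one original file must carry the same `fragment.identifier`, a numeric `fragment.index`, and the total `fragment.count`, so it may be worth checking what the Update attr step actually writes into those three attributes.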
05-06-2019
06:47 AM
I need to be honest: I might be the only one, but it is still not clear to me how the Wait/Notify pattern works. Do you have any other resources?
05-05-2019
12:44 PM
I am having trouble with my dataflow in NiFi, surely because I did not understand how Wait/Notify works. Here is my basic dataflow:
- I list my S3 bucket containing zip files.
- I extract my JSON file from the zip. Each JSON is 2 GB up to 3 GB.
- I then split my JSON several times, to avoid 'Out of memory', to obtain single-record FlowFiles (here comes the trouble).
- I want to flatten each FlowFile.
- I want to merge my flattened FF to re-obtain the original JSON.
I know I cannot do this with 'Merge Content', as my JSON goes through several split processors. Can you please explain how to use Wait/Notify? Does the Wait automatically merge the FlowFiles? I did not understand that. I also looked at this post, but it is still not clear. Thanks
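For the "flatten each FlowFile" step, a minimal Python sketch of what flattening a single nested JSON record could look like is shown here. The `flatten` helper, the dot separator and the sample record are assumptions for illustration; the post does not say how the flattening is done in the actual flow, and lists are left untouched in this sketch.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars (and lists, in this sketch) are kept as they are.
            flat[full_key] = value
    return flat


# Hypothetical single-record FlowFile content after the splits.
sample = {"id": 1, "user": {"name": "x", "address": {"city": "y"}}}
flat_sample = flatten(sample)
```

Because each record is flattened independently, this step does not by itself need to know about the other fragments; only the later merge has to bring the records back together.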
Labels:
- Apache NiFi