Decompress .json.gz inside folder in NIFI
- Labels: Apache NiFi, NiFi Registry
Created on ‎06-08-2022 07:35 PM - edited ‎06-08-2022 07:39 PM
I receive a response from InvokeHTTP structured as folder->another_folder->file.json.gz. I want to process the JSON file with the SplitJson processor, so I first need to decompress file.json.gz.
I tried using UpdateAttribute to rename the parent folder to folder.zip and then UnpackContent, but UnpackContent does not support the .gz format.
Next, instead of UnpackContent, I tried ExecuteStreamCommand with the unzip command, planning to use PutFile and then ListFile to pick up file.json.gz and pass it to SplitJson. However, the ExecuteStreamCommand step failed with this error:
Failed to write flow file to stdin due to Broken pipe: java.io.IOException: Broken pipe
How can I fix this? I set the Connection Timeout property to 120s, but the error persists. What is the best way to get file.json.gz, decompress it, and pass it to the SplitJson processor?
Created ‎06-08-2022 08:16 PM
@nada ,
Could you share a sample of the InvokeHTTP response and also the flow that you currently have?
Cheers,
André
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created ‎06-09-2022 03:24 AM
the InvokeHTTP response:
from the queue list:
after I download:
Now I have a flow file in the nonzero status queue from ExecuteStreamCommand as follows
the whole cycle:
Created ‎06-09-2022 10:33 AM
You can use the CompressContent processor to decompress gzip files.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.16.2/org.apach...
Set "Mode" to "decompress", "Compression Format" to "gzip", and "Update Filename" to "true".
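For anyone who wants to see what that configuration amounts to outside NiFi, here is a minimal Python sketch of the same idea: gunzip the content and drop the trailing ".gz" from the filename. This is only an illustration, not NiFi's implementation; the function name decompress_gzip and the sample payload are made up for the example.

```python
import gzip
import json
from typing import Tuple


def decompress_gzip(content: bytes, filename: str) -> Tuple[bytes, str]:
    """Gunzip the content and strip a trailing '.gz' from the filename.

    Hypothetical helper: it mimics the effect of decompressing gzip data
    with filename updating enabled, nothing more.
    """
    data = gzip.decompress(content)
    if filename.endswith(".gz"):
        filename = filename[: -len(".gz")]
    return data, filename


# Made-up sample payload standing in for file.json.gz
sample = gzip.compress(json.dumps([{"id": 1}, {"id": 2}]).encode("utf-8"))

data, name = decompress_gzip(sample, "file.json.gz")
records = json.loads(data)  # plain JSON, ready to be split into records
```

After this step the content is plain JSON, which is the form SplitJson needs.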
If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.
Thank you,
Matt
Created ‎06-12-2022 06:43 PM
Hi Matt,
My incoming flowfile is essentially a folder containing another folder, which contains a .json.gz file (folder->another_folder->file.json.gz), but the CompressContent processor expects the incoming file itself to be a .gz file.
How can I decompress recursively so that the .json.gz inside the subfolder is decompressed? Or how can I unpack only the top-level folder without touching its subfolders and files?
Created ‎06-22-2022 04:13 PM
@nada ,
Please check this solution: https://community.cloudera.com/t5/Community-Articles/Decompressing-nested-ZIP-files-in-NiFi/ta-p/346...
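To illustrate the general "unpack the outer container, then decompress what is inside" idea, here is a rough Python sketch. It assumes the outer container is a ZIP archive, which this thread does not confirm, and the helper name gunzip_nested_entries is hypothetical. It is not the method from the linked article, just the same two-step pattern outside NiFi.

```python
import gzip
import io
import zipfile
from typing import Dict


def gunzip_nested_entries(outer_zip: bytes) -> Dict[str, bytes]:
    """Return {path without '.gz': decompressed bytes} for each .gz entry.

    Assumption: outer_zip holds a ZIP archive whose folders contain
    gzip-compressed files, e.g. folder/another_folder/file.json.gz.
    """
    results: Dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(outer_zip)) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that is not gzip-named
            if info.is_dir() or not info.filename.endswith(".gz"):
                continue
            compressed = archive.read(info)
            results[info.filename[: -len(".gz")]] = gzip.decompress(compressed)
    return results
```

In NiFi terms, the first step corresponds to unpacking the archive into one flowfile per entry and the second to decompressing each resulting .gz flowfile.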
Cheers,
André
