Member since: 11-16-2015
Posts: 892
Kudos Received: 650
Solutions: 245
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5668 | 02-22-2024 12:38 PM |
 | 1389 | 02-02-2023 07:07 AM |
 | 3087 | 12-07-2021 09:19 AM |
 | 4205 | 03-20-2020 12:34 PM |
 | 14163 | 01-27-2020 07:57 AM |
06-13-2018
02:41 PM
Thanks for the response. A small correction: from the custom code, the message is published to a Kafka queue, and from there I pick up the JSON message and pass it to the EvaluateJsonPath processor. EvaluateJsonPath now extracts two values, a source path and a destination path. As you said, I can use FetchS3Object to get the file from S3, but how should I pass the source path to the FetchS3Object processor, and then how should I pass the destination path to the PutFile processor? Could you explain briefly? Right now my flow looks like the attached screenshot. PFA...
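For illustration only (these attribute names are placeholders, not from my actual flow): if EvaluateJsonPath is set to Destination = flowfile-attribute and writes the two values to attributes named, say, source.path and destination.path, I picture the downstream properties looking roughly like this:
FetchS3Object -> Object Key: ${source.path}
PutFile -> Directory: ${destination.path}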
05-29-2018
09:09 PM
Hi guys, thanks so much for the fast support, and thanks to the Matts team, @Matt Burgess and @Matt Clarke. I finally understood how the processor works: it emits a flow file with no payload, and the file details, such as path and filename, are carried in the flow file attributes. FetchHDFS then uses those attributes to fetch the corresponding files. Kind regards, Paul
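As a small illustration of the pattern (this is just the standard List/Fetch wiring, nothing specific to my flow), FetchHDFS reads those attributes through Expression Language, e.g. its "HDFS Filename" property can be set to:
${path}/${filename}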
05-31-2018
10:19 AM
@Mike Wong Does ListFile exhibit the same behavior, or does it list your file correctly?
The fact that the logs show the processor yielding tells me it found no work to do (meaning no files to list). It yields so that it does not consume CPU non-stop looking for work that does not exist.
Did you check your properties for leading or trailing whitespace? Did you try removing the "\" from your file filter?
Thanks, Matt
12-06-2018
03:01 PM
Is there a way (without using ExecuteScript) to modify a process group variable from within a processor?
05-27-2018
06:51 AM
It works fine now. I changed the ReplaceText replacement value to: ${hive.ddl} location '${absolute.hdfs.path}'
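For illustration, with a made-up table and path: the hive.ddl attribute produced upstream (e.g. by ConvertAvroToORC) holds a partial statement along the lines of CREATE EXTERNAL TABLE IF NOT EXISTS my_table (id INT, name STRING) STORED AS ORC, so after the ReplaceText substitution the flow file content becomes something like:
CREATE EXTERNAL TABLE IF NOT EXISTS my_table (id INT, name STRING) STORED AS ORC location '/data/orc/my_table'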
05-21-2018
02:16 PM
For approach #1, you could use the FlattenJson processor; you'll likely want to set the Separator property to "_" rather than the default "." since Hive adds the table name to each column in a ResultSet. For approach #2, you could have a single-column table (a column of type String) and then query it with get_json_object (example here). Alternatively, if you can map all the types (including complex types like array, list, struct, etc.) to a Hive table definition, you could use a JSON SerDe to write the data (example here).
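As a rough sketch of approach #2 (table and column names are made up), the single-column table and a get_json_object query could look something like:
CREATE TABLE raw_json (json_str STRING);
SELECT get_json_object(json_str, '$.customer.name') AS customer_name FROM raw_json;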
10-22-2018
02:44 PM
You have to go to the path where NiFi is installed; that will be the HDF folder, which will not be found under root or under your user. Follow these steps to find it:
Open a terminal and run this command: sudo docker exec -it sandbox-hdf /bin/bash
Then go to this path: /usr/hdf/3.1.2.0-7/nifi
You will see the NiFi-related folders. Now create your input and output directories:
mkdir inputdir
mkdir outputdir
That's it! Use these directories in your GetFile and PutFile processors.
Note: the folder permissions should be set for the nifi user, and the same goes for the input data. Happy learning! Let me know if you run into any issues.
04-17-2018
02:01 PM
I am working on NIFI-4456, which will allow the JSON reader/writer to support the "one JSON per line" format as well as the "JSON array" format for input and output, so you will be able to read in one JSON per line and output a JSON array using ConvertRecord (or any other record-aware processor). In the meantime, you can use the following crude script in an ExecuteGroovyScript processor to process your entire file (avoiding the Split/Merge pattern); it should get you what you want:
def flowFile = session.get()
if(!flowFile) return
flowFile = session.write(flowFile, {inStream, outStream ->
    // open the JSON array
    outStream.write('['.bytes)
    inStream.eachLine { line, i ->
        // eachLine counts from 1, so prepend a comma before every line except the first
        if(i > 1) outStream.write(','.bytes)
        outStream.write(line.bytes)
    }
    // close the JSON array
    outStream.write(']'.bytes)
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)
The script just adds array brackets around the whole document and separates the lines with commas. I went with the crude version because it doesn't need to load the entire input content into memory. If you need more control over the JSON objects, you could iterate over the lines (still with eachLine), use JsonSlurper to deserialize each string into a JSON object, add each object to an array, and then use JsonOutput to serialize the whole thing back to a string. However, that involves having the entire content in memory and could get unwieldy for large input flow files.
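For completeness, a rough (untested) sketch of that in-memory JsonSlurper/JsonOutput alternative might look like this:
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
def flowFile = session.get()
if(!flowFile) return
def slurper = new JsonSlurper()
def objects = []
flowFile = session.write(flowFile, {inStream, outStream ->
    // parse each line into a JSON object and collect it (the entire content ends up in memory)
    inStream.eachLine { line, i ->
        objects << slurper.parseText(line)
    }
    // serialize the collected objects back out as a single JSON array
    outStream.write(JsonOutput.toJson(objects).bytes)
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)
This gives you real JSON objects to manipulate before re-serializing, at the cost of holding the whole array in memory.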
04-26-2018
02:58 PM
You only need one session per execution of the script. Using that session, you can get, create, remove, and transfer as many flow files as you want. If you get or create a flow file from the session, then you must transfer or remove it before the end of the script, or else you will get a "Transfer relationship not specified" error. Also, you can only transfer each flow file once; if you attempt to transfer the same flow file more than once, you will get the error you describe above.
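As a minimal sketch (not from this thread, just to illustrate the lifecycle in an ExecuteScript/ExecuteGroovyScript body):
// one session per execution; it can handle several flow files
def incoming = session.get()      // may be null if nothing is queued
def extra = session.create()      // a brand-new flow file
if (incoming != null) {
    // anything obtained from the session must be transferred or removed exactly once
    session.transfer(incoming, REL_SUCCESS)
}
session.transfer(extra, REL_SUCCESS)
// transferring 'extra' or 'incoming' a second time at this point would raise the
// "already marked for transfer" style error mentioned above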
04-02-2018
12:34 PM
1 Kudo
I'm not sure if there is a function in JSONPath to retain the escaped quotes or not, but you could use UpdateAttribute (between EvaluateJsonPath and ReplaceText) along with the escapeJson function to "re-introduce" the quotes, by setting the "observation" attribute to the following value: ${observation:escapeJson()}
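For example (made-up value): if the observation attribute currently holds He said "hello", then ${observation:escapeJson()} evaluates to He said \"hello\", so the quotes stay escaped when ReplaceText substitutes the value back into the JSON.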