Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11140 | 04-15-2020 05:01 PM |
| | 7035 | 10-15-2019 08:12 PM |
| | 3076 | 10-12-2019 08:29 PM |
| | 11280 | 09-21-2019 10:04 AM |
| | 4194 | 09-19-2019 07:11 AM |
04-26-2018
06:35 AM
@Atif Tariq No problem! If the answer addressed your question, click the Accept button below to accept it. That would be a great help to community users looking to find a solution quickly for this kind of issue.
07-30-2019
02:16 PM
@Shu I wanted to know: is adding the "Validation query" really mandatory for the controller service to run, or not?
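For context, a validation query is just a lightweight statement the connection pool can run to check that a connection is still alive before handing it out. A minimal sketch of what such a query typically looks like, assuming a database that accepts a constant select (the exact statement depends on the database and is not from this thread):

```sql
-- Typical lightweight validation query for a connection pool;
-- MySQL/PostgreSQL generally accept a constant select,
-- while Oracle usually needs SELECT 1 FROM DUAL.
SELECT 1
```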
04-25-2018
01:18 AM
1 Kudo
@Raj ji
You can use the ExecuteProcess or ExecuteStreamCommand processors to pass arguments to a shell script.

ExecuteProcess processor: this processor does not need any upstream connection to trigger the script, i.e. it runs on its own based on the scheduler.

Example: I have a sample script that takes 2 command-line arguments and echoes them:

```bash
bash$ cat sample_script.sh
#!/bin/bash
echo "First arg: $1"
echo "Second arg: $2"
```

Execution in a terminal:

```bash
bash$ ./sample_script.sh hello world
First arg: hello
Second arg: world
```

1. Execution in NiFi using the ExecuteProcess processor:

Command: bash
Command Arguments: /tmp/sample_script.sh hello world (here we trigger the shell script and pass the arguments separated by spaces)
Batch Duration: No value set
Redirect Error Stream: false
Argument Delimiter: space (if the Argument Delimiter were ";", the Command Arguments would be /tmp/sample_script.sh;hello;world)

The success relation from ExecuteProcess will output the following as the content of the flowfile:

```
First arg: hello
Second arg: world
```

2. Execution in NiFi using the ExecuteStreamCommand processor:

This processor needs an upstream connection to trigger the script. We use a GenerateFlowFile processor as the trigger for the ExecuteStreamCommand script.

GenerateFlowFile configs: added two attributes, arg1 and arg2, to the flowfile.

ExecuteStreamCommand processor:

Command Arguments: ${arg1};${arg2}
Command Path: /tmp/sample_script.sh
Argument Delimiter: ;

Now we use the attributes added in the GenerateFlowFile processor and pass them to the script. Use the output stream relation from the ExecuteStreamCommand processor; the output flowfile content will be the same:

```
First arg: hello
Second arg: world
```

By using these processors you can trigger the shell script and also pass arguments to it.

If the answer helped to resolve your issue, click the Accept button below to accept it. That would be a great help to community users looking to find a solution quickly for this kind of issue.
04-24-2018
01:52 PM
@Prakhar Agrawal Connect with the "Group1_Port1" input port that is also placed inside my "Group1" process group? That is not possible, because Group_port1 is an input port inside a process group, while Root_port1 is exposed through the remote process group because it sits at the root canvas level. Root_port1 and Root_port2 are used to distribute data across NiFi instances; that is why you can see all of the root canvas input ports (Root_port1, Root_port2) in the drop-down list. Group_port1 is an input port used to transfer data into the process group. Please refer to this link to understand more about process groups and remote process groups.
04-21-2018
04:02 PM
1 Kudo
@Shu Thank you so much. The first approach worked:

```sql
INSERT INTO Target_table(col_1, col_2, col_3)
SELECT col_1, col_2, int(null) col_3 FROM Source_table;
```
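A more portable way to write the null column is an explicit cast; a minimal sketch, assuming Hive/Spark-style SQL and the same table and column names as above:

```sql
-- Equivalent insert using an explicit cast for the null column
INSERT INTO Target_table(col_1, col_2, col_3)
SELECT col_1, col_2, CAST(NULL AS INT) AS col_3
FROM Source_table;
```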
01-23-2019
10:04 PM
@Shu, I am trying to upload the above template but I am getting the error below: "Found bundle org.apache.nifi:nifi-update-attribute-nar:1.6.0 but does not support org.apache.nifi.processors.attributes.UpdateAttribute". Could you please confirm whether we need the nifi-update-attribute-nar file? In my requirement, I am joining 5 tables to retrieve incremental data based on record_create_date; data is written to these tables every second, and I need to retrieve it incrementally, with the flowfile remembering the last record_create_date it successfully pulled. In the above example, if I query e.joindate > '${stored.state}' and e.joindate > '${current.state}' (which holds the current time), it will never fetch new records, right? For the distributed cache it is asking for a Server Hostname and Port; what should the server be for this? And where am I setting the last fetched date (joindate) into ${stored.state}? Could you please clarify these doubts? Thanks, ~Sri
12-11-2018
12:06 AM
I have used the JoltTransformJSON processor to transform JSON to JSON: ConvertAvroToJSON => JoltTransformJSON => .... Here is a sample specification:

```json
[
  {
    "operation": "default",
    "spec": { "*": "XXXXX" }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "XXXXX": null,
        "*": { "@1": "&2" }
      }
    }
  }
]
```
05-02-2018
01:28 PM
Hi @Shu, you wrote that, and you are right: it is importing only one record with m1 on each run. I tried 4 sqoop loads and got 4 versions.
Does that mean a sqoop import into an HBase table will not store all the versions, whereas writes from the hbase shell will store all the versions? Is this a bug or a feature? Is Hortonworks aware of this, and what is their comment?
04-18-2018
12:02 PM
1 Kudo
@Hemu Singh For this use case you need to use the QueryRecord processor. Based on the Record Reader controller service configured, this processor executes SQL queries against the flowfile content, and the result of the SQL query becomes the content of the output flowfile, in the format specified by the Record Writer controller service.

Flow explanation:
1. GenerateFlowFile // added some test data
2. UpdateAttribute // added a schema to the flowfile
3. QueryRecord (filter columns) //
   3.1. Configure/enable the Record Reader/Writer controller services.
   3.2. Add a new property that runs a SQL WHERE query on the flowfile.
   3.3. Use the original relation to keep the file as-is, i.e. with all 100 records in it.
4. QueryRecord //
   4.1. Add two new properties that run a row_number window function (I have an id column in the flowfile): one returns the first 75 records, the other returns records 76 to 100.

first 75 records:

```sql
select * from
(select *, Row_Number() over(order by id asc) as rn from FLOWFILE) r
where r.rn <= 75
```

76 to 100 records:

```sql
select * from
(select *, Row_Number() over(order by id asc) as rn from FLOWFILE) r
where r.rn > 75 and r.rn <= 100
```

Use the two relationships above (first75records and 76to100record) for further processing. In addition, QueryRecord also supports LIMIT/OFFSET etc., so you can use either row_number or LIMIT/OFFSET to get only the desired 75 records from the flowfile. Please refer to this and this for QueryRecord processor configs and usage.
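As a rough sketch of the LIMIT/OFFSET alternative mentioned above, assuming the SQL dialect QueryRecord uses accepts LIMIT and OFFSET against the FLOWFILE table:

```sql
-- First 75 records of the flowfile
SELECT * FROM FLOWFILE LIMIT 75

-- Records 76 to 100: skip the first 75, take the next 25
SELECT * FROM FLOWFILE LIMIT 25 OFFSET 75
```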
04-18-2018
08:29 AM
Thanks @Shu, that's a very comprehensive explanation. I'll see what I can make from this. Chris.