I have a 3-node HDF cluster (in the AWS cloud) with NiFi running across the cluster. I want to use a NiFi processor to trigger a shell/Python script on a remote machine (on-premises) to perform the actions written in that script.
NiFi offers a variety of processors that execute commands or scripts locally to the NiFi installation. To execute a script on a machine remote from NiFi, you would still need a local script, called by the processor, that then goes and performs the task on the remote machine.
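A minimal sketch of such a local wrapper, assuming key-based ssh access is already set up for the OS user that runs NiFi. The function names, the user `etl`, the host `onprem.example.com`, and the remote script path are all hypothetical placeholders, not details from this thread.

```python
import subprocess


def build_ssh_command(user, host, remote_command, key_file=None):
    """Return the argument list that runs remote_command on host over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting, since nothing
    # can answer an interactive prompt when NiFi launches the process.
    args = ["ssh", "-o", "BatchMode=yes"]
    if key_file:
        # -i takes the path of a private key file (hypothetical here).
        args += ["-i", key_file]
    args += [f"{user}@{host}", remote_command]
    return args


def run_remote(user, host, remote_command, key_file=None, timeout=300):
    """Run the remote command and return (exit code, stdout, stderr)."""
    result = subprocess.run(
        build_ssh_command(user, host, remote_command, key_file),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr
```

A processor such as ExecuteProcess would then call a small script that invokes something like `run_remote("etl", "onprem.example.com", "python /opt/etl/extract.py")` and prints the returned stdout.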
@Dan Chaffelson The Python script (actually a fairly big one, ~500 lines) pulls data from MS SQL Server and dumps it in CSV format to a specified location on that machine.
Actually, the whole use case is extracting tabular data from various SQL sources and putting it into HDFS (HDP). We are using NiFi to orchestrate the process, hence we want to instantiate and schedule the whole flow with NiFi only.
If you want to extract the data with HDF back into the dataflow, this is not going to be the best way to do it, but the steps I outlined above would probably work. With the requirement you describe, I would instead suggest implementing the data movement and conversion sections of your Python script using a combination of SQL and TEXT processors, running only the remaining bits of logic in Python with the ExecuteScript processor, and then using HDF again to write the result to the right location.
Remotely executing a script and pulling the output is a more brittle process and doesn't leverage the flexibility and power of NiFi. I suspect you would spend more time troubleshooting workarounds than using the default processors that achieve the same things.
Hey @Dan Chaffelson
Is there a way to connect to the remote machine with a password instead of a key? I was able to ssh (with a password) from the backend. Can I do it from the "ExecuteProcess" processor or any other processor? Thanks.
@Praveen Singh These processors just replicate whatever you could do by piping together bash commands, more or less - so if you want to use an unsecured cleartext password in the processor, you could try sshpass or similar shell tools to enable it. Basically, if you can solve the problem by piping together bash commands, you should be able to do it in the processor. But I wouldn't recommend using cleartext passwords; keys are far more secure.
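For illustration only, here is a rough sketch of the sshpass route, assuming `sshpass` is installed on the NiFi host. It uses the `-e` option, which makes sshpass read the password from the `SSHPASS` environment variable so it does not appear in the argument list. The function names are hypothetical, and the password still has to be stored in cleartext somewhere, so the caveat about keys applies.

```python
import os
import subprocess


def build_sshpass_command(user, host, remote_command):
    """Argument list for password-based ssh through sshpass."""
    # -e tells sshpass to take the password from the SSHPASS variable.
    return ["sshpass", "-e", "ssh", f"{user}@{host}", remote_command]


def run_with_password(user, host, remote_command, password):
    """Run the remote command, passing the password via the environment."""
    # Copy the current environment and add SSHPASS for the child only.
    env = dict(os.environ, SSHPASS=password)
    return subprocess.run(
        build_sshpass_command(user, host, remote_command),
        env=env,
        capture_output=True,
        text=True,
    )
```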
Thanks @Matt Clarke, it worked fine. I was able to execute the Python script.
Is there a way to get back the output data generated by the Python code, which is saved to a directory on the remote machine?
Standard out from your script is written to the content of the FlowFile generated by the ExecuteProcess NiFi processor. So perhaps just tweaking your script to write to standard out rather than to a file on disk is all you need to do.
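As a rough sketch of that tweak, the part of the script that writes the CSV could target standard out instead of a file. The function name and the sample rows below are made up for illustration; the real script would pass in the rows it fetched from SQL Server.

```python
import csv
import sys


def write_rows_as_csv(header, rows, out=None):
    """Write a header and data rows as CSV to out (stdout by default)."""
    if out is None:
        out = sys.stdout
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def log(message):
    """Send diagnostics to stderr so they stay out of the CSV on stdout."""
    print(message, file=sys.stderr)
```

Calling `write_rows_as_csv(["id", "name"], [(1, "a"), (2, "b")])` prints the CSV on standard out, where ssh relays it back to the calling process. Keeping log messages on stderr only keeps them out of the FlowFile content while Redirect Error Stream is left at false.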
@Matt Clarke I am also trying to do a POC using ExecuteProcess to start a .sh file on a remote server. Right now I am just trying to move a file on the remote server from one location to another, but I am getting an error (Host Key Verification Failed) on the NiFi processor. I am able to do the same via a terminal on the host machine (on which NiFi is installed and running). What could be the issue here? Help!
ssh -t user@hostname 'mv ~/folder1/test.txt ~/folder2/' <-- I am able to do this successfully on terminal.
ExecuteProcess Properties:
Command: ssh
Command Arguments: -i "~/.ssh" user@hostname 'mv ~/folder1/test.txt ~/folder2/'
Batch Duration: No value set
Redirect Error Stream: false
Working Directory: No value set
Argument Delimiter: No value set
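A possible direction, offered as a guess rather than a confirmed diagnosis: "Host key verification failed" generally means ssh could not match the server's host key against the known_hosts file it consulted, and NiFi may run as a different OS user than the one used in the terminal. Separately, `-i` expects the path of a private key file, not the `~/.ssh` directory. The sketch below builds an ssh argument list with both paths spelled out. The function name and every path in the example are hypothetical.

```python
def build_noninteractive_ssh(user, host, remote_command, key_file, known_hosts_file):
    """Build ssh arguments with explicit key and known_hosts locations."""
    return [
        "ssh",
        # Fail rather than prompt; no terminal is attached under NiFi.
        "-o", "BatchMode=yes",
        # Only connect if the host key is already in the known_hosts file.
        "-o", "StrictHostKeyChecking=yes",
        # Point ssh at a known_hosts file the NiFi user can read.
        "-o", f"UserKnownHostsFile={known_hosts_file}",
        # Private key file, not the directory that contains it.
        "-i", key_file,
        f"{user}@{host}",
        remote_command,
    ]
```

For example, `build_noninteractive_ssh("user", "hostname", "mv ~/folder1/test.txt ~/folder2/", "/home/nifi/.ssh/id_rsa", "/home/nifi/.ssh/known_hosts")` yields a list that can be passed to `subprocess.run`. The known_hosts file would need to contain the remote host's key already, for instance by connecting once from a terminal as the same user that runs NiFi.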