I would like your advice on running a Python script via NiFi. I have a script that sends a data access report (based on Apache Ranger data) to data owners every month. As it is now, it needs a bunch of arguments and asks for a password with getpass. My fellow data engineers asked me to run it from NiFi, so that we have everything scheduled together in one tool.
I've experimented with NiFi a bit, but I can't seem to get the Python script to even run in a sandbox environment (HDF 3.1.1) in an ExecuteProcess processor.
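A side note on the getpass part: a script started by a scheduler usually has no terminal to prompt on, so the password has to come from somewhere else. Below is a minimal sketch of one option, assuming the password can be supplied in an environment variable. The variable name RANGER_REPORT_PASSWORD and the function get_password are made up for illustration and are not part of the original script.

```python
import getpass
import os
import sys


def get_password(env_var="RANGER_REPORT_PASSWORD", prompt="Password: "):
    """Return the password from the environment, or prompt if a terminal is attached.

    env_var is a hypothetical variable name chosen for this sketch.
    """
    value = os.environ.get(env_var)
    if value:
        # Non-interactive run: the scheduler supplied the password.
        return value
    if sys.stdin is not None and sys.stdin.isatty():
        # Interactive run: keep the existing getpass behaviour.
        return getpass.getpass(prompt)
    raise RuntimeError("No password in %s and no terminal to prompt on" % env_var)
```

Run by hand, the script still prompts as before. Run without a terminal, it fails with a clear message instead of hanging on the prompt.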
I placed the Python script in /home/marcel-jan/ranger_rapport and even did a chmod 777 on the directory and script, but NiFi says:
'Working Directory' validated against '/home/marcel-jan/ranger_rapport' is invalid because Directory doesn't exist and could not be created.
I just don't get what's going wrong there.
I have a couple of questions:
Well it works, but here also I get the message that the requests module is not available. I used that to get data from the Apache Ranger REST API. It looks like I need to use NiFi to get that data and then continue from there on with Python.
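One possible cause of a missing requests module (an assumption, since it depends on how the script is started) is that NiFi runs a different Python interpreter than the one where requests was installed. A small diagnostic like the sketch below, run through the same processor, shows which interpreter is in use and whether the module can be imported. The function name describe_interpreter is made up for illustration.

```python
import sys


def describe_interpreter(module_name="requests"):
    """Report which Python is running and whether module_name can be imported."""
    try:
        __import__(module_name)
        available = True
    except ImportError:
        available = False
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "module": module_name,
        "available": available,
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key in ("executable", "version", "module", "available"):
        print("%s: %s" % (key, info[key]))
```

If the executable or version printed here differs from what you get on the command line, installing requests for that interpreter may be enough.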
Yes, you are right, that is what I meant. I haven't played with the sandbox, but the key is to make sure that the user NiFi runs as has access and permissions to the resources you are adding to the processor.
ExecuteStreamCommand will have the same issue if the path is wrong or does not exist on the server.
I would first find out which user is running NiFi (ps -ef would be your friend here). Then I would make sure that this user can access the path from the console (use 'ls -l path'). Path in this case means both the path of the executable and the path of the working directory; make sure both are accessible.
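The same check can be sketched in Python. The idea is that a user needs execute (search) permission on every directory along the path, not only on the last one, so a chmod 777 on the script directory does not help if the home directory above it is closed to other users. This is a minimal sketch to run as the NiFi user; check_path is a made-up helper name.

```python
import os


def check_path(path):
    """Return (component, problem) pairs for the first part of path
    that the current user cannot reach."""
    problems = []
    path = os.path.abspath(path)
    # Collect the path and all of its parents, up to the filesystem root.
    parts = []
    current = path
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    # Walk from the root down and stop at the first inaccessible component.
    for component in reversed(parts):
        if not os.path.exists(component):
            problems.append((component, "does not exist or is not visible"))
            break
        if os.path.isdir(component) and not os.access(component, os.X_OK):
            problems.append((component, "no execute (search) permission"))
            break
    return problems
```

An empty list means every directory on the way is reachable for the user running the check. Anything else names the first component that blocks access.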
Lastly, try to execute your script from the command line.