Support Questions


Running a Python script from NiFi

Expert Contributor

I would like your advice on running a Python script via NiFi. I have a script that sends a data access report (based on Apache Ranger data) to data owners every month. As it stands, it takes a bunch of arguments and asks for a password with getpass. My fellow data engineers asked me to run it from NiFi, so that we have everything scheduled together in one tool.

I've experimented with NiFi a bit, but I can't seem to get the Python script to even run in a sandbox environment (HDF 3.1.1) in an ExecuteProcess processor.


I placed the Python script in /home/marcel-jan/ranger_rapport and even did a chmod 777 on the directory and script, but NiFi says:

'Working Directory' validated against '/home/marcel-jan/ranger_rapport' is invalid because Directory doesn't exist and could not be created.

I just don't get what's going wrong there.

I have a couple of questions:

  • Am I using the right processor for this?
  • Is it possible to pass a securely stored password on from NiFi to a Python script? If not, what other steps do you take to keep things like this secure?
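On the password question: getpass expects an interactive terminal, which a script launched by NiFi will not have. A minimal sketch of one alternative follows, assuming the password reaches the script through an environment variable or on standard input. The variable name RANGER_REPORT_PASSWORD and the function read_password are hypothetical, and how NiFi would populate either channel depends on the processor and version in use.

```python
import os
import sys


def read_password(env_var="RANGER_REPORT_PASSWORD", stream=None):
    """Return a password without prompting on a terminal.

    env_var is a hypothetical variable name; pick whatever the caller sets.
    """
    # Prefer the environment variable when it is set and non-empty.
    value = os.environ.get(env_var)
    if value:
        return value

    # Otherwise read the first line of the given stream (stdin by default).
    stream = sys.stdin if stream is None else stream
    line = stream.readline().rstrip("\r\n")
    if not line:
        raise RuntimeError(f"no password found in ${env_var} or on stdin")
    return line
```

The script would then call read_password() where it currently calls getpass.getpass(). Passing the password as a command-line argument is best avoided, since arguments are visible in the process list.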


@Marcel-Jan Krijgsman
I don't have the same issue when I replicate the steps shown in your screenshots in an HDF sandbox. Are you sure you created the directory with the right name and in the right place? For example, when you connect to http://localhost:4200/ as root / hadoop and run 'll /home/marcel-jan/ranger_rapport', does it all look OK?



Expert Contributor

Well, it works, but here too I get the message that the requests module is not available. I used it to get data from the Apache Ranger REST API. It looks like I need to use NiFi to get that data and then continue from there with Python.
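If the script is started as an external command under a regular Python 3 interpreter, one way around the missing requests module is to make the REST call with the standard library instead. The sketch below assumes HTTP Basic authentication and a JSON response; the function names are hypothetical, and the URL has to be whatever Ranger endpoint the original script already calls. It would not apply as-is to an interpreter embedded in NiFi.

```python
import base64
import json
import urllib.request


def build_request(url, user, password):
    """Build a GET request with an HTTP Basic auth header (stdlib only)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )


def fetch_json(url, user, password, timeout=30):
    """Send the request and decode the JSON response body."""
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The other option is to install requests for the exact interpreter that NiFi launches, which may be a different one from the interpreter used on the command line.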


Yes, you are right, that is what I meant. I haven't played with the sandbox, but the key is to make sure that the user NiFi runs as has access and permissions to the resources you are adding to the processor.

ExecuteStreamCommand will have the same issue if the path is wrong or does not exist on the server.

I would first find out which user runs NiFi (ps -ef is your friend here). Then I would make sure that user has access to the path from the console (use 'ls -l path' from the home directory). Path here means both the path of the executable and the path of the working directory; make sure both are accessible.
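The same check can be scripted. The sketch below is meant to be run as the user that runs NiFi, and reports every parent directory that user cannot traverse, plus whether the target itself is visible and readable. The function name check_path is hypothetical. Note that home directories are often created without access for other users, in which case a chmod 777 on a subdirectory does not help, because the parent still blocks the way.

```python
import os


def check_path(path):
    """Return a list of problems the current user hits when reaching path."""
    problems = []
    path = os.path.abspath(path)

    # Collect all parent directories, from the immediate parent up to the root.
    parents = []
    current = os.path.dirname(path)
    while True:
        parents.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    # Each parent needs the execute (traverse) bit for the current user.
    for directory in reversed(parents):
        if not os.access(directory, os.X_OK):
            problems.append(f"cannot traverse {directory}")

    # Finally check the target itself.
    if not os.path.exists(path):
        problems.append(f"{path} does not exist or is not visible")
    elif not os.access(path, os.R_OK):
        problems.append(f"cannot read {path}")
    return problems
```

Calling check_path('/home/marcel-jan/ranger_rapport') under the NiFi account and printing the result should show where access stops, if anywhere. An empty list means the path is reachable for that user.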

Lastly, try to execute your script from the command line.