Member since 03-01-2017
62 Posts
7 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3512 | 02-07-2019 02:28 PM |
01-24-2019 02:33 PM
We're using HDP-2.6.5. This type doesn't appear in the GUI when creating tags. Does it have to be created via the REST API? If so, is there any documentation on this?
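For what it's worth, Atlas does accept type definitions over REST at the v2 endpoint /api/atlas/v2/types/typedefs. A minimal sketch that only builds the JSON body for a tag (classification) with one attribute — the host, port, tag name, attribute name and attribute type below are placeholders, and the actual POST is left as a comment:

```python
import json

# Placeholder host/port; Atlas listens on 21000 by default.
ATLAS_URL = "http://atlas-host:21000/api/atlas/v2/types/typedefs"

def classification_typedef(name, attr_name, attr_type):
    """Build an Atlas v2 typedefs payload defining one tag (classification)
    with a single attribute of the given typeName."""
    return {
        "enumDefs": [], "structDefs": [], "entityDefs": [],
        "classificationDefs": [{
            "name": name,
            "description": "created via the REST API, not the GUI",
            "attributeDefs": [{
                "name": attr_name,
                "typeName": attr_type,  # e.g. "date", "int", or an enum name
                "isOptional": True,
                "cardinality": "SINGLE",
                "valuesMinCount": 0,
                "valuesMaxCount": 1,
                "isUnique": False,
                "isIndexable": False,
            }],
        }],
    }

payload = classification_typedef("pii_tag", "expiry", "date")
# POST it with e.g.:
#   curl -u admin -H 'Content-Type: application/json' \
#        -X POST http://atlas-host:21000/api/atlas/v2/types/typedefs \
#        -d @payload.json
print(json.dumps(payload)[:60])
```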
10-29-2018 03:50 PM
Are there any plans, or maybe already existing features, that allow working hyperlinks in attributes? I'm asking because I would like to link from Atlas to the official Mavim data catalog of our company. I can put the URL in an attribute now, but of course it will still just be a string.
Labels:
- Apache Atlas
10-09-2018 11:06 AM
Well, it works, but here too I get the message that the requests module is not available. I used requests to get data from the Apache Ranger REST API. It looks like I need to use NiFi to fetch that data and then continue from there with Python.
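For anyone hitting the same wall: since requests isn't available to ExecuteScript's bundled Jython, the standard library can often stand in. A sketch of an authenticated request to the Ranger public API — written here with Python 3's urllib.request (Jython 2.7 ships the equivalent as urllib2), with the Ranger host and credentials as placeholders:

```python
import base64
import urllib.request  # Jython 2.7: use urllib2 instead

# Placeholder host/port; Ranger's admin UI commonly listens on 6080.
RANGER_URL = "http://ranger-host:6080/service/public/v2/api/policy"

def ranger_request(url, user, password):
    """Build a GET request for the Ranger public REST API with HTTP basic
    auth, using only the standard library (no `requests` needed)."""
    req = urllib.request.Request(url)
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req

req = ranger_request(RANGER_URL, "admin", "secret")
# urllib.request.urlopen(req)  # uncomment against a live Ranger instance
print(req.get_header("Authorization")[:6])
```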
10-09-2018 09:36 AM
So when you connect via http://localhost:4200/ you get a different hostname than when you SSH to sandbox-host.hortonworks.com, and my script wasn't there. I found out that NiFi runs inside a Docker container in the virtual machine, so in that sense it is indeed a different machine. I placed my script in a directory via http://localhost:4200/ and now NiFi is able to find it.
10-09-2018 08:01 AM
It's almost as if I'm on a different host in my PuTTY session. But I honestly have only one sandbox running.
10-09-2018 08:00 AM
Am I right to assume that by dropping into the console you mean starting a PuTTY session to look around on the server? I've done that, but there isn't a nifi user I can su to; I had expected there to be one. I've also tried a GetFile processor to simply pick up the .py script, with the same message: the directory doesn't exist. I get the same "Working directory doesn't exist" problem with the ExecuteStreamCommand processor.
10-09-2018 07:25 AM
Hi @Sammy Presaud I've tried this. I rewrote the Python code so that it won't need arguments and pasted it into the ExecuteScript property. It then says "cannot use module requests". So I looked into that, and it turns out you can't install missing libraries (https://community.hortonworks.com/questions/53645/cannot-use-numpy-or-scipy-in-python-in-nifi-execut.html). So I think that's off the table.
10-08-2018 03:07 PM
I think I'm getting why NiFi can't find the Python script. Could it be because the node name according to Ambari is sandbox-hdf.hortonworks.com, while when I type hostname at the prompt I get sandbox-host.hortonworks.com?
10-08-2018 09:38 AM
1 Kudo
I would like your advice on running a Python script via NiFi. I have a script that sends a data access report (based on Apache Ranger data) to data owners every month. As it is now it needs a bunch of arguments and asks for a password with getpass. My fellow data engineers asked me to run it from NiFi, so we have everything scheduled together in one tool.
I've experimented with NiFi a bit, but I can't seem to get the Python script to even run in a sandbox environment (HDF 3.1.1) in an ExecuteProcess processor. I placed the Python script in /home/marcel-jan/ranger_rapport and even did a chmod 777 on the directory and script, but NiFi says: 'Working Directory' validated against '/home/marcel-jan/ranger_rapport' is invalid because Directory doesn't exist and could not be created.
I just don't get what's going wrong there.
I have a couple of questions:
Am I using the right processor for this?
Is it possible to pass a securely stored password on from NiFi to a Python script? If not, what other steps do you take to keep things like this secure?
Labels:
- Apache NiFi
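On the password question: since a scheduled run can't answer a getpass prompt, one common workaround is to read the password from an environment variable and only fall back to getpass when run interactively. A sketch, assuming the NiFi processor can pass environment variables to the child process (ExecuteProcess supports this via dynamic properties); RANGER_REPORT_PASSWORD is a made-up variable name for this sketch:

```python
import os
import getpass

def get_password():
    """Read the password non-interactively when launched by a scheduler
    (e.g. NiFi setting an environment variable for the child process),
    falling back to a getpass prompt when run by hand.
    RANGER_REPORT_PASSWORD is a hypothetical name for this sketch."""
    pw = os.environ.get("RANGER_REPORT_PASSWORD")
    if pw is None:
        pw = getpass.getpass("Ranger password: ")
    return pw

# Simulate NiFi having set the variable, so the demo stays non-interactive.
os.environ["RANGER_REPORT_PASSWORD"] = "demo-only"
print(get_password())
```

For anything beyond a demo, the value would come from a protected store (NiFi's sensitive properties, a credentials file with tight permissions, or a secrets manager) rather than a plain processor property.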
09-06-2018 02:19 PM
1 Kudo
So you can basically whittle it down to:

{"isEnabled":true,"service":"OPS_hadoop","name":"Test: /data/test2","policyType":0,"description":"Added automatically via the Ranger REST API","isAuditEnabled":true,"resources":{"path":{"values":["/data/test2"],"isExcludes":false,"isRecursive":true}},"policyItems":[{"accesses":[{"type":"read","isAllowed":true},{"type":"write","isAllowed":true},{"type":"execute","isAllowed":true}],"groups":["developers"],"conditions":[],"delegateAdmin":false}],"denyPolicyItems":[],"allowExceptions":[],"denyExceptions":[],"dataMaskPolicyItems":[],"rowFilterPolicyItems":[]}

And this also works:

curl -iv -u 203631 -H "Content-Type: application/json" -X POST https://servername:6801/gateway/ui/ranger/service/public/v2/api/policy -d '{"isEnabled":true,"service":"OPS_hadoop","name":"Test: /data/test2","policyType":0,"description":"Added automatically via the Ranger REST API","isAuditEnabled":true,"resources":{"path":{"values":["/data/test2"],"isExcludes":false,"isRecursive":true}},"policyItems":[{"accesses":[{"type":"read","isAllowed":true},{"type":"write","isAllowed":true},{"type":"execute","isAllowed":true}],"groups":["developers"],"conditions":[],"delegateAdmin":false}],"denyPolicyItems":[],"allowExceptions":[],"denyExceptions":[],"dataMaskPolicyItems":[],"rowFilterPolicyItems":[]}'

Which means I can write a couple of these commands to prepare for a rollout. Cool!
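The rollout can be scripted by templating that JSON body instead of hand-editing copies of the curl command. A Python sketch that rebuilds the same minimal policy body for a list of paths — the service name, example paths and group are taken from the post above, the extra path is a made-up example:

```python
import json

def hdfs_policy(service, path, groups):
    """Build the minimal Ranger v2 policy body from the post, parameterised
    over service, HDFS path and groups so it can be reused for many paths."""
    return {
        "isEnabled": True,
        "service": service,
        "name": "Test: %s" % path,
        "policyType": 0,
        "description": "Added automatically via the Ranger REST API",
        "isAuditEnabled": True,
        "resources": {"path": {"values": [path],
                               "isExcludes": False,
                               "isRecursive": True}},
        "policyItems": [{
            "accesses": [{"type": t, "isAllowed": True}
                         for t in ("read", "write", "execute")],
            "groups": groups,
            "conditions": [],
            "delegateAdmin": False,
        }],
        "denyPolicyItems": [], "allowExceptions": [], "denyExceptions": [],
        "dataMaskPolicyItems": [], "rowFilterPolicyItems": [],
    }

# "/data/test3" is a hypothetical extra path for the rollout.
for path in ["/data/test2", "/data/test3"]:
    body = json.dumps(hdfs_policy("OPS_hadoop", path, ["developers"]))
    # Each `body` can be fed to the curl command above with -d "$body".
    print(body[:40])
```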