
Nifi ranger plugin not writing logs to hdfs


Contributor

What I'm trying to achieve:

Integrate Ranger and NiFi so that NiFi authorization is governed by Ranger. I want Ranger audit logs to be flushed to HDFS as they are generated (or with minimal delay).

I do not really need NiFi in secure mode at this point, but since NiFi over HTTP did not seem to use Ranger policies, I ended up enabling HTTPS.

I do not want the audit logs to go to Solr; they should go only to HDFS.

My setup:

HDP 2.6 sandbox on one CentOS instance & NiFi 4 on another CentOS instance

My Ranger is not SSL enabled, i.e. it can be accessed over HTTP.

My NiFi is SSL enabled. Note: my NiFi is standalone, i.e. it is not administered/installed via Ambari.

I also configured Ranger as an authorizer in NiFi. I created a generous policy in Ranger granting access.

Initially I also had Solr as an audit destination (apart from HDFS), but later I set it to false in "ranger-nifi-audit.xml".

I do have hdfs as a destination configured in "ranger-nifi-audit.xml".

My observations:

NiFi is being governed by Ranger policies, and the Ranger plugin also writes audit logs to the spool directory.

But the audit log is not flushed to HDFS, even after waiting a couple of hours.

Somehow I got the impression that reducing "xasecure.audit.destination.hdfs.file.rollover.sec" would get faster updates to HDFS; I found that is wrong. It simply ends up creating lots of empty files.
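From what I can tell so far, the two HDFS-side timing properties in my ranger-nifi-audit.xml play different roles. This is my interpretation, not verified against the plugin source:

```xml
<!-- Starts a NEW audit file once per day; lowering this only creates
     more (possibly empty) files, it does not flush pending events -->
<property>
  <name>xasecure.audit.destination.hdfs.file.rollover.sec</name>
  <value>86400</value>
</property>
<!-- My understanding: this is the interval at which buffered events
     are actually flushed to the currently open HDFS file -->
<property>
  <name>xasecure.audit.hdfs.config.destination.flush.interval.seconds</name>
  <value>900</value>
</property>
```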

I checked the spool directory and it does have logs captured. I also periodically see messages like the following in nifi-app.log:

2017-11-06 18:59:56,901 INFO [org.apache.ranger.audit.queue.AuditBatchQueue0] o.a.r.audit.provider.BaseAuditHandler Audit Status Log: name=nifi.async.batch.hdfs, interval=01:00.011 minutes, events=10, succcessCount=10, totalEvents=125, totalSuccessCount=125
2017-11-06 18:59:56,902 INFO [org.apache.ranger.audit.queue.AuditBatchQueue0] o.a.r.a.destination.HDFSAuditDestination Flushing HDFS audit. Event Size:5

But even after waiting a couple of hours, I do not see any audit log dumped into HDFS; just an empty file sitting there.

I do not see any errors about the connection to HDFS in nifi-app.log.

Edit: my ranger-nifi-audit.xml is as below. I recently added the log4j entries as per

https://community.hortonworks.com/questions/118294/can-nifi-ranger-plugin-audit-log-to-hdfs.html

But even that does not seem to capture logs through log4j.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr</name>
    <value>false</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
    <value>/tmp/audit/solr/spool</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.urls</name>
    <value>http://192.168.1.12:6083/solr/ranger_audits</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.dir</name>
    <value>hdfs://sandbox.hortonworks.com:8020/ranger/audit</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.subdir</name>
    <value>%app-type%/%time:yyyyMMdd%</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.filename.format</name>
    <value>%app-type%_ranger_audit_%hostname%.log</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.file.rollover.sec</name>
    <value>86400</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.batch.filespool.dir</name>
    <value>/tmp/audit/hdfs/spool</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.batch.interval.ms</name>
    <value>3000</value>
  </property>
  <property>
    <name>xasecure.audit.hdfs.config.destination.flush.interval.seconds</name>
    <value>900</value>
  </property>
  <property>
    <name>xasecure.audit.hdfs.is.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.hdfs.is.async</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.hdfs.async.max.queue.size</name>
    <value>1048576</value>
  </property>
  <property>
    <name>xasecure.audit.hdfs.async.max.flush.interval.ms</name>
    <value>30000</value>
  </property>
  <property>
    <name>xasecure.audit.log4j.is.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.log4j.is.async</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.log4j.logger</name>
    <value>ranger.audit</value>
  </property>

</configuration>
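One thing I suspect about the log4j destination: the "ranger.audit" logger named above still needs an appender to write to. Since standalone NiFi uses logback, I believe the corresponding entry would go in conf/logback.xml; a sketch of what that might look like (the appender name and file path are my own invention, not verified):

```xml
<!-- Hypothetical addition to NiFi's conf/logback.xml: route the
     "ranger.audit" logger (named in ranger-nifi-audit.xml above)
     to its own rolling file -->
<appender name="RANGER_AUDIT" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>logs/ranger_nifi_audit.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>logs/ranger_nifi_audit_%d.log</fileNamePattern>
    <maxHistory>5</maxHistory>
  </rollingPolicy>
  <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
    <pattern>%msg%n</pattern>
  </encoder>
</appender>
<!-- additivity="false" keeps the audit records out of nifi-app.log -->
<logger name="ranger.audit" level="INFO" additivity="false">
  <appender-ref ref="RANGER_AUDIT"/>
</logger>
```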

Question:

1) What might I be doing wrong, such that the empty file is generated but no logs are added to it? Also, what do I need to do to instruct the nifi-ranger plugin to flush logs to HDFS with minimal/no delay?

2) Is it possible to use unsecured NiFi to connect to unsecured Ranger and still have it work as normal?

Requesting your help; I have been stuck on this for quite long and am not finding any articles/references addressing the issue and questions I'm facing.

Note: My current setup is experimental only (with a view to take it forward) at this point.

4 REPLIES 4

Re: Nifi ranger plugin not writing logs to hdfs

Contributor

I also tried setting the value of "xasecure.audit.destination.hdfs.queue" to "None", but the audit log is still not written to HDFS.

Re: Nifi ranger plugin not writing logs to hdfs

Contributor

@Ramesh Mani I saw your response on Ranger audit on another thread. Please can you help? I have been stuck on this for a couple of days.

Re: Nifi ranger plugin not writing logs to hdfs

Expert Contributor
@Prashant Chaudhari

You mentioned that Ranger audit logs to the spool directory; do you mean that it just puts them in the local spool and they are not getting into HDFS?

Please check whether you see any exceptions in the NiFi log related to the Ranger audit.

Re: Nifi ranger plugin not writing logs to hdfs

Contributor

Thanks @Ramesh Mani for your help. Yes, I mean it stores them in the local spool directory in proper JSON format, but is unable to push them to HDFS.

As no errors were logged in nifi-app.log, I tried with log level DEBUG, which showed me this error message:

"File /ranger/audit/nifi/20171106/nifi_ranger_audit_hdp2503.nodelogix.com.1.log could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation."

I tried the solutions suggested in "https://stackoverflow.com/questions/5293446/hdfs-error-could-only-be-replicated-to-0-nodes-instead-of-1", but to no avail. In fact, for some strange reason, I am now not able to access my NiFi (my NiFi is running on a CentOS 7 virtual machine in VirtualBox, while I access it from a browser on my host system).
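For what it's worth, that replication error points at HDFS/the datanode rather than at Ranger itself. These are the standard HDFS diagnostics I plan to run on the sandbox (assumes the hdfs CLI is on PATH there; paths match my config above):

```shell
# Check datanode count, capacity, and remaining space
hdfs dfsadmin -report
# Free space as seen by HDFS
hdfs dfs -df -h /
# Confirm the audit directory exists and who owns it
hdfs dfs -ls /ranger/audit
# A test write from the NiFi host rules out network problems:
# the datanode data port must be reachable, not just the namenode
echo test | hdfs dfs -put - /tmp/replication_check \
  && hdfs dfs -rm /tmp/replication_check
```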