Member since: 01-03-2017
Posts: 181
Kudos Received: 44
Solutions: 24
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1800 | 12-02-2018 11:49 PM
 | 2409 | 04-13-2018 06:41 AM
 | 1991 | 04-06-2018 01:52 AM
 | 2287 | 01-07-2018 09:04 PM
 | 5588 | 12-20-2017 10:58 PM
09-20-2017
03:28 PM
Hi @Bruno Lavoie, SPNEGO works with the Solr Admin UI. The following URL explains how to configure the browser to use SPNEGO: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/enabling_browser_access_spnego_web_ui.html. With a valid ticket, the admin user should be able to negotiate. On a separate note, please take a close look at the security configuration.
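As a quick check outside the browser, a command-line SPNEGO request can confirm that the ticket negotiates. This is only a sketch; the principal, host name and port below are placeholders, not values from this thread:
kinit <admin-principal>
curl -k --negotiate -u : 'https://<solr-host>:<solr-port>/solr/'
If this returns the Solr page rather than a 401, the Kerberos side is fine and what remains is the browser configuration from the link above.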
09-20-2017
02:26 PM
Hi @Alvin Jin, to get kinit working you need to install krb5-workstation (CentOS and RedHat), and make sure you have updated your /etc/krb5.conf (it should have your KDC server and realm; it is best to copy it from the cluster). On your second question: the Kerberos ticket already carries your identity, hence you don't need two-way SSL on top of it (the -k option only makes curl skip server certificate verification). On the other note, if you want to go with username and password (LDAP/file-based provider), a token can be obtained using the following command:
curl -k 'https://<nifi-server>:9091/nifi-api/access/token' -H 'Accept-Encoding: gzip, deflate, br' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' -H 'Accept: */*' --data 'username=<username>&password=<password>' --compressed
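Once the token comes back, it is passed as a bearer token on later API calls. A minimal sketch, reusing the /nifi-api/flow/status endpoint shown elsewhere in this thread (host and port are placeholders):
token=$(curl -k 'https://<nifi-server>:9091/nifi-api/access/token' --data 'username=<username>&password=<password>')
curl -k 'https://<nifi-server>:9091/nifi-api/flow/status' -H "Authorization: Bearer $token"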
09-20-2017
08:14 AM
Hi @Alvin Jin, please ensure that you enable the kerberos-provider section in login-identity-providers.xml:
<provider>
<identifier>kerberos-provider</identifier>
<class>org.apache.nifi.kerberos.KerberosProvider</class>
<property name="Default Realm">NIFI.APACHE.ORG</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
Then this is a two-step process: extract a bearer token, and use that token to post the requests. Step 1 (get the token):
kinit <username-pwd or kerberos key_tab_with_princ>
token=$(curl -k -X POST --negotiate -u : https://<nifi-hostname>:9091/nifi-api/access/kerberos)
Step 2 (get the data):
curl -k -X GET 'https://<nifi-hostname>:9091/nifi-api/flow/status' -H "Authorization: Bearer $token" --compressed
09-11-2017
03:16 AM
Hi @Sanaz Janbakhsh, did you manage to install Ambari Infra or an external Solr? Unless you store the logs in Solr, you will not be able to see the audit log information in that tab. Once you install Ambari Infra, go to Ranger --> Configs --> Ranger Audit --> turn on Audit to Solr and SolrCloud (see the attached figure), then restart Ranger.
09-08-2017
08:56 AM
3 Kudos
This article will help configure Ranger audit logs to be written to a flat file. Some users don't want to use Solr, in order to reduce the hardware and software footprint; in such cases a flat file helps to write and debug the audits, and at the same time this can coexist with Solr. In NiFi, log consolidation is done by Logback, hence we need to make the following changes to the Logback configuration.
To enable the Ranger audits, in the Advanced-nifi-ranger-audit section set the following parameter values:
xasecure.audit.destination.log4j=true
xasecure.audit.destination.log4j.logger=ranger.audit
To capture the logs generated by this logger, configure Logback (the same as the nifi-app module logger). In Advanced nifi-node-logback-env, add the following content to the logback.xml template:
<appender name="RANGER_AUDIT" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${org.apache.nifi.bootstrap.config.log.dir}/ranger_nifi_audit.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
        <fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/ranger_nifi_audit_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
        <maxFileSize>100MB</maxFileSize>
        <maxHistory>30</maxHistory>
    </rollingPolicy>
    <immediateFlush>true</immediateFlush>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
    </encoder>
</appender>
<logger name="ranger.audit" level="INFO" additivity="false">
    <appender-ref ref="RANGER_AUDIT"/>
</logger>
Sample output:
[centos@projecthdfm1 nifi]$ cat ranger_nifi_audit.log
2017-09-08 03:37:47,475 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1] ranger.audit {"repoType":10,"repo":"hdf_clstr_nifi","reqUser":"aaaaaaaa","evtTime":"2017-09-08 03:37:46.699","access":"READ","resource":"/flow","resType":"nifi-resource","action":"READ","result":1,"policy":1,"enforcer":"ranger-acl","cliIP":"999.999.999.999","agentHost":"aaaaaa.bbbbb.example.com","logType":"RangerAudit","id":"0efc4a0d-f634-42c0-9616-5d8298a92892-0","seq_num":1,"event_count":1,"event_dur_ms":0,"tags":[]}
2017-09-08 03:38:41,443 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1] ranger.audit {"repoType":10,"repo":"hdf_clstr_nifi","reqUser":"admin","evtTime":"2017-09-08 03:38:39.121","access":"READ","resource":"/flow","resType":"nifi-resource","action":"READ","result":1,"policy":1,"enforcer":"ranger-acl","cliIP":"999.999.999.999","agentHost":"aaaaa.bbbbb.example.com","logType":"RangerAudit","id":"0efc4a0d-f634-42c0-9616-5d8298a92892-1","seq_num":3,"event_count":1,"event_dur_ms":0,"tags":[]}
2017-09-08 03:49:26,549 INFO [org.apache.ranger.audit.queue.AuditBatchQueue1] ranger.audit {"repoType":10,"repo":"hdf_clstr_nifi","reqUser":"someotheruser","evtTime":"2017-09-08 03:49:25.942","access":"READ","resource":"/flow","resType":"nifi-resource","action":"READ","result":0,"policy":-1,"enforcer":"ranger-acl","cliIP":"999.999.999.999","agentHost":"xxxxx.yyyy.example.com","logType":"RangerAudit","id":"0efc4a0d-f634-42c0-9616-5d8298a92892-2","seq_num":5,"event_count":1,"event_dur_ms":0,"tags":[]}
*Host names and IP addresses are masked.
09-08-2017
07:09 AM
Hi @G Sankar, to make the current flow work, you can remove the AttributesToJSON processor; the flow would then be QueryDatabaseTable --> ConvertAvroToJSON --> EvaluateJsonPath --> ReplaceText --> PutFile. EvaluateJsonPath makes sure that the results of those expressions are assigned to FlowFile attributes, and then we can replace the entire FlowFile content with the attributes. The ReplaceText parameters would be Search Value: (^.*$) and Replacement Value: ${col1},${col2},${col3} etc., where col1, col2, col3 are the FlowFile attributes extracted by EvaluateJsonPath (you may see the FlowFile attributes in provenance); a small sketch follows. On the other note, the latest version of NiFi offers the ConvertRecord processor, where you can convert straight from Avro to CSV using a controller service.
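For illustration only (the JSON paths and column names here are hypothetical, not taken from the original flow), the two processors could be configured roughly like this:
EvaluateJsonPath (Destination = flowfile-attribute):
  col1 = $.id
  col2 = $.name
  col3 = $.amount
ReplaceText (Replacement Strategy = Regex Replace):
  Search Value : (^.*$)
  Replacement Value : ${col1},${col2},${col3}
With that, an incoming record such as {"id":1,"name":"abc","amount":10.5} would leave ReplaceText as the line 1,abc,10.5.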
08-23-2017
03:37 AM
I have done a simple test with your code and it was able to complete in a few seconds; it is wise to split the work into multiple passes.
#!/bin/bash
# Build a Hive script containing one "dfs -mkdir" statement per batch, then run it in a single hive invocation.
tgetfl=/tmp/hvdir_$(date +%s)
for i in {1..125}
do
  dirs=""
  for j in {1..8000}; do
    dirs="$dirs /dirtst/d$i.$j"
  done
  #echo "$dirs"
  echo "dfs -mkdir $dirs;"   # trailing ";" terminates the Hive statement
done > $tgetfl
date
hive -f $tgetfl
date
08-22-2017
03:03 PM
Hi @pbarna, you may use Pig (the Grunt shell) or the Hive CLI and pass all the directories in one shot, which is much quicker; see the sketch below.
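For example, a single Hive CLI call can create several directories at once; the paths below are placeholders:
hive -e "dfs -mkdir -p /data/dir1 /data/dir2 /data/dir3;"
The same idea scales to thousands of paths by generating the statements into a file and running it with hive -f.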
08-22-2017
02:58 PM
Hi @nbalaji-elangovan, can you please check whether ports 88 and 749 are reachable and listening? It looks like your routing rules (for VirtualBox) or hosts entries (for the Mac) need some correction.
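As a quick check from the client side, something like the following can confirm the KDC ports are reachable (the host name is a placeholder):
nc -vz <kdc-host> 88
nc -vz <kdc-host> 749
If these fail, the port-forwarding/routing rules or the /etc/hosts entries are the first things to fix.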
08-07-2017
05:29 AM
Hi @D Mortimer, I presume the problem is converting to an RDD and processing it as chunks. Instead of that, I would consider porting the custom function into a registered UDF and applying the logic to the DataFrame you have retrieved into t, without persisting the data into memory, so that the UDF gets applied across multiple executors while the data streams through and is written back to the table (with shuffles if needed).