03-11-2020
06:10 AM
Thx, D! It works at Ranger v2.0 from new CDP Data Center, BareMetal version! Regards, Caseiro.
09-20-2019
04:21 AM
Hello @dvillarreal, Thank you for the script files, makes life easy! Just a small edit: The proc.psql file was missing one SQL statement @line 96:

```
delete from x_user_module_perm where user_id = x_portal_user_id;
```

We were facing the below issue while trying the commands from psql file manually:

```
ERROR: update or delete on table "x_portal_user" violates foreign key constraint "x_user_module_perm_fk_userid" on table "x_user_module_perm"
DETAIL: Key (id)=(19) is still referenced from table "x_user_module_perm"
```

Because of the above blocker, the script was not deleting the users from the DB. Once we added the missing SQL statement and ran the deleteRangerUser.sh we were able to use it with successful results.
05-17-2019
06:50 PM
1 Kudo
Customers have asked me about wanting to review Ranger audit archive logs stored on HDFS, as the UI only shows the last 90 days of data using Solr infra. I decided to approach the problem using Zeppelin/Spark for a fun example.

1. Prerequisites - Zeppelin and Spark2 installed on your system, as well as Ranger with Ranger audit logs being stored in HDFS. Create a policy in Ranger for HDFS to allow your zeppelin user to read and execute recursively for the /ranger/audit directory.

2. Create your notebook in Zeppelin and create some code like the following example:

```
%spark2.spark
// when() and col() used below live in this package
import org.apache.spark.sql.functions.{col, when}

// --Specify service and date if you wish
//val path = "/ranger/audit/hdfs/20190513/*.log"
// --Be brave and map the whole enchilada
val path = "/ranger/audit/*/*/*.log"
// --read in the json and drop any malformed json
val rauditDF = spark.read.option("mode", "DROPMALFORMED").json(path)
// --print the schema to review and show me top 20 lines.
rauditDF.printSchema()
rauditDF.show(20, false)
// --Do some spark sql on the data and look for denials (result = 0)
println("sparksql--------------------")
rauditDF.createOrReplaceTempView(viewName = "audit")
val readAccessDF = spark.sql("SELECT reqUser, repo, access, action, evtTime, policy, resource, reason, enforcer, result FROM audit where result='0'")
  // --add a readable label next to the numeric result code
  .withColumn("new_result", when(col("result") === "1", "Allowed").otherwise("Denied"))
readAccessDF.show(20, false)
```

3. Output should look something like

```
path: String = /ranger/audit/*/*/*.log
rauditDF: org.apache.spark.sql.DataFrame = [access: string, action: string ... 23 more fields]
root
|-- access: string (nullable = true)
|-- action: string (nullable = true)
|-- additional_info: string (nullable = true)
|-- agentHost: string (nullable = true)
|-- cliIP: string (nullable = true)
|-- cliType: string (nullable = true)
|-- cluster_name: string (nullable = true)
|-- enforcer: string (nullable = true)
|-- event_count: long (nullable = true)
|-- event_dur_ms: long (nullable = true)
|-- evtTime: string (nullable = true)
|-- id: string (nullable = true)
|-- logType: string (nullable = true)
|-- policy: long (nullable = true)
|-- reason: string (nullable = true)
|-- repo: string (nullable = true)
|-- repoType: long (nullable = true)
|-- reqData: string (nullable = true)
|-- reqUser: string (nullable = true)
|-- resType: string (nullable = true)
|-- resource: string (nullable = true)
|-- result: long (nullable = true)
|-- seq_num: long (nullable = true)
|-- sess: string (nullable = true)
|-- tags: array (nullable = true)
| |-- element: string (containsNull = true)
sql
readAccessDF: org.apache.spark.sql.DataFrame = [reqUser: string, repo: string ... 9 more fields]
+--------+------------+------------+-------+-----------------------+------+-------------------------------------------------------------------------------------+----------------------------------+----------+------+----------+
|reqUser |repo |access |action |evtTime |policy|resource |reason |enforcer |result|new_result|
+--------+------------+------------+-------+-----------------------+------+-------------------------------------------------------------------------------------+----------------------------------+----------+------+----------+
|dav |c3205_hadoop|READ_EXECUTE|execute|2019-05-13 22:07:23.971|-1 |/ranger/audit/hdfs |/ranger/audit/hdfs |hadoop-acl|0 |Denied |
|zeppelin|c3205_hadoop|READ_EXECUTE|execute|2019-05-13 22:10:47.288|-1 |/ranger/audit/hdfs |/ranger/audit/hdfs |hadoop-acl|0 |Denied |
|dav |c3205_hadoop|EXECUTE |execute|2019-05-13 23:57:49.410|-1 |/ranger/audit/hiveServer2/20190513/hiveServer2_ranger_audit_c3205-node3.hwx.local.log|/ranger/audit/hiveServer2/20190513|hadoop-acl|0 |Denied |
|zeppelin|c3205_hive |USE |_any |2019-05-13 23:42:50.643|-1 |null |null |ranger-acl|0 |Denied |
|zeppelin|c3205_hive |USE |_any |2019-05-13 23:43:08.732|-1 |default |null |ranger-acl|0 |Denied |
|dav |c3205_hive |USE |_any |2019-05-13 23:48:37.603|-1 |null |null |ranger-acl|0 |Denied |
+--------+------------+------------+-------+-----------------------+------+-------------------------------------------------------------------------------------+----------------------------------+----------+------+----------+
```

4. You can also proceed to run SQL on the audit view information if you so desire.

5. You may need to fine-tune your Spark interpreter in Zeppelin to meet your needs, like SPARK_DRIVER_MEMORY, spark.executor.cores, spark.executor.instances, & spark.executor.memory. It helped to see what was happening by tailing the Zeppelin log for Spark.

```
tailf zeppelin-interpreter-spark2-spark-zeppelin-cluster1.hwx.log
```
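A follow-up to the notebook in the post above, as an added example that is not part of the original post: it summarises denials per user and service from Ranger audit records, using the field names from the printed schema. The records in `sampleJson` are constructed, and the names `sampleJson`, `sample`, `denied` and `denialSummary` are assumptions made for this example.

```scala
// Added example: per-user, per-service denial summary for Ranger audit logs.
// Runs in a Zeppelin %spark2.spark paragraph or spark-shell (Spark 2.2+),
// where `spark` is the pre-created SparkSession.
import org.apache.spark.sql.functions._
import spark.implicits._

// Constructed sample records (not real audit data); the last line is
// deliberately truncated to show DROPMALFORMED discarding it.
val sampleJson = Seq(
  """{"reqUser":"alice","repo":"cl1_hadoop","access":"READ","enforcer":"hadoop-acl","evtTime":"2019-05-13 22:07:23.971","policy":-1,"resource":"/data/finance","result":0}""",
  """{"reqUser":"alice","repo":"cl1_hadoop","access":"READ","enforcer":"hadoop-acl","evtTime":"2019-05-13 22:09:01.100","policy":-1,"resource":"/data/hr","result":0}""",
  """{"reqUser":"alice","repo":"cl1_hadoop","access":"READ","enforcer":"ranger-acl","evtTime":"2019-05-13 22:15:40.002","policy":12,"resource":"/data/public","result":1}""",
  """{"reqUser":"bob","repo":"cl1_hive","access":"USE","enforcer":"ranger-acl","evtTime":"2019-05-13 23:42:50.643","policy":-1,"resource":"default","result":0}""",
  """{"reqUser":"bob","repo":"cl1_hive","access":"SELECT","enforcer":"ranger-acl","evtTime":"2019-05-13 23:50:11.500","policy":7,"resource":"default/t1","result":1}""",
  """{"reqUser":"carol","repo":"cl1_hive","access":"SELECT","enforcer":"ranger-acl","evtTime":"2019-05-13 23:55:00.000","policy":7,"resource":"default/t1","result":1}""",
  """{"reqUser":"truncated","repo":"cl1_hive" """
).toDS()

// Same read options as the notebook; in practice use rauditDF instead.
val sample = spark.read.option("mode", "DROPMALFORMED").json(sampleJson)

// result = 0 is a denial, result = 1 an allow (as in the post's query).
val denied = col("result") === 0

val denialSummary = sample
  .groupBy("reqUser", "repo")
  .agg(
    count(lit(1)).as("events"),
    sum(when(denied, 1).otherwise(0)).as("denials"),
    // when() without otherwise() yields null for allowed rows, and
    // countDistinct/min/max skip nulls, so only denied rows are counted.
    countDistinct(when(denied, col("resource"))).as("denied_resources"),
    // evtTime is a string; its yyyy-MM-dd HH:mm:ss.SSS layout sorts
    // chronologically, so min/max give first and last denial.
    min(when(denied, col("evtTime"))).as("first_denial"),
    max(when(denied, col("evtTime"))).as("last_denial"))
  .filter(col("denials") > 0)
  .withColumn("denial_rate", round(col("denials") / col("events"), 2))
  .orderBy(desc("denials"))

denialSummary.show(false)
```

To run it over the archived logs, replace `sample` with the `rauditDF` DataFrame from the notebook; denied rows whose `resource` is null are not counted in `denied_resources`.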
12-10-2018
07:29 PM
Thank you @Robert Levas @dvillarreal Yes, I am using a newer version of ambari and also tried FreeIPA since openLDAP didn't seem to work at all with kerberos. I followed the exact steps as on https://community.hortonworks.com/articles/59645/ambari-24-kerberos-with-freeipa.html - everything seems to be working fine but fails when kerberizing the cluster. I get the following error: Also, important to note that while I get the following error:

```
DNS query for data2.testhdp.com. A failed: The DNS operation timed out after 30.0005660057 seconds
DNS resolution for hostname data2.testhdp.com failed: The DNS operation timed out after 30.0005660057 seconds
Failed to update DNS records.
Missing A/AAAA record(s) for host data2.testhdp.com: 172.31.6.79.
Missing reverse record(s) for address(es): 172.31.6.79.
```

I installed server as:

```
ipa-server-install --domain=testhdp.com \
    --realm=TESTHDP.COM \
    --hostname=ldap2.testhdp.com \
    --setup-dns \
    --forwarder=8.8.8.8 \
    --reverse-zone=3.2.1.in-addr.arpa.
```

and the clients on each node as

```
ipa-client-install --domain=testhdp.com \
    --server=ldap2.testhdp.com \
    --realm=TESTHDP.COM \
    --principal=hadoopadmin@TESTHDP.COM \
    --enable-dns-updates
```

Also, that post doing the following step:

```
echo "nameserver ldap2.testhdp.com" > /etc/resolv.conf
```

my yum is broken and I need to revert to make it work. Do you guys have any idea about it? I thought that there is no need of DNS as I have resolution of *.testhdp.com in my hostfile on all nodes.
10-15-2018
03:23 PM
My pleasure! @Jasper
03-19-2018
01:21 PM
@dvillarreal oops, I have missed those. Thanks for pointing me to the policy change/update traces/audits.
05-18-2017
12:42 PM
Hi @Qi Wang, Yes, it should be fixed in the next maintenance release. In the meantime, please use the workaround provided. Thanks,
03-30-2017
05:54 PM
1 Kudo
Yes Surya, it was one-way SSL.
03-06-2017
09:11 PM
@badr bakkou This would probably be best answered if you submitted as a new question. Provide the gateway.log & gateway-audit.log outputs, topology, and lastly the configuration string you are using with its associated output. Best regards, David
06-19-2018
08:14 PM
How to assign privileges to a group when it is created?