Member since: 12-03-2016
Posts: 91
Kudos Received: 27
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 12396 | 08-27-2019 10:45 AM
 | 3521 | 12-24-2018 01:08 PM
 | 12574 | 09-16-2018 06:45 PM
 | 2774 | 12-12-2016 01:44 AM
07-27-2020
11:04 PM
@mqadri Thank you for sharing very useful information. I tried configuring Hive LLAP daemons with YARN node labeling in HDP3 by following your post, but the LLAP daemons are not running on the YARN node label; instead they are running on other nodes that are not part of the node label (there were no errors in the RM or Hive Interactive server). Does HDP3 have a different node label setup for LLAP daemons? Could you please guide me? Thanks in advance!
07-24-2020
08:09 AM
Maybe it's because of a different version (I'm using HDP and Hadoop 3), but this doesn't work as described here. In the first place, if you try to set a variable in the "hiveConf:" namespace you will get an error like this:

Error while processing statement: Cannot modify hiveConf:test at runtime

You have to use the "hivevar:" namespace for this, like:

:2> set hivevar:test=date_sub(current_date, 5);

But more importantly, Hive won't expand the variable value definition as shown here:

:2> set hivevar:test;
+-----------------------------------------+
| hivevar:test=date_sub(current_date, 5)  |
+-----------------------------------------+

So the INSERT will not be interpreted as you stated, but instead as:

INSERT INTO TABLE am_temp2 PARTITION (search_date=date_sub(current_date, 5))

and this, for some reason, is not supported in Hive and gives a compilation error:

FAILED: ParseException line 1:50 cannot recognize input near 'date_sub' '('

It would be very useful to insert data into static partitions using pre-calculated variable values like this, from functions or SELECT queries, but I still haven't found how to do this in HiveQL. As a reference, this seems to be (at least partially) related to this: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/hive-overview/content/hive_use_variables.html
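One possible workaround (a sketch, untested here; the host, script file name, and columns are made up) is to compute the value outside Hive and pass the resulting literal in with --hivevar, so the substituted text is a plain date string instead of a function call:

# compute the partition value in the shell (GNU date) and pass it as a literal
beeline -u "jdbc:hive2://myhost:10000/" \
  --hivevar search_date="$(date -d '5 days ago' +%Y-%m-%d)" \
  -f insert_am_temp2.hql

where insert_am_temp2.hql contains:

-- the variable now expands to a quoted literal, which the parser accepts
INSERT INTO TABLE am_temp2 PARTITION (search_date='${hivevar:search_date}')
SELECT col1, col2 FROM source_table;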
02-17-2020
06:29 PM
RULE:[1:$1@$0](^.*@AD\.HORTONWORKS\.COM$)s/^(.*)@AD\.HORTONWORKS\.COM$/$1/g

I am not quite sure if the brackets around (.*) are really necessary.
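To check this quickly, Hadoop ships a small helper class that applies the configured auth_to_local rules to a principal from the command line (the principal below is just an example):

hadoop org.apache.hadoop.security.HadoopKerberosName test_user@AD.HORTONWORKS.COM

It prints the short name the current rules produce, so you can verify whether dropping the brackets changes anything.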
10-10-2019
03:35 AM
This is really a nice article. Kudos to you.
10-02-2019
06:18 AM
@lvazquez maybe you can directly execute a "kinit" to submit your user's credentials to your LDAP. I managed to authenticate users from AD while the cluster is kerberized through a FreeIPA server. This is a command sample:

%sh
echo "password" | kinit foo@hortonworks.local
hdfs dfs -ls /
Found 12 items
drwxrwxrwt - yarn hadoop 0 2019-10-02 13:53 /app-logs
drwxr-xr-x - hdfs hdfs 0 2019-10-01 15:27 /apps
drwxr-xr-x - yarn hadoop 0 2019-10-01 14:06 /ats
drwxr-xr-x - hdfs hdfs 0 2019-10-01 14:08 /atsv2
drwxr-xr-x - hdfs hdfs 0 2019-10-01 14:06 /hdp
drwx------ - livy hdfs 0 2019-10-02 11:35 /livy2-recovery
drwxr-xr-x - mapred hdfs 0 2019-10-01 14:06 /mapred
drwxrwxrwx - mapred hadoop 0 2019-10-01 14:08 /mr-history
drwxrwxrwx - spark hadoop 0 2019-10-02 15:08 /spark2-history
drwxrwxrwx - hdfs hdfs 0 2019-10-01 15:31 /tmp
drwxr-xr-x - hdfs hdfs 0 2019-10-02 14:23 /user
drwxr-xr-x - hdfs hdfs 0 2019-10-01 15:14 /warehouse

I think this way is really ugly, but at least it is possible. Do not forget to change the auth_to_local rules in your HDFS configuration (hadoop.security.auth_to_local):

RULE:[1:$1@$0](.*@HORTONWORKS.LOCAL)s/@.*//
RULE:[1:$1@$0](.*@IPA.HORTONWORKS.LOCAL)s/@.*//
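A slightly less ugly variant (a sketch; the keytab path and principal are assumptions) avoids putting the password in the notebook by using a keytab:

%sh
kinit -kt /etc/security/keytabs/foo.headless.keytab foo@HORTONWORKS.LOCAL
hdfs dfs -ls /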
02-26-2019
02:12 PM
1 Kudo
Hi! We use Hive with LLAP, so "run as end user" = false. Impersonation is enabled for the Livy interpreter. We also use Ranger to manage permissions.

Services / Spark2 / Configs

Custom livy2-conf:
livy.file.local-dir-whitelist = /usr/hdp/current/hive_warehouse_connector/
livy.spark.security.credentials.hiveserver2.enabled = true
livy.spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
livy.spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
livy.spark.yarn.security.credentials.hiveserver2.enabled = true
livy.superusers = zeppelin-dwh_test

Custom spark2-defaults:
spark.datasource.hive.warehouse.load.staging.dir = /tmp
spark.datasource.hive.warehouse.metastoreUri = thrift://dwh-test-hdp-master03.COMPANY.ru:9083
spark.hadoop.hive.llap.daemon.service.hosts = @llap0
spark.hadoop.hive.zookeeper.quorum = dwh-test-hdp-master01.COMPANY.ru:2181,dwh-test-hdp-master02.COMPANY.ru:2181,dwh-test-hdp-master03.COMPANY.ru:2181
spark.history.ui.admin.acls = knox
spark.security.credentials.hive.enabled = true
spark.security.credentials.hiveserver2.enabled = true
spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
spark.sql.hive.llap = true
spark.yarn.security.credentials.hiveserver2.enabled = true

Custom spark2-hive-site-override:
hive.llap.daemon.service.hosts = @llap0

Services / HDFS / Configs

For testing, you may also set these values to an asterisk if you suspect the problem is in delegation.

Custom core-site:
hadoop.proxyuser.hive.groups = *
hadoop.proxyuser.hive.hosts = *
hadoop.proxyuser.livy.groups = *
hadoop.proxyuser.livy.hosts = *
hadoop.proxyuser.zeppelin.hosts = *
hadoop.proxyuser.zeppelin.groups = *

Zeppelin %livy2 interpreter properties (name = value):
livy.spark.hadoop.hive.llap.daemon.service.hosts = @llap0
livy.spark.jars = file:/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar
livy.spark.security.credentials.hiveserver2.enabled = true
livy.spark.sql.hive.hiveserver2.jdbc.url = jdbc:hive2://dwh-test-hdp-master03.COMPANY.ru:10000/
livy.spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
livy.spark.sql.hive.llap = true
livy.spark.submit.pyFiles = file:/usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip
livy.spark.yarn.security.credentials.hiveserver2.enabled = true
livy.superusers = livy,zeppelin
spark.security.credentials.hiveserver2.enabled = true
spark.sql.hive.hiveserver2.jdbc.url.principal = hive/_HOST@COMPANY.RU
zeppelin.livy.concurrentSQL = false
zeppelin.livy.displayAppInfo = true
zeppelin.livy.keytab = /etc/security/keytabs/zeppelin.server.kerberos.keytab
zeppelin.livy.maxLogLines = 1000
zeppelin.livy.principal = zeppelin-dwh_test@COMPANY.RU
zeppelin.livy.pull_status.interval.millis = 1000
zeppelin.livy.restart_dead_session = false
zeppelin.livy.session.create_timeout = 120
zeppelin.livy.spark.sql.field.truncate = true
zeppelin.livy.spark.sql.maxResult = 1000
zeppelin.livy.url = http://dwh-test-hdp-master02.COMPANY.ru:8999

Sample code for test:

%livy2
import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
hive.showDatabases().show(100)

Ranger audit example: [screenshot omitted]
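As an extra end-to-end check (a sketch; the database and table names are made up), you can also run a real query through HWC in the same %livy2 paragraph:

%livy2
hive.setDatabase("default")
hive.executeQuery("SELECT COUNT(*) FROM some_table").show()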
12-24-2018
01:44 PM
WARNING: when making the previous changes to the installed service.xml and rewrite.xml (under data/services/...), DO NOT create a backup copy (e.g. rewrite.xml.orig) or move the original version to a "backup" sub-folder under this path!! Knox will load ALL the XML files it finds under "/usr/hdp/current/knox-server/data/services" (weird but real!) and this will trigger many strange and confusing behaviors!! I've wasted a few hours trying to make the above work, just because of this 😞
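If you need a backup, keep it outside the Knox services tree entirely, for example (paths are illustrative; <service>/<version> stands for your actual service folder):

mkdir -p /root/knox-backups
cp /usr/hdp/current/knox-server/data/services/<service>/<version>/rewrite.xml /root/knox-backups/rewrite.xml.orig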
08-06-2018
02:50 AM
1 Kudo
After some tests and reading the official Hive documentation, I'm answering this myself. Both sources are incomplete and confusing, I guess because they mix the required configuration for Hive 0.13.x and for Hive 0.14 and later (which is what HDP 2.5.x and above use). After changing authorization to SQLStdAuth and setting "Run as end user instead of Hive user" (hive.server2.enable.doAs) to false, you have to:

In Custom hive-site: add the user you want to use as Hive administrator (for example, admin) to the default list of users with the admin role:

hive.users.in.admin.role = hive,hue,admin

In hive-site.xml (the General and Advanced hive-site sections in Ambari), check that you have the following settings:

# General section:
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
# Need to add the second class to the comma-separated list
hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly
# Advanced hive-site section:
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator

In hiveserver2-site.xml (Advanced hiveserver2-site in Ambari):

hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
hive.security.authorization.enabled=true
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory

Note that the classes used as "authorization.manager" in hive-site and hiveserver2-site have similar names but are different: the first is SQLStdConfOnlyAuthorizerFactory and the second is SQLStdHiveAuthorizerFactory. Ambari will guide you through some of these settings once you select SQLStdAuth authorization, but this is the complete picture of what is needed. For further reference check: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
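Once this is in place, privileges are managed with SQL standard statements from a user listed in hive.users.in.admin.role (a sketch; the role, database, table, and user names are made up):

SET ROLE ADMIN;
CREATE ROLE analysts;
GRANT SELECT ON TABLE mydb.mytable TO ROLE analysts;
GRANT ROLE analysts TO USER alice;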
01-18-2018
03:02 AM
Just in case this saves time for other people: the configuration included with HDP 2.5.x and Ambari 2.5 is not compatible with using Ranger Tagsync with SSL, so there is no "Advanced ranger-tagsync-policymgr-ssl" section or anything like that in the Ranger (0.6.0) configuration from Ambari. The first response above refers to the parameters in the file ranger-tagsync-policymgr-ssl.xml included with Ambari 2.6 (and, I believe, HDP 2.6.x). This is included in the patch discussed at the following URL: https://issues.apache.org/jira/browse/AMBARI-18874

There is an Ambari patch for HDP 2.5, but I was not able to make it work with Ambari 2.5 and Ranger 0.6.0 (included with HDP 2.5.6), so the way to make it work was to include the file /etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml from the patch above, edited by hand, and then, in the "Advanced ranger-tagsync-site" section, modify the parameter ranger.tagsync.dest.ranger.ssl.config.filename, which incredibly (and shamefully) points to a keystore!!! in the default HDP 2.5 configuration, to point to this file like this:

ranger.tagsync.dest.ranger.ssl.config.filename=/etc/ranger/tagsync/conf/ranger-tagsync-policymgr-ssl.xml

After this you will also need to change the credential store file rangertagsync.jceks to include the keys ssltruststore and sslkeystore with the correct values. There are other articles on how to do this. Hopefully in HDP 2.6 things are going to be easier 😞
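For the jceks part, if your Ranger version reads the store through the standard Hadoop credential provider API, a sketch like the following may work (each command prompts for the value; verify against the Ranger documentation for your release):

hadoop credential create sslkeystore -provider jceks://file/etc/ranger/tagsync/conf/rangertagsync.jceks
hadoop credential create ssltruststore -provider jceks://file/etc/ranger/tagsync/conf/rangertagsync.jceks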
12-09-2017
08:59 PM
1 Kudo
Finally, after writing out the problem to post the question, I was able to find the answer myself, and I will describe it in case it happens to someone else. The problem was that there are a couple of extra properties not documented here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_security/content/configure_ambari_ranger_ssl_public_ca_certs_admin.html These define the truststore used by the Ranger Admin service when connecting to other services. They are located in the "Advanced ranger-admin-site" section and should be changed to point to your system truststore (including the CA certificate used to sign the Hadoop service certificates).

So, in order to make HTTPS and SSL work for Ranger Admin and the Ranger plugins in both directions, you have to set all the following fields correctly, pointing to the proper keystore (including the private key) or truststore (including the signing CA or the certificate of the service you are going to connect to):

In Ranger -> Advanced ranger-admin-site:

ranger.https.attrib.keystore.file = /etc/security/serverKeys/keystore.jks
ranger.service.https.attrib.keystore.pass = ******
... other ranger.service.https.* related properties
// Not documented in the Security manual
ranger.truststore.file = /etc/security/serverKeys/truststore.jks
ranger.truststore.password = *******

Also in Ranger -> Advanced ranger-admin-site (this seems to be the same property as above, so I suspect they come from different software versions and only one is necessary, but both are mentioned in the documentation, so who knows?):

ranger.service.https.attrib.keystore.file = /etc/security/serverKeys/keystore.jks

In Service (HDFS/YARN) -> Advanced ranger-hdfs-policymgr-ssl (also set the properties in Advanced ranger-hdfs-plugin-properties to match the certificate common name):

// Keystore with the client certificate cn=hadoopclient,...
xasecure.policymgr.clientssl.keystore = /etc/security/clientKeys/hadoopclient.jks
...
xasecure.policymgr.clientssl.truststore = /etc/security/clientKeys/truststore.jks
...
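A quick way to confirm that a truststore actually contains the signing CA is to list it with keytool (a standard JDK tool; adjust the path to your store):

keytool -list -v -keystore /etc/security/serverKeys/truststore.jks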