Created 08-01-2022 03:28 AM
Hello,
I would like to create Hive tables using Zeppeling and I found the the below document that use HWC:
I'm not sure what means "livy configuration file"?
Is "%livy2" a new Interpreter? May I create a new Interpreter with only indicated configurations?
Thanks!
Created on 08-05-2022 10:44 PM - edited 08-05-2022 10:45 PM
Steps for creating jdbc hive interpreter: As the keytab location is not consistent in CDP and changes as services are restarted, we should copy keytab to a consistent location. As we use proxyuser option with hive beeline, zeppelin user must be configured to allow impersonate of another user. Follow the below config steps to create a JDBC interpreter in Zeppelin.
- Copy keytab from the current process directory :
# cp $(ls -1drt /var/run/cloudera-scm-agent/process/*-ZEPPELIN_SERVER | tail -1)/zeppelin.keytab /var/tmp
# chown zeppelin:zeppelin /var/tmp/zeppelin.keytab
- Configure core-site to allow proxyuser for zeppelin
CM UI > HDFS > Configurations > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
- Restart required services(must restart Hadoop and hive_on_tez service)
- Configure interpreter in zeppelin with below additional properties :
hive.driver org.apache.hive.jdbc.HiveDriver
hive.proxy.user.property hive.server2.proxy.user
hive.url jdbc:hive2://xxxxxxxxxxxx.com:2181/default;principal=hive/_HOST@XXXXXX.XXXXX.COM;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2
hive.user hive
zeppelin.jdbc.keytab.location /var/tmp/zeppelin.keytab
zeppelin.jdbc.principal zeppelin/xxxxxxxxxxxxx.com@XXXXXX.XXXXX.COM
- Make sure hive.url/keytab/principal configs are set as per the environment.
- Create notebook/paragraph and verify user impersonation and hive access
%jdbc(hive)
select current_user()
Created 08-01-2022 04:18 PM
Any update?
Created 08-02-2022 06:44 PM
Hi @paulo_klein Apache Zeppelin on Cloudera Data Platform supports the following interpreters:
JDBC (supports Hive, Phoenix)
OS Shell
Markdown
Livy (supports Spark, Spark SQL, PySpark, PySpark3, and SparkR)
AngularJS
As you would like to create Hive tables using Zeppelin, you can use the JDBC interpreter to access Hive.
The %jdbc the interpreter supports access to Apache Hive data. The interpreter connects to Hive via Thrift. For more details, you can refer to this documentation which describes how to use the Apache Zeppelin JDBC interpreter to access Apache Hive.
Created 08-03-2022 05:11 AM
Hello @jagadeesan ,
Thanks for your reply.
On my fresh new CDP 7.1.7 cluster I havn't the interpreter %jdbc. I only have %livy, %angular and %md intepreters installed.
Can I create new one? Could you share the correct way to create the %jdbc interpreter on CDP?
I would like to load into the hive a spark dataframe. It's will be possible with this %jdbc interpreter?
Tks!
Created on 08-05-2022 10:44 PM - edited 08-05-2022 10:45 PM
Steps for creating jdbc hive interpreter: As the keytab location is not consistent in CDP and changes as services are restarted, we should copy keytab to a consistent location. As we use proxyuser option with hive beeline, zeppelin user must be configured to allow impersonate of another user. Follow the below config steps to create a JDBC interpreter in Zeppelin.
- Copy keytab from the current process directory :
# cp $(ls -1drt /var/run/cloudera-scm-agent/process/*-ZEPPELIN_SERVER | tail -1)/zeppelin.keytab /var/tmp
# chown zeppelin:zeppelin /var/tmp/zeppelin.keytab
- Configure core-site to allow proxyuser for zeppelin
CM UI > HDFS > Configurations > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
- Restart required services(must restart Hadoop and hive_on_tez service)
- Configure interpreter in zeppelin with below additional properties :
hive.driver org.apache.hive.jdbc.HiveDriver
hive.proxy.user.property hive.server2.proxy.user
hive.url jdbc:hive2://xxxxxxxxxxxx.com:2181/default;principal=hive/_HOST@XXXXXX.XXXXX.COM;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2
hive.user hive
zeppelin.jdbc.keytab.location /var/tmp/zeppelin.keytab
zeppelin.jdbc.principal zeppelin/xxxxxxxxxxxxx.com@XXXXXX.XXXXX.COM
- Make sure hive.url/keytab/principal configs are set as per the environment.
- Create notebook/paragraph and verify user impersonation and hive access
%jdbc(hive)
select current_user()
Created 08-09-2022 05:23 AM
Hello @jagadeesan ,
We did all the configurations, but now we are geting the error below.
May I create some entry for trustStore?
%jdbc2(hive)
select current_user()
Error: Also, could not send response: org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Created 08-09-2022 05:36 AM
My currently interpreter configuration:
jdbc2 %jdbc2
Properties
name value
common.max_count | 1000 |
default.completer.schemaFilters | |
default.completer.ttlInSeconds | 120 |
default.driver | org.postgresql.Driver |
default.password | |
default.precode | |
default.splitQueries | false |
default.statementPrecode | |
default.url | jdbc:postgresql://localhost:5432/ |
default.user | gpadmin |
hive.driver | org.apache.hive.jdbc.HiveDriver |
hive.url | jdbc:hive2://SERVER01.LOCAL.NET:2181,SERVER02.LOCAL.NET:2181,SERVER03.LOCAL.NET:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 |
hive.user | hive |
zeppelin.jdbc.auth.type | |
zeppelin.jdbc.concurrent.max_connection | 10 |
zeppelin.jdbc.concurrent.use | true |
zeppelin.jdbc.interpolation | false |
zeppelin.jdbc.keytab.location | /var/tmp/zeppelin.keytab |
zeppelin.jdbc.maxConnLifetime | -1 |
zeppelin.jdbc.maxRows | 1000 |
zeppelin.jdbc.principal | zeppelin/SERVER01.LOCAL.NET@LOCAL.NET |
Created 08-12-2022 04:49 AM
To solve "unable to find valid certification path to requested target" I just import the certificate to java and restart the Zeppeling Server.
### LINUX LIST CERT
cd /usr/lib/jvm/java-11-openjdk-11.0.15.0.10-2.el8_6.x86_64/bin
./keytool -list -keystore /usr/lib/jvm/java-11-openjdk-11.0.15.0.10-2.el8_6.x86_64/lib/security/cacerts
### LINUX IMPORT CERT
./keytool --import --alias keystore_cloudera --file /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_cacerts.pem -keystore /usr/lib/jvm/java-11-openjdk-11.0.15.0.10-2.el8_6.x86_64/lib/security/cacerts