Member since: 11-12-2018
Posts: 182
Kudos Received: 175
Solutions: 29
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1430 | 08-05-2022 10:44 PM
 | 1304 | 07-30-2022 04:37 PM
 | 3024 | 07-29-2022 07:50 PM
 | 1478 | 07-29-2022 06:51 PM
 | 606 | 07-09-2022 04:37 PM
04-19-2023
02:38 PM
@skasireddy Can you please make sure you have copied hbase-site.xml from the remote HBase cluster to /etc/spark/conf/yarn-conf/ or /etc/spark/conf/ on the Edge node from which you are running your Spark application?
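For reference, a minimal sketch of staging that file on the Edge node (the HBase host name and source path below are illustrative, not from the original thread):
# scp root@hbase-master.example.com:/etc/hbase/conf/hbase-site.xml /etc/spark/conf/yarn-conf/hbase-site.xml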
08-05-2022
10:44 PM
Steps for creating a JDBC Hive interpreter: As the keytab location in CDP is not consistent and changes as services are restarted, we should copy the keytab to a consistent location. As we use the proxyuser option with Hive Beeline, the zeppelin user must be configured to allow impersonation of other users. Follow the config steps below to create a JDBC interpreter in Zeppelin.
- Copy the keytab from the current process directory:
# cp $(ls -1drt /var/run/cloudera-scm-agent/process/*-ZEPPELIN_SERVER | tail -1)/zeppelin.keytab /var/tmp
# chown zeppelin:zeppelin /var/tmp/zeppelin.keytab
- Configure core-site to allow proxyuser for zeppelin: CM UI > HDFS > Configuration > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
- Restart the required services (the Hadoop and hive_on_tez services must be restarted).
- Configure the interpreter in Zeppelin with the additional properties below:
hive.driver org.apache.hive.jdbc.HiveDriver
hive.proxy.user.property hive.server2.proxy.user
hive.url jdbc:hive2://xxxxxxxxxxxx.com:2181/default;principal=hive/_HOST@XXXXXX.XXXXX.COM;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2
hive.user hive
zeppelin.jdbc.keytab.location /var/tmp/zeppelin.keytab
zeppelin.jdbc.principal zeppelin/xxxxxxxxxxxxx.com@XXXXXX.XXXXX.COM
- Make sure the hive.url, keytab, and principal values match your environment.
- Create a notebook/paragraph and verify user impersonation and Hive access:
%jdbc(hive)
select current_user()
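As an additional check, you can verify that the copied keytab contains the expected zeppelin principal before wiring it into the interpreter (using the path from the copy step above):
# klist -kt /var/tmp/zeppelin.keytab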
08-02-2022
06:44 PM
Hi @paulo_klein Apache Zeppelin on Cloudera Data Platform supports the following interpreters:
- JDBC (supports Hive, Phoenix)
- OS Shell
- Markdown
- Livy (supports Spark, Spark SQL, PySpark, PySpark3, and SparkR)
- AngularJS
As you would like to create Hive tables using Zeppelin, you can use the JDBC interpreter to access Hive. The %jdbc interpreter supports access to Apache Hive data and connects to Hive via Thrift. For more details, you can refer to the documentation describing how to use the Apache Zeppelin JDBC interpreter to access Apache Hive.
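As an illustration, a minimal Zeppelin paragraph that creates a Hive table through the %jdbc interpreter might look like this (the table name and columns are placeholders):
%jdbc(hive)
CREATE TABLE IF NOT EXISTS demo_table (id INT, name STRING)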
08-02-2022
06:34 PM
@Asim- For JDBC as well, you need HWC for managed tables. An example sketch for Spark2 follows below; as mentioned earlier, for Spark3 there is no way other than HWC to connect to Hive ACID tables from Apache Spark, and HWC is not yet a supported feature for Spark 3.2 / CDS 3.2 in CDP 7.1.7. Marking this thread closed; if you have any issues related to external tables, kindly start a new Support-Questions thread for better tracking of the issue and documentation. Thanks
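For reference, a minimal Spark2 + HWC sketch once spark-shell is launched with the HWC assembly jar on its classpath (the table name is a placeholder):
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()   // build an HWC session from the active SparkSession
hive.executeQuery("SELECT * FROM some_acid_table LIMIT 5").show()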
07-30-2022
04:37 PM
1 Kudo
Hi @paulo_klein, you can refer to the documentation on configuring TLS/SSL encryption manually for Zeppelin. Yes, you can move the Zeppelin server to another host, but make sure the SSL certificate is configured correctly; the documentation below may be helpful for further troubleshooting.
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/security-how-to-guides/topics/cm-security-how-to-obtain-server-certs-tls.html
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/security-how-to-guides/topics/cm-security-how-to-renew-certs-tls.html
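After moving the server, one quick way to confirm the new host presents the expected certificate chain (the host name is a placeholder, and use whatever TLS port your Zeppelin is configured with):
# openssl s_client -connect zeppelin-host.example.com:8886 -showcerts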
07-30-2022
09:59 AM
@Asim- Run CREATE, UPDATE, DELETE, INSERT, and MERGE statements in this way:
hive.executeUpdate("INSERT INTO table_name (column1, column2,...) VALUES (value1, value2,...)")
For more details, you can refer to the HWC read and write operations documentation. Other than HWC, there is no other way to connect to Hive ACID tables from Apache Spark; as mentioned earlier, we expect this feature to be released in our upcoming CDS 3.3 release.
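Putting it together, a minimal spark-shell sketch of such an update through HWC (the table name and values are placeholders):
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()   // HWC session backed by the active SparkSession
hive.executeUpdate("INSERT INTO demo_acid (id, name) VALUES (1, 'test')")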
07-29-2022
07:50 PM
Hi @Asim- Hive Warehouse Connector (HWC) securely accesses Hive-managed tables (ACID tables) from Spark. You need HWC to query Apache Hive managed tables from Apache Spark. As of now, HWC supports Spark2 in CDP 7.1.7; it is not yet a supported feature for Spark 3.2 / CDS 3.2 in CDP 7.1.7. We expect HWC for Spark3 to be included in our upcoming CDS 3.3 in CDP 7.1.8.
07-29-2022
07:08 PM
Hi @paulo_klein, To access Livy, Zeppelin requires Livy's own SSL certificate to be imported into the truststore that Zeppelin is configured with. You can refer to the documentation: Livy Interpreter for Apache Zeppelin SSL configuration
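A minimal sketch of that import (the alias, certificate file, and truststore path are placeholders for your environment):
# keytool -importcert -alias livy -file /path/to/livy-server.crt -keystore /path/to/truststore.jks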
07-29-2022
06:51 PM
1 Kudo
Hi @paulo_klein, Apache Spark by default requests a delegation token for all four services: HDFS, YARN, Hive, and HBase. The message is printed as a WARN but does no harm; it appears because no HBase jars are on the Spark classpath, so Spark is unable to obtain the HBase delegation token. To fix this, start spark-shell, pyspark, or spark-submit with --conf spark.security.credentials.hbase.enabled=false
Example:
# spark-shell --conf spark.security.credentials.hbase.enabled=false
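If you would rather apply this to all jobs instead of per command, the same property can be set cluster-wide, for example via CM UI > Spark > Configuration > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf (the exact safety-valve label may vary by CM version):
spark.security.credentials.hbase.enabled=false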