Created on 03-22-2017 02:35 AM - edited 09-16-2022 04:18 AM
When using Zeppelin with Livy on a kerberized CDH 5.10 cluster and trying to access Hive, I ran into this issue:
https://issues.apache.org/jira/browse/SPARK-13478
Since Hive on Spark in CDH 5.10 is supported only with Spark 1.6, not Spark 2.0, when will the fix be backported?
The fix is available for Spark 1.6.4, so one option would be to upgrade Hive on Spark to support Spark 1.6.4. What is your timeline for this?
Also, when will you provide packaging for Livy?
Thank you.
Created 03-23-2017 05:34 AM
Solved this.
Hive from Zeppelin with Livy and Spark2
Add Spark2 to the kerberized cluster:
https://www.cloudera.com/downloads/spark2/2-0.html
Adapt Livy to recognize Spark2 if required:
https://community.cloudera.com/t5/Web-UI-Hue-Beeswax/Spark-2-0-livy-server-3/td-p/48562
#build livy for spark2
mvn clean package -DskipTests -Dspark-2.0 -Dscala-2.11
#make livy hive aware
sudo ln -s /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml
#make a livy user
sudo useradd livy
#create a logs dir
mkdir logs
#change the ownership of everything
sudo chown -R livy:livy ./*
#start livy with the right config (see below)
sudo -u livy /opt/livy/bin/livy-server

livy-env.sh
===============================================
export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
export HADOOP_HOME=/opt/cloudera/parcels/CDH
export SPARK_CONF_DIR=/etc/spark/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf

livy.conf
===============================================
# What spark master Livy sessions should use.
livy.spark.master = yarn
# What spark deploy mode Livy sessions should use.
livy.spark.deployMode = cluster
# If livy should impersonate the requesting users when creating a new session.
livy.impersonation.enabled = true
# Whether to enable HiveContext in livy interpreter, if it is true hive-site.xml will be detected
# on user request and then livy server classpath automatically.
livy.repl.enableHiveContext = true
livy.server.launch.kerberos.keytab = /opt/livy/livy.keytab
livy.server.launch.kerberos.principal=livy/server.fqdn@XXX
livy.impersonation.enabled = true
livy.server.auth.type = kerberos
livy.server.auth.kerberos.keytab=/opt/livy/spnego.keytab
livy.server.auth.kerberos.principal=HTTP/server.fqdn@XXX
livy.server.access_control.enabled=true
livy.server.access_control.users=zeppelin,livy
livy.superusers=zeppelin,livy
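The livy.conf above sets livy.impersonation.enabled twice, which is harmless here because both values agree, but it is easy to end up with conflicting entries when merging settings. A minimal sketch for spotting repeated keys, assuming the file uses plain "key = value" or "key=value" lines; the function name duplicate_keys is made up for illustration and is not part of Livy:

```shell
# Print every property key that appears more than once in a
# properties-style file. Comment lines (#) and blank lines are skipped.
# duplicate_keys is a hypothetical helper, not a Livy tool.
duplicate_keys() {
  local conf_file="$1"
  grep -Ev '^[[:space:]]*(#|$)' "$conf_file" \
    | sed -E 's/^[[:space:]]*([^[:space:]=]+).*/\1/' \
    | sort | uniq -d
}

# Example (the path is an assumption based on the /opt/livy install above):
#   duplicate_keys /opt/livy/conf/livy.conf
```

An empty result means no key is repeated.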
You must also configure Zeppelin for Kerberos auth:
zeppelin.livy.keytab /opt/zeppelin/zeppelin.keytab
zeppelin.livy.principal zeppelin@XXX
zeppelin.livy.url http://host.fqdn:8998
Finally, you must configure Shiro. I used an LDAP backend with the org.apache.zeppelin.realm.LdapGroupRealm plugin to enable LDAP user group awareness (take care to set up groupOfNames, not posixGroup).
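Before debugging Zeppelin itself, it can help to confirm that the Livy endpoint accepts SPNEGO and impersonation from the command line. The following is only a sketch: it assumes curl was built with GSS-API support and that a ticket has already been obtained, and the helper names and the user "alice" are hypothetical.

```shell
# Build the JSON body for a Livy "create session" request.
# $1 = session kind (for example spark), $2 = user to impersonate.
livy_session_payload() {
  printf '{"kind": "%s", "proxyUser": "%s"}' "$1" "$2"
}

# POST the payload to Livy's /sessions endpoint using SPNEGO.
# Get a ticket first, for example:
#   kinit -kt /opt/zeppelin/zeppelin.keytab zeppelin@XXX
create_livy_session() {
  local livy_url="$1" kind="$2" proxy_user="$3"
  curl --silent --negotiate -u : \
    -H 'Content-Type: application/json' \
    -X POST -d "$(livy_session_payload "$kind" "$proxy_user")" \
    "$livy_url/sessions"
}

# Example call ("alice" is a made-up user):
#   create_livy_session http://host.fqdn:8998 spark alice
```

If the request is rejected, the Kerberos or access control settings in livy.conf are the first thing to check, since Zeppelin goes through the same endpoint.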
Created 09-12-2017 03:31 PM
Hi,
Something to be careful about: when you run "Deploy Client Configuration" on your Spark2 service, it removes the hive-site.xml symlink, or the file itself if you copied it.
I have noticed that all of these config files are in $SPARK_CONF_DIR/yarn-conf/, so I wish Livy could also load them when it starts Spark.
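One way to cope with the link disappearing is to re-create it after each client configuration deploy. This is a rough sketch, not a Cloudera-provided mechanism: the function name ensure_hive_site_link is made up, and the paths in the example are the ones used earlier in this thread.

```shell
# Re-create the hive-site.xml symlink if it is missing or points elsewhere.
# $1 = real hive-site.xml, $2 = location where Spark expects to find it.
# ensure_hive_site_link is a hypothetical helper name.
ensure_hive_site_link() {
  local src="$1" dest="$2"
  if [ ! -e "$src" ]; then
    echo "missing source: $src" >&2
    return 1
  fi
  # Nothing to do when the link already points at the right file.
  if [ -L "$dest" ] && [ "$(readlink "$dest")" = "$src" ]; then
    return 0
  fi
  # -f replaces an existing file or link, -n avoids following an old link.
  ln -sfn "$src" "$dest"
}

# Example, run as a user allowed to write to /etc/spark/conf:
#   ensure_hive_site_link /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml
```

Running it after every "Deploy Client Configuration" (or from cron) keeps the link in place without manual checks.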