Support Questions

____r · ‎03-22-2017

When using Zeppelin with Livy on a kerberized CDH 5.10 cluster and trying to access Hive, I ran into this issue:

https://issues.apache.org/jira/browse/SPARK-13478

In CDH 5.10, Hive on Spark is supported only on Spark 1.6, not on Spark 2.0. When will the fix be backported?

The fix is available for Spark 1.6.4, so one option would be to upgrade Hive on Spark to support Spark 1.6.4. What is your timeline to do this?

Also, when will you provide packaging for Livy?

Thank you.

____r · ‎03-23-2017

Solved this.

Hive from Zeppelin with Livy & Spark2

Add spark2 to the kerberized cluster

https://www.cloudera.com/downloads/spark2/2-0.html

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_addon_services.html#concept_qb...

Adapt Livy to recognize Spark2 if required

https://community.cloudera.com/t5/Web-UI-Hue-Beeswax/Spark-2-0-livy-server-3/td-p/48562

# Build Livy for Spark2
mvn clean package -DskipTests -Dspark-2.0 -Dscala-2.11

# Make Livy Hive-aware
sudo ln -s /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml

# Create a livy user
sudo useradd livy

# Create a logs directory (run from the Livy install directory; /opt/livy in this setup)
mkdir logs

# Give the livy user ownership of everything in the install directory
sudo chown -R livy:livy ./*

# Start Livy with the right config (see below)
sudo -u livy /opt/livy/bin/livy-server

livy-env.sh
===============================================
export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
export HADOOP_HOME=/opt/cloudera/parcels/CDH
export SPARK_CONF_DIR=/etc/spark/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf

livy.conf
===============================================
# What spark master Livy sessions should use.
livy.spark.master = yarn

# What spark deploy mode Livy sessions should use.
livy.spark.deployMode = cluster

# If livy should impersonate the requesting users when creating a new session.
livy.impersonation.enabled = true

# Whether to enable HiveContext in livy interpreter, if it is true hive-site.xml will be detected
# on user request and then livy server classpath automatically.
livy.repl.enableHiveContext = true

livy.server.launch.kerberos.keytab = /opt/livy/livy.keytab
livy.server.launch.kerberos.principal = livy/server.fqdn@XXX

livy.server.auth.type = kerberos
livy.server.auth.kerberos.keytab = /opt/livy/spnego.keytab
livy.server.auth.kerberos.principal = HTTP/server.fqdn@XXX

livy.server.access_control.enabled = true
livy.server.access_control.users = zeppelin,livy

livy.superusers = zeppelin,livy

You must also configure Zeppelin for Kerberos auth:

zeppelin.livy.keytab	/opt/zeppelin/zeppelin.keytab
zeppelin.livy.principal	zeppelin@XXX
zeppelin.livy.url	http://host.fqdn:8998

Finally, you must configure Shiro. I used an LDAP backend with the org.apache.zeppelin.realm.LdapGroupRealm plugin to enable LDAP user group awareness (take care to set up groupOfNames, not posixGroup).
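
To check the setup outside Zeppelin, the Livy REST API can be called directly with a Kerberos ticket. The following is only a sketch under assumptions and was not part of the original setup: the helper names (json_escape, session_payload, statement_payload, livy_post) are made up for this example, LIVY_URL reuses the zeppelin.livy.url value above, the user alice is hypothetical, and curl must be built with SPNEGO support.

```shell
#!/usr/bin/env bash
# Assumption: same endpoint as zeppelin.livy.url in the Zeppelin settings.
LIVY_URL="${LIVY_URL:-http://host.fqdn:8998}"

# Escape backslashes and double quotes so a one-line snippet fits in a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Body for POST /sessions: interpreter kind plus the user Livy should impersonate.
session_payload() {
  printf '{"kind": "%s", "proxyUser": "%s"}' "$1" "$2"
}

# Body for POST /sessions/<id>/statements.
statement_payload() {
  printf '{"code": "%s"}' "$(json_escape "$1")"
}

# POST a JSON body to Livy, authenticating via SPNEGO with the current Kerberos ticket.
livy_post() {
  curl --silent --negotiate -u : -H 'Content-Type: application/json' \
       -X POST -d "$2" "${LIVY_URL}$1"
}

# Possible usage (assumes the new session gets id 0 and that the
# interpreter exposes a `spark` variable for Spark2):
#   kinit -kt /opt/zeppelin/zeppelin.keytab zeppelin@XXX
#   livy_post /sessions "$(session_payload spark alice)"
#   livy_post /sessions/0/statements \
#     "$(statement_payload 'spark.sql("show databases").show()')"
```

The calls are made as the zeppelin principal because livy.server.access_control.users only admits zeppelin and livy; the proxyUser field is what exercises livy.impersonation.enabled.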

maziyar · ‎09-12-2017

Hi,

One thing to be careful about: when you run "Deploy Client Configuration" on your Spark2 service, it removes the hive-site.xml link, or the file itself if you copied it.

I have noticed that all of these config files are in $SPARK_CONF_DIR/yarn-conf/, so I wish Livy could also load them when it starts Spark.

maziyar · ‎09-12-2017

OK, I have tried it, and it seems best to copy hive-site.xml into livy/conf/; Livy then loads it in every session.
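
Because the copy can drift from the cluster's hive-site.xml over time, the copy step could be wrapped in a small helper that only rewrites the file when the content differs. This is a hedged sketch, not something from the thread: sync_hive_site is a made-up name, and /opt/livy/conf is an assumed location based on the /opt/livy install path used earlier.

```shell
#!/usr/bin/env bash
# Copy hive-site.xml into Livy's conf directory, but only when the content differs.
# Prints "copied" or "unchanged"; fails if the source file is not readable.
sync_hive_site() {
  local src="$1" dest_dir="$2"
  local dest="${dest_dir}/hive-site.xml"
  if [ ! -r "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged"
    return 0
  fi
  cp "$src" "$dest" && echo "copied"
}

# Possible usage (paths are assumptions):
#   sync_hive_site /etc/hive/conf/hive-site.xml /opt/livy/conf
```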

Best,

____r · ‎09-14-2017

Hi,

Set this in livy-env.sh instead to get it working in a more maintainable way:
export HADOOP_CONF_DIR=/etc/hive/conf
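
This presumably works because /etc/hive/conf carries the Hadoop client files alongside hive-site.xml. A quick way to confirm that on a given host might look like the sketch below; missing_conf_files is a made-up helper, and the list of expected file names is an assumption about what the session needs.

```shell
#!/usr/bin/env bash
# Print the names of expected client config files that are absent from a directory.
# Assumption: these four files are the ones Livy and Spark need to find.
missing_conf_files() {
  local dir="$1" f
  for f in core-site.xml hdfs-site.xml yarn-site.xml hive-site.xml; do
    [ -f "${dir}/${f}" ] || echo "$f"
  done
}

# Possible usage: no output means every expected file is present.
#   missing_conf_files /etc/hive/conf
```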

Zeppelin, Livy, Hive, Kerberos & Spark 1.6