Member since: 09-04-2018
Posts: 33
Kudos Received: 2
Solutions: 2

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 14743 | 10-15-2018 09:26 AM
 | 21982 | 09-15-2018 08:53 PM
09-19-2018
01:16 PM
@Tongzhou Zhou Try this:

1. Ensure hive-site.xml in the hive-conf dir and the spark-conf dir are identical; the command below should not return anything:

diff /etc/hive/conf/hive-site.xml /etc/spark2/conf/hive-site.xml

2. Invoke a REPL Spark session (pyspark or spark-shell):

$ pyspark

3. Show the Hive databases:

spark.sql("show databases")

Are you able to access Hive tables now?
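A minimal, self-contained sketch of step 1: diff prints nothing (and exits 0) when the two copies of hive-site.xml match. Temp copies are used here so the example runs anywhere; on a real cluster you would diff /etc/hive/conf/hive-site.xml against /etc/spark2/conf/hive-site.xml.

```shell
# Create two identical stand-in copies of hive-site.xml in a temp dir.
tmpdir=$(mktemp -d)
printf '<configuration/>\n' > "$tmpdir/hive-site.xml"
cp "$tmpdir/hive-site.xml" "$tmpdir/spark-hive-site.xml"

# diff is silent when the files match, so the branch below reports "identical".
if diff "$tmpdir/hive-site.xml" "$tmpdir/spark-hive-site.xml" > /dev/null; then
  result="identical"
else
  result="different"
fi
echo "$result"
rm -rf "$tmpdir"
```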
09-17-2018
06:41 PM
1 Kudo
After commenting out the properties below, it works for me as well.

## To be commented out when not using [user] block / plaintext
passwordMatcher = org.apache.shiro.authc.credential.PasswordMatcher
iniRealm.credentialsMatcher = $passwordMatcher
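In other words, after the change the relevant section of shiro.ini looks like this (a sketch of the commented-out state):

```ini
## To be commented out when not using [user] block / plaintext
# passwordMatcher = org.apache.shiro.authc.credential.PasswordMatcher
# iniRealm.credentialsMatcher = $passwordMatcher
```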
09-17-2018
02:56 PM
After copying hive-site.xml from the hive-conf dir to the spark-conf dir, I restarted the Spark services, which reverted that change. I copied hive-site.xml again and it is working now.

cp /etc/hive/conf/hive-site.xml /etc/spark2/conf
09-15-2018
08:53 PM
The default database it was showing was Spark's default database, whose location is '/apps/spark/warehouse', not the default database of Hive. I was able to resolve this by copying hive-site.xml from the hive-conf dir to the spark-conf dir:

cp /etc/hive/conf/hive-site.xml /etc/spark2/conf

Try running this query in your metastore database (MySQL in my case):

mysql> SELECT NAME, DB_LOCATION_URI FROM hive.DBS;

You will see two default databases there, one pointing to 'spark.sql.warehouse.dir' and the other to 'hive.metastore.warehouse.dir'. The locations depend on the values you have for these configuration properties.
09-15-2018
08:53 PM
I have installed Hortonworks HDP 3.0 and configured Zeppelin as well. When I run Spark or SQL, Zeppelin only shows me the default database (this is the default database from Spark, which has location '/apps/spark/warehouse', not the default database of Hive).

This is probably because the hive.metastore.warehouse.dir property is not being picked up from hive-site.xml, and Zeppelin is taking it from the Spark config (spark.sql.warehouse.dir). I had a similar issue with Spark as well, due to the hive-site.xml file in the spark-conf dir, and I was able to resolve it by copying hive-site.xml from the hive-conf dir to the spark-conf dir. I did the same for Zeppelin: I copied hive-site.xml into the Zeppelin conf dir (where zeppelin-site.xml lives) and also into the zeppelin-external-dependency-conf dir. But this did not resolve the issue.

*** Edit#1 - adding some additional information ***

I have created the Spark session with Hive support enabled through enableHiveSupport(), and even tried setting the spark.sql.warehouse.dir config property, but this did not help.

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("Test Zeppelin").config("spark.sql.warehouse.dir", "/apps/hive/db").enableHiveSupport().getOrCreate()

Through some online help, I learnt that Zeppelin uses only Spark's hive-site.xml file, but I can view all Hive databases through Spark; it is only in Zeppelin (through spark2) that I cannot access the Hive databases.

Additionally, Zeppelin is not letting me choose the programming language; by default it creates a session with Scala. I would prefer a Zeppelin session with pyspark. Any help on this would be highly appreciated.
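For reference, the metastore-related properties that Spark and Zeppelin need to see from hive-site.xml look like this (the host and path below are placeholders, not values from this cluster):

```xml
<!-- hive-site.xml fragment; host and path are placeholders -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/warehouse/tablespace/managed/hive</value>
</property>
```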
09-14-2018
08:04 PM
I have installed HDP 3.0 and am using Spark 2.3 and Hive 3.1. When I try to access Hive tables through Spark (pyspark/spark-shell), I get the error below.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/hdp/current/spark2-client/python/pyspark/sql/session.py", line 716, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/usr/hdp/current/spark2-client/python/pyspark/sql/utils.py", line 71, in deco
raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: u"Database 'test' not found;"

Only the default Hive database is visible in Spark:

>>> spark.sql("show databases").show()
+------------+
|databaseName|
+------------+
| default|
+------------+
>>>

The content of hive-site.xml is not exactly the same in the spark/conf and hive/conf dirs:

-rw-r--r-- 1 hive hadoop 23600 Sep 14 09:21 /usr/hdp/current/hive-client/conf/hive-site.xml
-rw-r--r-- 1 spark spark 1011 Sep 14 12:02 /etc/spark2/3.0.0.0-1634/0/hive-site.xml

I even tried initiating the Spark session with hive/conf/hive-site.xml, but even this did not help:

pyspark --files /usr/hdp/current/hive-client/conf/hive-site.xml

Should I copy the hive-site.xml file from the hive-conf dir to the spark-conf dir (or anywhere else as well)? Or will changing a property in the Ambari UI work?
Labels:
- Apache Hive
- Apache Spark
09-14-2018
02:30 PM
I tried starting HS2 after setting JAVA_HOME, but it did not help.

2018-09-14 08:29:31: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = d410dd44-b3ed-4c73-a391-f414da52f946
Hive Session ID = c7b15435-16b1-4f74-ac9a-a5f5fb09af35
Hive Session ID = 2bbef091-f52a-44a6-b1af-d1f78b30fb88
Hive Session ID = 4b405953-cf8d-4d72-a10c-49ccf873e03b
Hive Session ID = cc1d904c-b920-42da-8787-c9be4afabc2b

This is the error I see when I invoke Hive:

Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper (state=,code=0)
09-14-2018
04:56 AM
Thanks @Jay Kumar SenSharma for your prompt responses! I have manually validated hive-site.xml; it has the correct entry. Then I started the hiveserver2 service manually as suggested. Here is what I see in the nohup.out file:

+======================================================================+
| Error: JAVA_HOME is not set |
+----------------------------------------------------------------------+
| Please download the latest Sun JDK from the Sun Java web site |
| > http://www.oracle.com/technetwork/java/javase/downloads |
| |
| HBase requires Java 1.8 or later. |
+======================================================================+
2018-09-13 23:25:07: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = a7aa8a99-1a27-42b9-977a-d545742d1747
Hive Session ID = 41c8d3d5-e7ca-42df-acd5-2b8d1f23220c
Hive Session ID = 84951be6-015e-4634-a4de-6a6f18d51bb8
Hive Session ID = 1964a526-53da-4b50-90a9-9bc90142e2ff
Hive Session ID = f6188f9b-f3a9-43ec-a9ef-6a8a87a01498

I saw the 'JAVA_HOME is not set' error in the nohup.out file, so I ran a command to set JAVA_HOME and reran the command to start hiveserver2 manually. This time I did not get 'Error: JAVA_HOME is not set', but I am still facing the same issue. Do you suggest any other solution for this?

My Ambari dashboard looks like this (image attached). This cluster has been running for 36 hours, but no job has run because Hive is not working. Are the 'n/a' and 'no data available' entries fine? All the links, e.g. Resource Manager, NameNode web interface, and DataNode information, are available and showing correct information.

FYI - Initially I was installing the Hive Metastore (MySQL) and the HiveServer2 service on the worker2 node, but I had an issue while testing the connection to the metastore (a DBConnectionVerification.jar file issue), so I moved the metastore and the service to the master node, and that issue got resolved.
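The exact JAVA_HOME command used is not shown in the post; a typical sketch looks like this (the JDK path below is a placeholder, not the one actually used on this cluster):

```shell
# Placeholder JDK path; substitute the JDK location on your own host.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
export PATH="$JAVA_HOME/bin:$PATH"
echo "JAVA_HOME=$JAVA_HOME"
```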
09-14-2018
03:08 AM
I have validated that hive.server2.support.dynamic.service.discovery is true (the check-box is selected), and I am still facing the same issue.
09-14-2018
01:51 AM
I am also facing the same issue. After logging into the zk command line, I could only see the content below; I can't see HS2.

[zk: localhost:2181(CONNECTED) 10] ls /
[registry, ambari-metrics-cluster, zookeeper, zk_smoketest, rmstore]
[zk: localhost:2181(CONNECTED) 11]

Could you please help me with this? When I was starting HiveServer2 for the first time after installation, I got the error below. I tried multiple times, but HS2 was never able to start.

Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/decorator.py", line 54, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_service.py", line 189, in wait_for_znode
    raise Fail(format("ZooKeeper node /{hive_server2_zookeeper_namespace} is not ready yet"))
Fail: ZooKeeper node /hiveserver2 is not ready yet

The above exception was the cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_server.py", line 137, in <module>
    HiveServer().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 993, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_server.py", line 53, in start
    hive_service('hiveserver2', action = 'start', upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_service.py", line 101, in hive_service
    wait_for_znode()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/decorator.py", line 62, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_service.py", line 189, in wait_for_znode
    raise Fail(format("ZooKeeper node /{hive_server2_zookeeper_namespace} is not ready yet"))
resource_management.core.exceptions.Fail: ZooKeeper node /hiveserver2 is not ready yet