05-10-2017 01:15 PM
Using Cloudera Quickstart VM 5.10
Spark version 1.6.0
Copied hive-site.xml to the Spark directory.
>>> from pyspark.sql import HiveContext
>>> sqlContext = HiveContext(sc)
>>> cnt = sqlContext.sql("select count(1) from customers")
When I try to query Hive data from PySpark, I get the error below.
17/05/05 15:05:01 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
17/05/05 15:05:01 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
17/05/05 15:05:03 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
05-11-2017 06:44 PM - edited 05-11-2017 06:45 PM
<property> <name>dfs.client.read.shortcircuit</name> <value>true</value> </property>
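To confirm the property is really set in the file the client reads, a small check like the one below may help. It is only a sketch: the helper name read_hadoop_property and the SAMPLE text are made up for illustration, and on a real node you would pass in the contents of your own hdfs-site.xml (the usual home for this property; the path depends on your install).

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    # Hadoop config files are a flat list of <property> elements,
    # each holding a <name> and a <value> child.
    for prop in root.findall(".//property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical file content, standing in for a real hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
</configuration>"""

print(read_hadoop_property(SAMPLE, "dfs.client.read.shortcircuit"))
```

If the helper returns None for your real file, the property is not defined there and the client falls back to its default.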
Enable the native library by following the link.
In the meantime, you can check whether it is loaded by running the command below:
hadoop checknative -a
However, this is only a WARN. You should be able to bypass it and still get the results.
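One thing worth checking on the PySpark side: sqlContext.sql(...) only builds a DataFrame, so no count comes back until an action such as collect() or show() runs. A minimal sketch, assuming the sqlContext from the question; count_rows is a hypothetical helper name, not part of Spark:

```python
def count_rows(sql_context, table):
    """Run SELECT COUNT(1) on a table and return the count as an int."""
    # sql() is lazy: it returns a DataFrame without executing the query.
    df = sql_context.sql("select count(1) from {0}".format(table))
    # collect() is the action that actually runs the query;
    # it returns a list of Row objects (a single row here).
    rows = df.collect()
    return int(rows[0][0])
```

In the shell this would be called as count_rows(sqlContext, "customers"). The WARN lines may still be printed, but the count should come back after them.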