
Error running Hivecontext in pyspark

Using Cloudera Quickstart VM 5.10
Spark version 1.6.0
Copied hive-site.xml to spark directory


>>> from pyspark.sql import HiveContext
>>> sqlContext = HiveContext(sc)
>>> cnt = sqlContext.sql("select count(1) from customers")


When I try to read Hive data through the PySpark HiveContext, I get the warnings below.

17/05/05 15:05:01 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
17/05/05 15:05:01 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
17/05/05 15:05:03 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.


Re: Error running Hivecontext in pyspark

  1. You can either turn off the short-circuit read feature by setting it to false, which will have a performance hit,
  2. or enable the native library by following the link.

In the meantime, you can check whether the native library is loaded by running the command below:


hadoop checknative -a
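
If you would rather run that check from a script, here is a rough Python sketch. It assumes each result line of the command looks like `name: true|false [path]`, and the helper names `parse_checknative` and `run_checknative` are made up for this example, so adjust as needed for your Hadoop version.

```python
import subprocess


def parse_checknative(output):
    """Map each library name to True/False.

    Assumes result lines look like 'hadoop:  true /path/to/libhadoop.so';
    anything else (headers, log lines) is skipped.
    """
    status = {}
    for line in output.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0] in ("true", "false"):
            status[name.strip()] = (fields[0] == "true")
    return status


def run_checknative():
    """Run `hadoop checknative -a` and parse what it prints."""
    # Popen instead of check_output: the command may exit non-zero when a
    # library is missing, and we still want to read its output.
    proc = subprocess.Popen(
        ["hadoop", "checknative", "-a"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    return parse_checknative(out)
```

If `run_checknative().get("hadoop")` comes back false, libhadoop is not being loaded, which would match the short-circuit warning in the log.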


But it is only a WARN. You should be able to ignore it and still get the results.
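
One more thing to keep in mind: `sqlContext.sql(...)` only builds a DataFrame, so nothing is printed until you call an action on it such as `collect()` or `show()`. A minimal sketch for confirming the query really returns a count despite the warnings (the helper name `count_rows` is just for illustration, and the table name is pasted into the SQL as-is, so only use it with names you trust):

```python
def count_rows(sql_context, table):
    """Run a count query and return the result as a plain number."""
    # sql() is lazy for a SELECT; collect() is the action that executes it.
    df = sql_context.sql("select count(1) from {0}".format(table))
    rows = df.collect()
    # One row with one column: the count itself.
    return rows[0][0]


# In the pyspark shell, where `sc` already exists:
#   from pyspark.sql import HiveContext
#   sqlContext = HiveContext(sc)
#   print(count_rows(sqlContext, "customers"))
```

If this prints a number, the warnings above are not stopping the query from running.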