Member since: 06-19-2019 · Posts: 3 · Kudos Received: 0 · Solutions: 0
08-12-2019 10:10 AM
Hi, the Hadoop version is 3.1. Using pySpark and the HWC connector, I am trying to overwrite an external table created in Hive. I first created the external table in ORC format, then ran the query below to overwrite its data. But afterwards, when I checked the table location (via SHOW CREATE TABLE), it had changed to the internal Hive warehouse path. How can I store the data in the external location using the HWC connector? Please suggest.

import pyspark
from pyspark_llap import HiveWarehouseSession

hive = HiveWarehouseSession.session(spark).build()
hive.setDatabase("test")
result = hive.executeQuery("SELECT \
    c.number, \
    act.CloseDate, \
    LTRIM(CONCAT(con.FirstName, ' ', con.LastName)) \
    FROM table_1 c \
    INNER JOIN table_2 act ON act.id = c.ID \
    LEFT JOIN table_3 con ON con.ID = act.userID")
result.write.format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR) \
    .mode("overwrite") \
    .option("table", "temp.final_table") \
    .save("/user/amrutha/abc/final_table")
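Since executeQuery returns an ordinary Spark DataFrame, one possible workaround (a sketch on my part, not a confirmed fix; shown in Scala, though the same DataFrame API exists in pySpark) is to skip the HWC writer, whose "table" option targets a managed Hive table, and write the ORC files directly to the external table's location:

// Sketch only: assumes an HWC session named `hive` and an external ORC
// table whose LOCATION is /user/amrutha/abc/final_table, as in the
// question. Writing the files directly leaves the table's external
// location untouched.
val result = hive.executeQuery("SELECT ...")  // same query as above
result.write.mode("overwrite").orc("/user/amrutha/abc/final_table")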
06-21-2019 04:20 AM
Thanks Shu. The command below worked for me.

sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").map(x => hive.executeQuery(x).show())
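A slightly more defensive variant of the same command (my own sketch, untested here): join the file's lines with a space so statements spanning lines are not glued together, then trim each fragment and drop blanks left over after splitting on ";":

// Sketch: same approach, with blank fragments filtered out before
// each statement is sent to Hive.
sc.textFile("/user/temp/hive.hql")
  .collect()
  .mkString(" ")           // join lines with a space, not end-to-end
  .split(";")
  .map(_.trim)
  .filter(_.nonEmpty)
  .foreach(stmt => hive.executeQuery(stmt).show())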
06-20-2019 04:20 AM
For Hadoop 3.1 I am using the HWC connector to connect to Hive and execute queries. Below is the code snippet used for testing.

spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar

import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._

val hive = HiveWarehouseSession.session(spark).build()
val query = sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").foreach(qry => hive.executeQuery(qry))
query.show()

This throws the error below.

<console>:30: error: value show is not a member of Unit

Is this not the proper way to run an HQL file using Spark? Please suggest.