
HWC converting External table into managed table using pyspark

Hi,

Hadoop version is 3.1.

Using PySpark and the Hive Warehouse Connector (HWC), I am trying to overwrite an external table created in Hive. I first created the external table in ORC format, then ran the query below to overwrite the data in it. But after the job ran, I checked the table's location with SHOW CREATE TABLE, and it had changed to the internal Hive warehouse path, i.e. the table had become a managed table. How can I store the data at the external location using the HWC connector? Please suggest.
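
For reference, I created the external table roughly like this (the table name, columns, and location are illustrative, reconstructed from my query and save path, not my exact DDL):

# Create the external ORC table at the desired HDFS location
hive.executeUpdate("""
    CREATE EXTERNAL TABLE temp.final_table (
        number STRING,
        CloseDate DATE,
        FullName STRING
    )
    STORED AS ORC
    LOCATION '/user/amrutha/abc/final_table'
""")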

import pyspark
from pyspark_llap import HiveWarehouseSession

# Build the HWC session from the active SparkSession
hive = HiveWarehouseSession.session(spark).build()

hive.setDatabase("test")

# Join the three tables and build the full-name column
result = hive.executeQuery("""
    SELECT
        c.number,
        act.CloseDate,
        LTRIM(CONCAT(con.FirstName, ' ', con.LastName)) AS FullName
    FROM table_1 c
    INNER JOIN table_2 act ON act.id = c.ID
    LEFT JOIN table_3 con ON con.ID = act.userID
""")

# Overwrite the table; the path passed to save() points at the external
# location, yet afterwards the table shows up under the managed warehouse path
result.write.format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR) \
    .mode("overwrite") \
    .option("table", "temp.final_table") \
    .save("/user/amrutha/abc/final_table")
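
I also considered staging the result in a managed table and then overwriting the external table through plain HiveQL with executeUpdate, on the assumption that an INSERT OVERWRITE leaves the table definition (and its external location) untouched. A sketch of what I mean (the staging table name temp.final_table_staging is made up for this example):

# Stage the query result in a managed table via the HWC writer
result.write.format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR) \
    .mode("overwrite") \
    .option("table", "temp.final_table_staging") \
    .save()

# Overwrite the external table in place through Hive, so its external
# definition and LOCATION should be preserved
hive.executeUpdate(
    "INSERT OVERWRITE TABLE temp.final_table "
    "SELECT * FROM temp.final_table_staging")

Is that the recommended approach, or is there a writer option that keeps the table external?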

