pyspark - cannot create managed table
Created 06-14-2022 01:44 PM
I'm writing some pyspark code where I have a dataframe that I want to write to a hive table.
I'm using a command like this.
dataframe.write.mode("overwrite").saveAsTable("bh_test")
Everything I've read online indicates that this should, by default, create a managed table.
However, no matter what I try, it always creates an external table. Is there a configuration setting somewhere that overrides the default behavior?
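For anyone hitting the same thing, it can help to confirm what Spark itself reports for the table before changing any settings. Here is a minimal sketch, assuming the entries come from `spark.catalog.listTables()`, whose items expose `name` and `tableType` attributes. The helper name `find_table_type` is made up for illustration and is not part of any library.

```python
from typing import Iterable, Optional


def find_table_type(tables: Iterable, table_name: str) -> Optional[str]:
    """Return the tableType of the first catalog entry named table_name.

    `tables` is any iterable of objects with `name` and `tableType`
    attributes, such as the list returned by spark.catalog.listTables().
    Returns None when no entry matches (the match is case-insensitive).
    """
    for table in tables:
        if table.name.lower() == table_name.lower():
            return table.tableType
    return None


# Usage on a live cluster (assumes an active SparkSession named `spark`):
#   print(find_table_type(spark.catalog.listTables(), "bh_test"))
```

The value reported here is Spark's view of the catalog; Hive's own view can be checked separately from Hive.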
Created 06-15-2022 08:35 AM
Hi @haze5736, You need to use the Hive Warehouse Connector (HWC) to query Apache Hive managed tables from Apache Spark. Using the HWC API, you can read and write Apache Hive tables from Apache Spark. For example, to write to a managed table:
df.write.format(HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR).option("table", <tableName>).option("partition", <partition_spec>).save()
Ref: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/integrating-hive-and-bi/topics/hive-read-writ...
For more details, refer to the documentation below:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/integrating-hive-and-bi/topics/hive_hivewareh...
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/integrating-hive-and-bi/topics/hive_submit_a_...
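Applied to the table from the question, the HWC write could be wrapped roughly as follows. This is only a sketch under assumptions: the helper `write_via_hwc` is a hypothetical name, the connector format string is passed in rather than hard-coded, and the `pyspark_llap` import in the usage comment assumes the HWC jar and Python zip were supplied when the job was submitted. How a given save mode behaves depends on the HWC version, so check it against the documentation linked above.

```python
def write_via_hwc(df, table_name, connector_format, mode="append"):
    """Write a DataFrame to a Hive table through a connector data source.

    df               -- a Spark DataFrame (anything exposing .write works)
    table_name       -- target Hive table, e.g. "bh_test"
    connector_format -- data source name; for HWC this would be
                        HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR
    mode             -- Spark save mode such as "append" or "overwrite"
    """
    writer = (
        df.write
        .format(connector_format)
        .mode(mode)
        .option("table", table_name)
    )
    writer.save()


# Usage on a cluster with HWC configured (assumption: pyspark_llap is importable):
#   from pyspark_llap import HiveWarehouseSession
#   write_via_hwc(dataframe, "bh_test",
#                 HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR,
#                 mode="overwrite")
```

Passing the format string as a parameter keeps the helper importable on machines without HWC installed.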
Created 06-27-2022 09:33 AM
@haze5736 Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres, Community Moderator
