Spark SQL 2.3, CDH 5.16.2
I am trying to save a DataFrame to a table with Spark. The table has already been created in Hive.
But while the Spark job is running, if I do a show create table t_table in Hive,
it gives me a table not found error. After the Spark job finishes, I can find the table again.
My question is:
Is there any way to avoid this issue? It will cause problems if other programs refer to the same table in Hive while the job is running.
df.write.mode("overwrite").partitionBy("day_string").format("parquet").option("compression", "snappy").saveAsTable("t_table")
Created 08-14-2019 10:07 AM
spark.conf.set("spark.dynamicAllocation.enabled", "true")
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
spark.conf.set("spark.sql.files.maxPartitionBytes", "268435456")
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")
These configs are applied in the Spark session.
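As a side note, a minimal sketch (not verified on this cluster): some of these properties, such as spark.dynamicAllocation.enabled, are only read when the application starts, so setting them through spark.conf.set on an already-running session has no effect. They can instead be passed when the SparkSession is built. The app name below is a placeholder.

import org.apache.spark.sql.SparkSession

// Hypothetical session setup mirroring the properties listed above.
val spark = SparkSession.builder()
  .appName("t_table_loader")  // placeholder app name
  .enableHiveSupport()        // required so saveAsTable/insertInto go through the Hive metastore
  .config("spark.dynamicAllocation.enabled", "true")  // only honored at application startup
  .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
  .config("spark.sql.files.maxPartitionBytes", "268435456")
  .config("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")
  .getOrCreate()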
Created 08-14-2019 08:59 PM
Is this a bug or a known issue, or is there some config we can set to avoid this issue?
Created 08-14-2019 09:18 PM
I think the comments below explain the reason:
When mode is Overwrite, the schema of the DataFrame does not need to be the same as that of the existing table.
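The implication is that in Overwrite mode saveAsTable drops the existing table and recreates it from the DataFrame's schema, which is why the table briefly disappears from the metastore. A possible workaround (a sketch only, not verified on CDH 5.16.2): since t_table already exists in Hive, writing with insertInto instead of saveAsTable keeps the table definition in place and, together with spark.sql.sources.partitionOverwriteMode=dynamic, overwrites only the partitions present in the DataFrame. insertInto matches columns by position, so the DataFrame must follow the table's column order with the partition column day_string last; the column names below are hypothetical.

// Sketch of a workaround: overwrite partitions in the existing Hive table
// without dropping its definition. Assumes t_table is already created in Hive,
// partitioned by day_string, and partitionOverwriteMode=dynamic is set.
df.select("col_a", "col_b", "day_string")  // hypothetical columns; order must match the table, partition column last
  .write
  .mode("overwrite")
  .insertInto("t_table")  // INSERT OVERWRITE into matching partitions; the table itself is never dropped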