09-14-2017 06:48 PM
I am trying to save a DataFrame as a Hive table using the <dataframe>.write.saveAsTable method in PySpark.
The command gives a warning and creates a directory in DFS, but it does not create the table in the Hive metastore.
I have read many old posts which say that this command doesn't work and that I need to create the table manually, pointing it to the directory created by the PySpark command above.
I just wanted to ask in this forum whether this is true and whether this limitation still exists in the current version of Cloudera.
09-14-2017 09:12 PM
It is late, so I am not recalling the specifics, but yes, I recommend always creating the Hive table definition outside of Spark. I vaguely recall that if you let Spark create it, other services can't use it, and that Spark has issues translating the metadata correctly to the Hive specification.
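For anyone who wants to try that approach, here is a rough sketch, not a tested recipe for any particular Cloudera release. It builds the CREATE EXTERNAL TABLE statement you would run yourself in Hive (beeline, Hue, etc.), and then appends to that table from PySpark with insertInto instead of saveAsTable. The helper names, the table name, the columns, and the path are all made up for illustration, and Parquet is only an assumption about the file format; use whatever format your files are actually in.

```python
def build_external_table_ddl(table, columns, location, stored_as="PARQUET"):
    """Return a HiveQL CREATE EXTERNAL TABLE statement as a string.

    columns:  list of (name, hive_type) pairs, e.g. [("id", "BIGINT")].
    location: DFS directory that holds (or will hold) the data files.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # One "`name` TYPE" entry per column, separated by commas and newlines.
    column_sql = ",\n  ".join(
        "`{}` {}".format(name, hive_type) for name, hive_type in columns
    )
    return (
        "CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  {cols}\n"
        ")\n"
        "STORED AS {fmt}\n"
        "LOCATION '{loc}'"
    ).format(table=table, cols=column_sql, fmt=stored_as, loc=location)


def append_to_existing_table(df, table):
    """Append a Spark DataFrame to a table that Hive already knows about.

    insertInto matches columns by position, not by name, so the
    DataFrame's column order must match the table definition.
    """
    df.write.insertInto(table, overwrite=False)


# Hypothetical table name, schema, and path, for illustration only.
ddl = build_external_table_ddl(
    "sales_db.orders",
    [("order_id", "BIGINT"), ("amount", "DOUBLE")],
    "/user/hive/warehouse/sales_db.db/orders",
)
print(ddl)
```

Run the printed statement in Hive first, then call append_to_existing_table(df, "sales_db.orders") from a Spark session that can see the Hive metastore. Since Hive owns the table definition this way, Spark only writes data and never has to translate the metadata.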