Created on 11-28-2016 10:36 PM - edited 11-28-2016 10:56 PM
Hi, I'm using SPARK2-2.0.0.cloudera.beta2-1.cdh5.7.0.p0.110234.
I'm trying to save a Spark DataFrame into a Hive table:
df.write.mode(SaveMode.Overwrite).partitionBy("date").saveAsTable(s"$databaseName.$tableName")
I can list the table in the beeline shell. However, I cannot read its contents, because the table schema is not what I expected:
+-----------+----------------+--------------------+--+
| col_name | data_type | comment |
+-----------+----------------+--------------------+--+
| col | array<string> | from deserializer |
+-----------+----------------+--------------------+--+
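For what it's worth, Spark itself can still read the table: saveAsTable stores the real schema in the metastore's table properties, and the array&lt;string&gt; column is only a placeholder that Hive shows. A quick check (a sketch, assuming a Spark 2.0 SparkSession named spark, as in spark-shell):

// Spark resolves the real schema from the table properties it wrote,
// so reading the table back through Spark shows the expected columns.
val df2 = spark.table(s"$databaseName.$tableName")
df2.printSchema()  // expected columns, not array&lt;string&gt;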
I've tried spark1.6.0-cdh5.9.0-hadoop2.6.0, but got the same result.
=== update 2016-11-29 14:50 ===
I realized that saveAsTable stores the table in a Spark SQL-specific format, which is NOT compatible with Hive. So, I changed to a Hive-compatible write.
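For reference, one common Hive-compatible pattern in Spark 2.0 is to create the partitioned table through HiveQL first and then insert into it. This is only a sketch: col1/col2 are placeholder columns, not my real schema, and spark is the spark-shell SparkSession.

import org.apache.spark.sql.SaveMode

// Define the table in a format Hive understands, instead of letting
// saveAsTable create it in Spark's own data source format.
spark.sql(s"""
  CREATE TABLE IF NOT EXISTS $databaseName.$tableName (col1 STRING, col2 INT)
  PARTITIONED BY (`date` STRING)
  STORED AS PARQUET
""")

// insertInto matches columns by position (partition column last) and
// needs dynamic partitioning enabled, since "date" is a partition column.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
df.write.mode(SaveMode.Overwrite).insertInto(s"$databaseName.$tableName")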
However, every time I run a query through beeline, the Hive metastore server crashes. If I query through Impala, the metastore server works fine.
The write operation succeeds some of the time; the table can then be queried with Impala, but not with beeline.
Sometimes the write operation fails with ERROR KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !! and the metastore server crashes.
Thanks.