
[Hive] table partitioned in parquet giving error that it is stored in HiveFileFormat

New Contributor

table is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`


I have this issue when I try to write with PySpark, using the following command:

df.write.mode("append").format("parquet").saveAsTable("schema.table")


Before you say to change the format from parquet to hive: I know that works. But the table is partitioned and stored as parquet, and I really don't know why it's not working any more. It worked fine until now; the same command ran correctly for a month, five times so far, but today it refuses to write this way.


If I check the metadata, it also shows everything as parquet:

```
SerDe Library:    org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat:      org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat:     org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
```
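
For reference, the same rows can be checked from PySpark directly (a minimal sketch, assuming a Hive-enabled session; `schema.table` is the placeholder name used above):

```python
from pyspark.sql import SparkSession

# Hive support is required so Spark reads table metadata from the Hive metastore
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Prints the same SerDe Library / InputFormat / OutputFormat rows quoted above
spark.sql("DESCRIBE FORMATTED schema.table").show(100, truncate=False)
```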

1 REPLY

Rising Star

@ditmarh this might not work in scenarios where the table `schema.table` was created from Hive and we are appending to it from Spark. In that case, `saveAsTable` sees the table registered in the metastore as a Hive SerDe table (reported as `HiveFileFormat`), which does not match the `ParquetFileFormat` it was asked to write, even though the underlying files are parquet.
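
A quick way to check which engine created the table is to look at its table properties (a minimal sketch; Spark records a `spark.sql.sources.provider` entry for datasource tables it creates, while Hive-created tables typically lack it):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# A Spark-created parquet table usually shows spark.sql.sources.provider=parquet;
# if that property is missing, Spark treats the table as a Hive SerDe table
# (HiveFileFormat), which triggers the mismatch error above.
spark.sql("SHOW TBLPROPERTIES schema.table").show(truncate=False)
```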


You may try the following command, replacing `saveAsTable` with `insertInto`:

df.write.mode("append").format("parquet").insertInto("schema.table")
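
One caveat with this approach: `insertInto` resolves columns by position rather than by name, so the DataFrame's columns must be in the same order as the table's, with partition columns last. A minimal sketch, reusing `df` from above (the column names `id`, `value`, and `dt` are hypothetical placeholders):

```python
# insertInto matches columns by position, not by name, so reorder the
# DataFrame to the table's layout first; partition columns come last.
# id, value, dt are hypothetical placeholder column names.
ordered = df.select("id", "value", "dt")
ordered.write.mode("append").insertInto("schema.table")
```

Note that the `.format("parquet")` call should be redundant with `insertInto`, since the data is written in the table's existing format.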