add columns to hive/parquet table
Labels: Apache Hive, Apache Spark
Created on 04-10-2016 11:44 AM - edited 09-16-2022 03:13 AM
I am trying to add columns to a table that I created with the saveAsTable API.
I add the columns using sqlContext.sql('alter table myTable add columns (mycol string)').
The next time I create a DataFrame that includes the new column and save it into the same table, I get:
"ParquetRelation requires that the query in the SELECT clause of the INSERT INTO/OVERWRITE statement generates the same number of columns as its schema."
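Here is a simplified sketch of the sequence, assuming a HiveContext in Spark 1.x (table and column names are placeholders for my real ones):

from pyspark import SparkContext
from pyspark.sql import HiveContext, Row

sc = SparkContext()
sqlContext = HiveContext(sc)

# Create the table as Parquet with saveAsTable
df = sqlContext.createDataFrame([Row(id=1, name="a")])
df.write.format("parquet").saveAsTable("myTable")

# Add a column through SQL
sqlContext.sql("alter table myTable add columns (mycol string)")

# Append a new df that includes the added column -- this is where the
# ParquetRelation error above is raised
df2 = sqlContext.createDataFrame([Row(id=2, mycol="x", name="b")])
df2.write.mode("append").saveAsTable("myTable")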
Also, these two commands don't return the same columns:
1. sqlContext.table('myTable').schema.fields <-- wrong result (the new column is missing)
2. sqlContext.sql('show columns in myTable') <-- good result
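To see the mismatch side by side, I run (same placeholder table as above):

sqlContext.table("myTable").printSchema()         # Spark's cached schema, 'mycol' missing
sqlContext.sql("show columns in myTable").show()  # metastore view, 'mycol' present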
It seems to be a known bug: https://issues.apache.org/jira/browse/SPARK-9764 (see related bugs).
But I am wondering: how else can I update the columns, or make sure that Spark picks up the new ones?
I already tried refreshTable and restarting Spark.
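For reference, the refresh I tried looks like this (refreshTable is a HiveContext method in Spark 1.x):

# Refresh Spark's cached metadata for the table
sqlContext.refreshTable("myTable")

Neither that nor a full restart made sqlContext.table("myTable") pick up the column added via ALTER TABLE.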
thanks