Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.


add columns to hive/parquet table

Rising Star

I am trying to add columns to a table that I created with the "saveAsTable" API.

I update the columns using sqlContext.sql('alter table myTable add columns (mycol string)').

The next time I create a DataFrame and save it to the same table with the new columns, I get:

"ParquetRelation requires that the query in the SELECT clause of the INSERT INTO/OVERWRITE statement generates the same number of columns as its schema."
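In case it helps to see where the count mismatch comes from: the ParquetRelation Spark cached at saveAsTable time still has the original columns, while the metastore (and the new DataFrame) has one more. Below is a minimal pure-Python simulation of that column-count check; it is not Spark's actual code, and the schemas are made-up examples:

```python
# Schema Spark cached for the Parquet relation when the table was first saved.
cached_parquet_schema = ["id", "name"]

# Schema in the Hive metastore after ALTER TABLE ... ADD COLUMNS (mycol string);
# this is also the schema of the new DataFrame being inserted.
metastore_schema = ["id", "name", "mycol"]

def check_insert(query_columns, relation_schema):
    """Simulate the pre-insert check: the SELECT side of an INSERT must
    produce exactly as many columns as the target relation's schema."""
    if len(query_columns) != len(relation_schema):
        raise ValueError(
            "ParquetRelation requires that the query in the SELECT clause of "
            "the INSERT INTO/OVERWRITE statement generates the same number of "
            "columns as its schema."
        )

# The new 3-column DataFrame against the stale 2-column cached schema fails:
try:
    check_insert(metastore_schema, cached_parquet_schema)
except ValueError as e:
    print(e)
```

This is only meant to illustrate why the error mentions column counts even though the metastore itself already knows about the new column.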

 

Also, these two commands don't return the same columns:

1. sqlContext.table('myTable').schema.fields    <-- wrong result

2. sqlContext.sql('show columns in mytable')    <-- correct result

 

It seems to be a known bug: https://issues.apache.org/jira/browse/SPARK-9764 (see related bugs).

 

But I am wondering: how else can I update the columns, or make sure that Spark picks up the new ones?

 

I have already tried refreshTable and restarting Spark.

 

Thanks
