add columns to hive/parquet table


I am trying to add columns to a table that I created with the "saveAsTable" API.

I add the column using sqlContext.sql("ALTER TABLE myTable ADD COLUMNS (mycol STRING)").

The next time I create a DataFrame that includes the new column and save it into the same table, I get:

"ParquetRelation requires that the query in the SELECT clause of the INSERT INTO/OVERWRITE statement generates the same number of columns as its schema."

 

Also, these two commands don't return the same columns:

1. sqlContext.table('myTable').schema.fields    <-- wrong result

2. sqlContext.sql('show columns in mytable')    <-- correct result

 

It seems to be a known bug: https://issues.apache.org/jira/browse/SPARK-9764 (see related bugs).

 

But I am wondering: how else can I add the columns, or make sure that Spark picks up the new columns?

 

I already tried refreshTable and restarting Spark.

 

thanks
