Schema mismatch errors when writing to Iceberg tables from Spark
When appending a DataFrame to an Iceberg table from Spark like this:
df.writeTo("tablename").append()
the following error appears if the schemas of the DataFrame and the table differ:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table 'catalog.tablename':
- Cannot find data for output column 'column3'
- Cannot find data for output column 'column4'
This happens because, by default, Iceberg rejects writes whose schema does not match the table schema.
Solution
We need to set the following table property on the Iceberg table so that it accepts writes with a different schema:
ALTER TABLE tablename SET TBLPROPERTIES (
'write.spark.accept-any-schema'='true'
)
After that, we need to enable the mergeSchema option on the append command in Spark:
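A minimal sketch of the adjusted append, assuming a Spark session that is already configured with an Iceberg catalog and a DataFrame `df` containing the extra columns (the catalog and table names here are placeholders):

```python
# Append with schema merging enabled: with the table property
# write.spark.accept-any-schema=true set, the mergeSchema write option
# tells Iceberg to add any new DataFrame columns to the table schema.
df.writeTo("catalog.tablename") \
    .option("mergeSchema", "true") \
    .append()
```

With both the table property and the write option in place, columns such as column3 and column4 that exist in only one of the two schemas no longer cause an AnalysisException; missing columns are written as nulls and new columns are added to the table.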