Not able to append records to table using df.write.saveAsTable()
Labels: Apache Spark
Created on 12-27-2017 05:53 AM - edited 09-16-2022 05:40 AM
I am not able to append records to a table using the following commands:
df.write.saveAsTable("table")
df.write.saveAsTable("table",mode="append")
Error: IllegalArgumentException: 'Expected only one path to be specified but got : '
Created 12-27-2017 05:57 AM
I think you mean something like df.write.mode(SaveMode.Overwrite).saveAsTable(...)? The exact syntax depends on which language this is.
Created 12-27-2017 06:07 AM
df.write.saveAsTable(name="table",mode="append")
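For comparison, here is a minimal PySpark sketch of the two ways to request append mode that come up in this thread: the builder form (`.mode("append")`) and the keyword form (`mode="append"`). The helper names `append_to_table` and `append_to_table_kw` and the default `"parquet"` format are illustrative assumptions, not something from the posts, and nothing here is confirmed to resolve the "one path" error.

```python
def append_to_table(df, table_name, fmt="parquet"):
    """Append the rows of `df` to `table_name` using the builder-style writer.

    `df` is assumed to be a pyspark.sql.DataFrame. The helper name and the
    default format are hypothetical and chosen only for illustration.
    """
    # DataFrameWriter.mode("append") sets the save mode before saveAsTable runs.
    df.write.format(fmt).mode("append").saveAsTable(table_name)


def append_to_table_kw(df, table_name, fmt="parquet"):
    """Same request, passing format and mode as keyword arguments.

    PySpark's saveAsTable accepts `format` and `mode` keywords in addition
    to the table name.
    """
    df.write.saveAsTable(table_name, format=fmt, mode="append")
```

Both helpers ask Spark for the same thing, so if one form fails with the error above, the other probably will too; the sketch only makes the two spellings easy to compare.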
Created 07-04-2018 08:46 AM
I have the exact same problem.
Spark 2.2.0.cloudera1.
Can you please tell me what's wrong with this code?
It is the simplest Java version of the example code in Spark's Javadoc (here).
The code:
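// Note: `st` is the schema passed to createDataFrame; its definition is not shown in this post.
// First write: create (or overwrite) the table with a single row.
List<Row> rows = new ArrayList<>();
Object[] cols = new Object[2]; cols[0] = "one"; cols[1] = 1;
rows.add(RowFactory.create(cols));
spark.createDataFrame(rows, st).write().format("parquet").mode(SaveMode.Overwrite).saveAsTable("my_scheme.my_table");
// Second write: try to append one more row to the same table. This is the call that fails.
rows = new ArrayList<>();
cols[0] = "two"; cols[1] = 2;
rows.add(RowFactory.create(cols));
spark.createDataFrame(rows, st).write().format("parquet").mode(SaveMode.Append).saveAsTable("my_scheme.my_table");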
The error:
java.lang.IllegalArgumentException: Expected exactly one path to be specified, but got:
at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:410)
at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:454)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:198)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:148)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:420)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:399)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:354)
Created 12-19-2018 07:02 PM
I had trouble with this as well, but removing the ".mode(...)" call actually worked, and it appended.
spark.read.parquet("/path/to/parq1.parq","/path/to/parq2.parq").coalesce(1).write.format("parquet").saveAsTable("db.table")
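The same pipeline can be sketched in PySpark with the save mode as an optional parameter, so the behaviour with and without `.mode(...)` can be compared on a given cluster. The helper name `load_parquet_into_table` and its parameters are hypothetical. Note that Spark's documented default save mode is to raise an error when the table already exists, so the appending behaviour reported above may depend on the Spark version or distribution.

```python
def load_parquet_into_table(spark, paths, table_name, mode=None):
    """Read several Parquet paths and save them as one table.

    `spark` is assumed to be a pyspark.sql.SparkSession. `paths` is a list of
    Parquet locations. `mode` is optional: when None, no .mode(...) call is
    made and Spark falls back to its default save mode.
    """
    # DataFrameReader.parquet accepts multiple paths as separate arguments.
    df = spark.read.parquet(*paths).coalesce(1)

    writer = df.write.format("parquet")
    if mode is not None:
        # Only set the save mode when the caller asks for one explicitly.
        writer = writer.mode(mode)
    writer.saveAsTable(table_name)
```

Calling it once with `mode=None` and once with `mode="append"` against a test table is one way to check which variant works in a particular environment.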
