Support Questions

cjervis · 11-14-2019

Hello!

I am currently working on a PySpark project that reads a few Hive tables into DataFrames and then has to apply a few updates and filters to them. I want to avoid Spark DataFrame syntax entirely, so that the framework only takes SQL from a parameter file and runs it through PySpark.

The problem is that I have to run UPDATE/DELETE queries against my final DataFrame. Are there any workarounds for performing these operations on a DataFrame?

Thank you so much!

EricL · 11-14-2019

@Timothyw0 ,

No, you can't update or delete rows in a DataFrame in place, because DataFrames are immutable. You have to filter or transform the DataFrame and create a new one from the result.
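If the framework should stay SQL-only, one possible workaround is to rewrite each UPDATE/DELETE as a SELECT over a temp view and keep the result as the new DataFrame. Below is a rough sketch of that idea, not a tested framework. The helper names (`delete_as_select`, `update_as_select`, `run_on_dataframe`) and the view, column, and condition values are made up for illustration. It also assumes the parameter file can supply the view name, the condition, and the SET expressions as separate fields.

```python
# Hypothetical helpers: none of these names come from Spark itself.

def delete_as_select(view, condition):
    """Emulate `DELETE FROM view WHERE condition` by selecting the rows that survive."""
    # COALESCE(..., FALSE) keeps rows where the condition is NULL,
    # which is what a real DELETE would do (it only removes rows where it is TRUE).
    return f"SELECT * FROM {view} WHERE NOT COALESCE(({condition}), FALSE)"


def update_as_select(view, columns, assignments, condition):
    """Emulate `UPDATE view SET col = expr, ... WHERE condition` as a SELECT.

    columns     -- all column names of the view, in output order
    assignments -- dict mapping updated column name -> SQL expression
    """
    select_list = []
    for col in columns:
        if col in assignments:
            # Rows that match get the new value; all others keep the old one.
            select_list.append(
                f"CASE WHEN ({condition}) THEN {assignments[col]} ELSE {col} END AS {col}"
            )
        else:
            select_list.append(col)
    return f"SELECT {', '.join(select_list)} FROM {view}"


def run_on_dataframe(spark, df, view, sql):
    """Register df as a temp view, run the SQL, and return the resulting new DataFrame."""
    df.createOrReplaceTempView(view)
    return spark.sql(sql)
```

Usage would look something like `final_df = run_on_dataframe(spark, final_df, "final", delete_as_select("final", "status = 'CANCELLED'"))`, and for an update, `update_as_select("final", final_df.columns, {"amount": "amount * 2"}, "id = 1")`. Each call returns a new DataFrame, so the original one is never modified.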

Cheers
Eric

Cloudera Community

Spark SQL Update/Delete