Support Questions

Ploeplse · 09-13-2022

Dear all,

I am trying to load a PySpark DataFrame into an Impala table using the JDBC connector. However, the df.write statement fails because the generated CREATE TABLE statement wraps the column names in quotation marks.

Do you have any idea how to get rid of these quotation marks? If not, what would be a different approach to load a dataframe into an Impala table?
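To make the quoting problem concrete, here is a minimal plain-Python sketch. It only mimics the shape of the generated DDL and is not Spark's actual code path. The helper names, the table name `my_table`, and the two-column schema are hypothetical and not taken from the original post. Spark's default JDBC dialect wraps identifiers in double quotes, while Impala expects backticks for quoted identifiers.

```python
from typing import Dict


def quote_identifier(name: str, quote_char: str) -> str:
    """Wrap a column name in the given quote character."""
    return f"{quote_char}{name}{quote_char}"


def build_create_table(table: str, columns: Dict[str, str], quote_char: str) -> str:
    """Assemble a CREATE TABLE statement with quoted column names (illustrative only)."""
    cols = ", ".join(
        f"{quote_identifier(name, quote_char)} {sql_type}"
        for name, sql_type in columns.items()
    )
    return f"CREATE TABLE {table} ({cols})"


# Hypothetical schema, used only for illustration.
columns = {"identifier": "STRING", "amount": "DOUBLE"}

# Double quotes: the identifier style of Spark's default JDBC dialect.
spark_style = build_create_table("my_table", columns, '"')
# Backticks: the identifier quoting that Impala accepts.
impala_style = build_create_table("my_table", columns, "`")

print(spark_style)
print(impala_style)
```

In Spark itself, identifier quoting is decided by the `quoteIdentifier` method of the active `JdbcDialect`. One possible route, not confirmed in this thread, is therefore to register a custom dialect on the JVM side (Scala or Java) that returns backtick-quoted names for the Impala JDBC URL.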

I also tried

spark.sql('select identifier_id as identifier from tempView').write.jdbc(...), but here I am getting the error "File /tmp/hive does not exist".

Thanks a lot in advance for any help!

RangaReddy · 09-14-2022

Hi @Ploeplse

Could you please share reproducible sample code and the Impala table creation script?

VidyaSargur · 09-22-2022

Hi @Ploeplse, if you are still experiencing the issue, can you provide the information @RangaReddy has requested?

Regards,

Vidya Sargur,
Community Manager

RangaReddy · 10-11-2022

Hi @Ploeplse

If you are still facing the issue, could you share the requested information (i.e., the code and the Impala table creation script)?

Pyspark Dataframe into Impala table ==> syntax error