PySpark DataFrame into Impala table ==> syntax error


Dear all,

I am trying to load a PySpark DataFrame into an Impala table using the JDBC connector. However, the df.write statement fails because the generated "CREATE TABLE" statement puts quotation marks around the column names:


Do you have any idea how to get rid of these quotation marks? If not, what would be a different approach to load a dataframe into an Impala table?
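
For illustration, here is a minimal sketch of the quoting difference that seems to be behind the error. It rests on an assumption, not a confirmed diagnosis: Spark's generic JDBC dialect wraps column names in double quotes, while Impala expects backticks around quoted identifiers. The helper names, the table name demo_db.demo_table, and the column list are hypothetical and only serve to show the two forms side by side.

```python
from typing import Callable, List, Tuple


def quote_double(name: str) -> str:
    # Double-quoted identifier, the form assumed to appear in the failing statement.
    return f'"{name}"'


def quote_backtick(name: str) -> str:
    # Backtick-quoted identifier, the form Impala accepts.
    if "`" in name:
        raise ValueError(f"identifier must not contain a backtick: {name!r}")
    return f"`{name}`"


def build_create_table(
    table: str,
    columns: List[Tuple[str, str]],
    quote: Callable[[str], str],
) -> str:
    # Join "<quoted name> <type>" pairs into a single CREATE TABLE statement.
    cols = ", ".join(f"{quote(name)} {col_type}" for name, col_type in columns)
    return f"CREATE TABLE {table} ({cols})"


# Hypothetical schema, used only for the comparison.
columns = [("identifier", "BIGINT"), ("name", "STRING")]

print(build_create_table("demo_db.demo_table", columns, quote_double))
print(build_create_table("demo_db.demo_table", columns, quote_backtick))
```

The backtick version could be run through any Impala client to create the table ahead of time. Whether a later df.write in append mode then succeeds depends on the Spark version and the JDBC driver, and is not verified here.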

I also tried spark.sql('select identifier_id as identifier from tempView').write.jdbc(...), but this fails with the error "File /tmp/hive does not exist".


Thanks a lot in advance for any help!



Expert Contributor

Hi @Ploeplse 

Could you please share reproducible sample code and the Impala table creation script?

Community Manager

Hi @Ploeplse, if you are still experiencing the issue, can you provide the information @RangaReddy has requested?


Vidya Sargur,
Community Manager


Expert Contributor

Hi @Ploeplse 


If you are still facing the issue, could you share the requested information (i.e., the code and the Impala table creation script)?