Pyspark Dataframe into Impala table ==> syntax error
Labels: Apache Impala, Apache Spark
Created 09-13-2022 02:25 AM
Dear all,
I am trying to load a PySpark DataFrame into an Impala table using the JDBC connector. However, the df.write statement fails because the generated CREATE TABLE statement wraps the column names in quotation marks:
Do you have any idea how to get rid of these quotation marks? If not, what would be a different approach to load a dataframe into an Impala table?
I also tried
spark.sql('select identifier_id as identifier from tempView').write.jdbc(...), but here I am getting the error "File /tmp/hive does not exist".
Thanks a lot in advance for any help!
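For readers following along, here is a minimal sketch of the quoting mismatch described above. It does not call Spark: build_create_table, demo_table, and the column types are hypothetical names chosen for illustration. It only mimics how a JDBC writer assembles the statement. Spark's generic JDBC dialect wraps identifiers in double quotes, while Impala expects backticks around quoted identifiers, so the first statement below is the kind that Impala rejects. In Spark the quoting is decided by the JDBC dialect's quoteIdentifier method, which is a Scala/Java API, so a custom dialect may be one way to change it. That is an assumption to verify against your Spark version, not a confirmed fix for this thread.

```python
# Illustrative only: mimics how a JDBC writer assembles CREATE TABLE text.
# build_create_table, demo_table, and the column types are hypothetical,
# not part of the Spark or Impala APIs.

def build_create_table(table, columns, quote_char):
    """Return a CREATE TABLE statement with each column name wrapped in quote_char."""
    column_defs = ", ".join(
        f"{quote_char}{name}{quote_char} {sql_type}" for name, sql_type in columns
    )
    return f"CREATE TABLE {table} ({column_defs})"


# Column name taken from the question; the types are assumed.
columns = [("identifier", "BIGINT"), ("name", "STRING")]

# Double quotes: the style produced by Spark's generic JDBC dialect.
spark_style = build_create_table("demo_table", columns, '"')

# Backticks: the identifier quoting that Impala accepts.
impala_style = build_create_table("demo_table", columns, "`")

print(spark_style)
print(impala_style)
```

Comparing the two printed statements shows that only the quote character differs, which is why the fix has to happen wherever the identifier quoting is chosen rather than in the DataFrame itself.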
Created 09-14-2022 12:54 AM
Hi @Ploeplse
Could you please share reproducible sample code and the Impala table creation script?
Created 09-22-2022 12:51 AM
Hi @Ploeplse, if you are still experiencing the issue, can you provide the information @RangaReddy has requested?
Regards,
Vidya Sargur, Community Manager
Created 10-11-2022 02:13 AM
Hi @Ploeplse
If you are still facing the issue, could you share the requested information (i.e., the code and the Impala table creation script)?
