
Reading 1000 tables from RDBMS via PySpark and creating Parquet tables in Hive

Team,

 

I have 1000 tables in my source RDBMS, and I would like to migrate them to Hive using PySpark.

 

I read through the documentation and found that the two commands below would help. Is there a way I can loop over these two commands 1000 times if I have the full list of tables in a Python array?

 

arr = ["table1", "table2"]

for x in arr:
    df = spark.read.format("jdbc").blah.blah    # JDBC read options elided

    df.write.saveAsTable.blah.blah              # Hive write call elided

 

If someone has a working solution for this, could you please share it? I tried the above; it does not throw any error, but at the same time it does not write anything.
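
For reference, here is a fuller sketch of what I am attempting end to end. This is only illustrative: the JDBC URL, credentials, driver class, and database/table names below are placeholders, not my real values.

from pyspark.sql import SparkSession

# Hive support is needed so that saveAsTable registers tables in the Hive metastore
spark = SparkSession.builder \
    .appName("rdbms-to-hive") \
    .enableHiveSupport() \
    .getOrCreate()

# Placeholder connection details -- substitute your real RDBMS values
jdbc_url = "jdbc:mysql://dbhost:3306/sourcedb"
jdbc_props = {
    "user": "myuser",
    "password": "mypassword",
    "driver": "com.mysql.jdbc.Driver",
}

tables = ["table1", "table2"]   # in reality, the full list of ~1000 table names

for t in tables:
    # Read one source table over JDBC into a DataFrame
    df = spark.read.jdbc(url=jdbc_url, table=t, properties=jdbc_props)

    # Write it out as a Parquet-backed Hive table of the same name
    df.write.mode("overwrite").format("parquet").saveAsTable("target_db." + t)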

 

Thanks

Meher

 


Re: Reading 1000 tables from RDBMS via PySpark and creating Parquet tables in Hive

I was able to get this working. I will close this post.

 

Thanks,

Meher


Re: Reading 1000 tables from RDBMS via PySpark and creating Parquet tables in Hive

@Meher I am happy to see that you resolved your issue. Would you mind sharing how you solved it in case someone else encounters the same situation?



Cy Jervis, Community Manager
