
Reading 1000 tables from RDBMS via pyspark and create parquet tables in hive

Solved


Explorer

Team,

 

I have 1000 tables in my source RDBMS and I would like to migrate them to Hive using PySpark.

 

I read through the documentation and found that the two commands below would help. Is there a way to loop these two commands 1000 times if I have the full list of tables in a Python array?

 

arr = ("table1","table2")

for x in arr:

            df = spark.read.format("jdbc").blah.blah 

 

            data.write.saveAsTable.blah.blah

 

If someone has a working solution for this, could you please share it? I tried the above; it does not throw any error, but at the same time it does not write anything.
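For reference, here is a minimal runnable sketch of such a loop, assuming a MySQL source; the URL, credentials, driver class, and table names below are placeholders, not values from the original post:

from pyspark.sql import SparkSession

# Placeholder connection details -- substitute your own RDBMS settings.
jdbc_url = "jdbc:mysql://dbhost:3306/sourcedb"
jdbc_props = {"user": "etl_user", "password": "secret",
              "driver": "com.mysql.jdbc.Driver"}

# enableHiveSupport() is needed so saveAsTable() writes to the Hive metastore.
spark = (SparkSession.builder
         .appName("rdbms-to-hive")
         .enableHiveSupport()
         .getOrCreate())

tables = ["table1", "table2"]  # extend to the full list of 1000 source tables

for t in tables:
    # Read one source table over JDBC into a DataFrame.
    df = spark.read.jdbc(url=jdbc_url, table=t, properties=jdbc_props)
    # Write the same DataFrame to Hive as a Parquet table, reusing the source name.
    df.write.mode("overwrite").format("parquet").saveAsTable(t)

The key point is that the write must be called on the same DataFrame the read produced; if the loop reads into df but writes some other name, nothing from the source ever reaches Hive.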

 

Thanks

Meher

 

1 ACCEPTED SOLUTION


Re: Reading 1000 tables from RDBMS via pyspark and create parquet tables in hive

Explorer

I was able to get this working. I will close this post.

 

Thanks,

Meher

2 REPLIES

Re: Reading 1000 tables from RDBMS via pyspark and create parquet tables in hive

Community Manager

@Meher I am happy to see that you resolved your issue. Would you mind sharing how you solved it in case someone else encounters the same situation?



Cy Jervis, Community Manager
