Migration Oracle database to hadoop using Python.


New Contributor

Hi team,

I want to migrate from an Oracle database to Hadoop using a Python script, without Sqoop or Spark. What is the easiest way to do this migration?

2 REPLIES

Re: Migration Oracle database to hadoop using Python.

Contributor

Sqoop is the most appropriate tool for this. Why don't you want to use it?


Re: Migration Oracle database to hadoop using Python.

New Contributor

You could connect to the Oracle database from Python using any interface, for example cx_Oracle (see https://www.oracle.com/technetwork/articles/dsl/python-091105.html). You would then have to write the Python script so that it runs as multiple processes on a single machine, to pull as much data as possible quickly. The next step would be to make use of all the available machines in your cluster to do that job for you.
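To give an idea of what the single-machine, multi-process part might look like, here is a minimal sketch. It assumes the table has a numeric key you can split into ranges; the connection details, the table name `orders`, the column `order_id`, and the output file names are all made up for illustration, and it dumps to local CSV files rather than writing to HDFS directly.

```python
import csv
from multiprocessing import Pool

# Hypothetical connection details and table/column names, for illustration only.
ORACLE_USER = "scott"
ORACLE_PASSWORD = "tiger"
ORACLE_DSN = "dbhost.example.com/orclpdb1"
TABLE = "orders"
KEY_COLUMN = "order_id"


def split_id_range(min_id, max_id, parts):
    """Split the inclusive range [min_id, max_id] into at most `parts` contiguous chunks."""
    total = max_id - min_id + 1
    step = -(-total // parts)  # ceiling division
    return [(lo, min(lo + step - 1, max_id))
            for lo in range(min_id, max_id + 1, step)]


def export_chunk(bounds):
    """Dump one key range to a local CSV file and return the file name."""
    # Imported here so the range-splitting helper works without the driver installed.
    import cx_Oracle

    lo, hi = bounds
    out_path = f"{TABLE}_{lo}_{hi}.csv"
    conn = cx_Oracle.connect(ORACLE_USER, ORACLE_PASSWORD, ORACLE_DSN)
    try:
        cursor = conn.cursor()
        cursor.arraysize = 5000  # rows fetched per round trip
        cursor.execute(
            f"SELECT * FROM {TABLE} WHERE {KEY_COLUMN} BETWEEN :lo AND :hi",
            {"lo": lo, "hi": hi},
        )
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([col[0] for col in cursor.description])  # header row
            while True:
                rows = cursor.fetchmany()  # defaults to cursor.arraysize rows
                if not rows:
                    break
                writer.writerows(rows)
    finally:
        conn.close()
    return out_path


def run_export(min_id, max_id, workers=4):
    """Export the whole key range using one worker process per chunk."""
    chunks = split_id_range(min_id, max_id, workers)
    with Pool(processes=workers) as pool:
        return pool.map(export_chunk, chunks)
```

You would call something like `run_export(1, 1000000, workers=8)` from under an `if __name__ == "__main__":` guard, and then copy the resulting files into the cluster, for instance with `hdfs dfs -put`. Everything beyond that (retries, type handling, spreading the work over several machines) you would have to build yourself.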

 

Sqoop grew out of exactly this requirement: moving large volumes of data between Hadoop and RDBMSs. It takes care of launching the parallel map tasks needed to get good performance and resource utilization.

Sqoop can also create the tables in Hive, which saves you those steps as well.

Is there any reason you cannot use Sqoop and save yourself most of this work?
