Import data from RDBMS to HDFS

New Contributor

I am new to Hadoop. How can I move data from a Teradata data warehouse to Hadoop using SSIS? I would really appreciate guidance on how to achieve this.

5 REPLIES

New Contributor

Thank you very much for the advice, Saranvisa. We are a Microsoft shop, so ODBC drivers would suit us. We use SSIS 2014 as the ETL tool for all our data integration, and I don't think we can switch tools or build up Java skills. I was looking at the Impala drivers for moving data from an RDBMS such as Teradata into Hadoop and back. I am not sure what the performance would be or how well this would work. Please share if anyone has implemented this setup. I have read a couple of articles about using Hive and Impala, and Impala showed better performance. I am not set on using only Impala; if anyone can help me identify a better solution, that would be excellent.

Champion

@JayImpala

 

You can meet all of these needs without using Java. Just focus on the tool listed under each category below. These are only very high-level notes to help you understand the tools.

 

1. Data ingestion -> Sqoop import/export -> Hive/Impala.

a. Use this link and change the version in it to match your environment:

https://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html#_selecting_the_data_to_import

b. Once you load the data into Hive, you can query the same data from either Hive or Impala.

c. Sqoop uses MapReduce behind the scenes, but you don't need to worry about MapReduce for now.

 

2. Hive -> for normal usage, batch processing, and so on. Queries run more slowly, but Hive uses less memory.

 

3. Impala -> for quick results, but it consumes more memory.

 

If you need to run multiple queries, use Impala for the important queries where you need quick results, and Hive for the others.
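
As a rough illustration of item 1, the sketch below builds the argument list for a `sqoop import` that lands a Teradata table in Hive. It is only a sketch under assumptions: the host, database, table, user, password-file path, and split column are hypothetical placeholders; it assumes the Teradata JDBC driver jar is already on Sqoop's classpath; and the helper `build_sqoop_import` is made up for this example. Option availability varies by release (for instance, `--password-file` is not in the 1.4.1 guide linked above), so check the user guide for your version.

```python
import shlex


def build_sqoop_import(host, database, table, user, password_file,
                       hive_table, split_by, num_mappers=4):
    """Return the argument list for a `sqoop import` into Hive.

    Hypothetical helper: every value passed in is a placeholder.
    """
    return [
        "sqoop", "import",
        # Generic JDBC connection; assumes the Teradata JDBC jar is
        # available to Sqoop.
        "--connect", f"jdbc:teradata://{host}/DATABASE={database}",
        "--driver", "com.teradata.jdbc.TeraDriver",
        "--username", user,
        # File holding the password, so it stays off the command line.
        "--password-file", password_file,
        "--table", table,
        # Column Sqoop uses to divide the rows among parallel mappers.
        "--split-by", split_by,
        "--num-mappers", str(num_mappers),
        # Create and load a Hive table instead of leaving plain files.
        "--hive-import",
        "--hive-table", hive_table,
    ]


if __name__ == "__main__":
    cmd = build_sqoop_import(
        host="td.example.com",            # placeholder host
        database="sales",                 # placeholder database
        table="ORDERS",                   # placeholder source table
        user="etl_user",                  # placeholder user
        password_file="/user/etl/.td_pass",
        hive_table="staging.orders",
        split_by="ORDER_ID",
    )
    # Print a shell-ready command line for review before running it.
    print(shlex.join(cmd))
```

Once the table is in Hive, the same data can be queried from Hive or Impala, as noted in item 1b.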

New Contributor

Sorry for the delay. The bad news is that they won't support either Sqoop or Drill. The Hadoop flavour we have is plain Apache; that had completely slipped my mind.

My question is: if we install the Impala drivers on my Integration Services box (AppBox), can they connect to Apache Hadoop so that data ingestion can be achieved?

 

Explorer
You can also use the Spark JDBC feature.
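
A minimal PySpark sketch of that idea, assuming Spark is available on the cluster and the Teradata JDBC driver jar has been supplied to it (for example through `--jars`). The connection details, table name, and HDFS path are hypothetical placeholders, and the two helper functions are made up for this illustration.

```python
def teradata_jdbc_options(host, database, table, user, password):
    """Build the option map for Spark's JDBC data source.

    Hypothetical helper: all values are placeholders.
    """
    return {
        "url": f"jdbc:teradata://{host}/DATABASE={database}",
        "driver": "com.teradata.jdbc.TeraDriver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def copy_table_to_hdfs(options, hdfs_path):
    """Read one table over JDBC and write it to HDFS as Parquet."""
    # Imported here so the option helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("teradata-to-hdfs").getOrCreate()
    try:
        df = spark.read.format("jdbc").options(**options).load()
        # Overwrite mode replaces any earlier copy at the target path.
        df.write.mode("overwrite").parquet(hdfs_path)
    finally:
        spark.stop()


if __name__ == "__main__":
    opts = teradata_jdbc_options(
        host="td.example.com",   # placeholder host
        database="sales",        # placeholder database
        table="ORDERS",          # placeholder table
        user="etl_user",         # placeholder user
        password="change-me",    # placeholder; prefer a secret store
    )
    # Show what would be sent to the reader, without the password.
    print({k: v for k, v in opts.items() if k != "password"})
    # On a cluster, something like the following would do the copy:
    # copy_table_to_hdfs(opts, "hdfs:///data/staging/orders")
```

By default this reads through a single JDBC connection; for large tables, the JDBC source's `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` options can spread the read across executors.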