04-07-2017 07:33 AM
04-07-2017 08:25 AM
Thank you very much for the advice, Saranvisa. We are a Microsoft shop at work, so ODBC drivers will suit us. We use SSIS 2014 as the ETL tool for all our data integration, and I don't think we can switch tools or build up skills on the Java side. I was looking at the Impala drivers for moving data from an RDBMS such as Teradata into Hadoop and back. I'm not sure what the performance will be or how well this will work out. Please share if anyone has implemented this setup. I read a couple of articles comparing Hive and Impala, and Impala showed better performance. I am not set on using Impala only; if anyone can help me identify a better solution, that would be excellent.
04-07-2017 08:40 AM
You can achieve all your needs without using Java; just focus on the tool listed under each category below. These are only very high-level notes to help you understand the tools.
1. Data Ingestion -> Sqoop Import/export -> Hive/Impala.
a. Use this link and change the version to match your environment
b. Once you load data into Hive, you can query the same data from either Hive or Impala
c. Sqoop uses MapReduce behind the scenes, but you don't need to worry about MapReduce for now
2. Hive -> for normal usage, batch processing, etc. Performance will be lower, but it will not consume as much memory.
3. Impala -> for quick results, but it will consume more memory.
If you need to run multiple queries, use Impala for the important queries where you need a quick result, and Hive for the others.
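To make step 1 concrete, here is a rough sketch of what a Sqoop import into Hive could look like, written as a small Python helper that only assembles the command line. The host, database, table names and password-file path are made-up placeholders, and it assumes the Teradata JDBC driver is already available to Sqoop in your environment, so treat it as a starting point rather than a tested recipe.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file,
                       source_table, hive_table, num_mappers=4):
    """Return the argument list for a Sqoop import that lands a table in Hive.

    All values are supplied by the caller; nothing here contacts a cluster.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source RDBMS
        "--username", username,
        "--password-file", password_file,   # file that holds the password
        "--table", source_table,            # source table in the RDBMS
        "--hive-import",                    # load the result into Hive
        "--hive-table", hive_table,         # target Hive table
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:teradata://td-host.example.com/DATABASE=sales",
        username="etl_user",
        password_file="/user/etl_user/.td_password",
        source_table="ORDERS",
        hive_table="staging.orders",
    )
    # Print a shell-safe command line that can be reviewed before running.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Once a table has been created through Hive this way, Impala typically needs an INVALIDATE METADATA on that table before it can see it.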
04-12-2017 08:55 AM
Sorry for the delay. The bad news is that they won't support either Sqoop or Drill. The Hadoop flavour we have is Apache; it completely slipped my mind.
My question is: if we install the Impala drivers on my Integration Services box <AppBox>, can it connect to Apache Hadoop so that data ingestion can be achieved?