Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

Solved Go to solution

I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

New Contributor

I have 5 tables belonging to different schemas but have common fields. I need to perform a complex join to fetch data for analysis. I want to store the results in HIVE. Can I do this using sqoop?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

To add the @Artem Ervits's solution - consider saving your complex query in the source RDBMS as a view. Then just call Sqoop specifying the view name instead of a table name. Makes the code nice and clean.

4 REPLIES 4

Re: I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

Mentor

Have you looked at free-form query option in sqoop?

https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_free_form_query_imports

I would suggest to save result of a complex query in a database and sqoop that as one dataset.

There is however a note on free-form queries.

Note
The facility of using free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no OR conditions in the WHERE clause. Use of complex queries such as queries that have sub-queries or joins leading to ambiguous projections can lead to unexpected results.

Re: I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

New Contributor

Hi @Artem Ervits. Thanks for the reply. I had read on free from query imports but decided against it because we cannot use ANDs and ORs. I will however try and see what happens. I cant work on the second option you and @bpreachuk suggested as i dont have access to create tables/views in the RDBMS. So the way i'll go is try using the free form query. If that doesnt work i'll import the tables as is from the RDBMS and run complex queries using HIVE. I'll update you on how it goes.

Re: I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

To add the @Artem Ervits's solution - consider saving your complex query in the source RDBMS as a view. Then just call Sqoop specifying the view name instead of a table name. Makes the code nice and clean.

Highlighted

Re: I have multiple tables which i need to join and fetch results into HIVE. Can I do this using Sqoop?

New Contributor

I was able to import all tables in the following format:

sqoop import -connect jdbc:oracle:thin:@<fdqn>/<server> -username <username> -P -table CUST_NAV -columns "<column names separated by commas" -hive-import -hive-table databasenameinhive.New_CUST_NAV -target-dir 'location in hdfs'

@bpreachuk I understood the workaround in my problem using your suggestion. I'll import all tables as is from the oracle db and create different views which i can then use in my select statements.

Thanks guys.

Don't have an account?
Coming from Hortonworks? Activate your account here