Facing issues while ingesting data into Hive

Hi,

We are ingesting data from SQL Server (OLTP) into Hive. Sqoop's incremental import is not picking up updated rows when the source is SQL Server.

It's still an open issue: https://issues.apache.org/jira/browse/SQOOP-2421

Is there any workaround to get the updated data into the Hive tables?

Thanks

Re: Facing issues while ingesting data into Hive

You can just use a manual filter (a WHERE condition) in Sqoop. For example, run Sqoop hourly from Oozie and fetch the past hour of data.

Re: Facing issues while ingesting data into Hive

You suggested using the Sqoop saved job in https://community.hortonworks.com/questions/13132/best-practice-to-import-data-from-sql-server-to-hi....

If I use the WHERE condition, I have to supply the value in the WHERE clause manually. I have to create 100 tables for each client, so it would be a nightmare to run each job by hand and supply the value every time.

Is there a friendlier way to do this?

Thanks

Re: Facing issues while ingesting data into Hive

What I meant with the WHERE condition is that you could run an Oozie job that runs daily and adds the filter parameters automatically.
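
To make that concrete, here is a rough sketch of generating the filter per run instead of typing it in. It is only an illustration: the `last_modified` column, the connection string, the user, the password file, and the staging path are all made-up placeholders, and it assumes every source table has some timestamp column that is updated on each change. The scheduler would call something like this once per table, so 100 tables is just a loop.

```python
from datetime import datetime, timedelta

def hourly_window(now):
    """Return (start, end) for the last full hour before `now`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return start, end

def sqoop_import_args(table, start, end, column="last_modified"):
    """Build a `sqoop import` argument list with a half-open time filter.

    `column` is a hypothetical modification-timestamp column; the JDBC URL,
    user, password file, and target directory are placeholders.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    where = (
        f"{column} >= '{start.strftime(fmt)}' "
        f"AND {column} < '{end.strftime(fmt)}'"
    )
    return [
        "sqoop", "import",
        "--connect", "jdbc:sqlserver://dbhost:1433;databaseName=sales",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqlserver.password",
        "--table", table,
        "--where", where,
        "--target-dir", f"/staging/{table}/{start.strftime('%Y%m%d%H')}",
        "--num-mappers", "1",
    ]

if __name__ == "__main__":
    start, end = hourly_window(datetime.now())
    # One command per table; the table names here are examples only.
    for table in ["orders", "customers"]:
        print(" ".join(sqoop_import_args(table, start, end)))
```

Each run lands in its own staging directory, so merging the changed rows into the final Hive table is a separate step that is not shown here.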

Re: Facing issues while ingesting data into Hive

Hi Benjamin,

Since Sqoop's incremental mode is not working against SQL Server, the Sqoop query clause is the only option left. If there are no changes to the historical data in SQL Server, we only need to run the query for the latest date. If historical data has changed, the query also needs to run for each historical date that has changes. So the date values in the WHERE clause of the query differ from run to run, depending on which historical data changed.

Can Oozie variables or parameters handle this situation?
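
For illustration, the date handling described above could look roughly like the sketch below. It assumes the list of dates that need reloading is already known (how to detect changed historical dates is a separate problem and not covered), and `txn_date` is a hypothetical column name. It merges consecutive dates into ranges and builds the WHERE fragment that would be passed to the Sqoop query.

```python
from datetime import date, timedelta

def collapse_dates(dates):
    """Merge dates into sorted, inclusive (start, end) runs of consecutive days."""
    runs = []
    for d in sorted(set(dates)):
        if runs and d - runs[-1][1] == timedelta(days=1):
            # Extends the current run by one day.
            runs[-1] = (runs[-1][0], d)
        else:
            runs.append((d, d))
    return runs

def where_for_dates(dates, column="txn_date"):
    """Build a WHERE fragment selecting rows on any of the given dates.

    `column` is a hypothetical date/datetime column in the source table.
    Each run becomes a half-open range so datetime values are covered too.
    """
    parts = []
    for start, end in collapse_dates(dates):
        upper = end + timedelta(days=1)
        parts.append(
            f"({column} >= '{start.isoformat()}' "
            f"AND {column} < '{upper.isoformat()}')"
        )
    # With no dates to reload, return a filter that matches nothing.
    return " OR ".join(parts) if parts else "1 = 0"

if __name__ == "__main__":
    # Example: yesterday's load plus two historical dates that changed.
    changed = [date(2016, 1, 10), date(2016, 1, 2), date(2016, 1, 1)]
    print(where_for_dates(changed))
```

The resulting string can be handed to the job as an ordinary parameter, so the scheduler itself only needs to pass one value through.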

Re: Facing issues while ingesting data into Hive

@Nirvana India

Can this help you?

Re: Facing issues while ingesting data into Hive

@Geoffrey Shelton Okot

Exporting and importing data works fine. The issue is with incremental-mode import against SQL Server.

It's still an open issue: https://issues.apache.org/jira/browse/SQOOP-2421