09-09-2017 03:56 PM
I have a table CustomerQueries in SQL Server with these columns:
CUSTOMER, THRESHOLD, SQL_QUERY, OUTPUT_FROM_HIVE, HIVE_STATUS
The primary key is a composite key: CUSTOMER + THRESHOLD.
The table may hold a couple of hundred records; OUTPUT_FROM_HIVE and HIVE_STATUS are updated later.
My requirement is to run the queries stored in SQL_QUERY against Hive, then update the OUTPUT_FROM_HIVE and HIVE_STATUS columns of this table back in SQL Server.
The result set from Hive is only a few records (Max 10 records).
Could you please suggest any approach for this?
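One common approach is a small driver that reads each pending row, submits its SQL_QUERY to Hive, and writes the result and status back keyed on the composite primary key. A minimal sketch of that loop, where sqlite3 stands in for the real SQL Server connection (pyodbc or JDBC in practice) and run_in_hive is a stub for however the query would actually be submitted (beeline, PyHive, etc.) — both are assumptions, not details from this thread:

```python
import sqlite3

def run_in_hive(query):
    # Placeholder: in reality this would submit `query` to Hive and
    # return the small result set (max ~10 rows) as a string.
    return "stub-output for: " + query

# In-memory stand-in for the SQL Server table described above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CustomerQueries (
    CUSTOMER TEXT, THRESHOLD INTEGER, SQL_QUERY TEXT,
    OUTPUT_FROM_HIVE TEXT, HIVE_STATUS TEXT,
    PRIMARY KEY (CUSTOMER, THRESHOLD))""")
conn.execute("INSERT INTO CustomerQueries VALUES (?,?,?,?,?)",
             ("acme", 5, "SELECT COUNT(*) FROM orders", None, None))

# Read every pending query, run it, and write the output and status
# back using the composite key (CUSTOMER, THRESHOLD).
rows = conn.execute(
    "SELECT CUSTOMER, THRESHOLD, SQL_QUERY FROM CustomerQueries "
    "WHERE HIVE_STATUS IS NULL").fetchall()
for customer, threshold, query in rows:
    try:
        output, status = run_in_hive(query), "SUCCESS"
    except Exception as exc:
        output, status = str(exc), "FAILED"
    conn.execute(
        "UPDATE CustomerQueries SET OUTPUT_FROM_HIVE=?, HIVE_STATUS=? "
        "WHERE CUSTOMER=? AND THRESHOLD=?",
        (output, status, customer, threshold))
conn.commit()
```

Since the table only holds a couple of hundred rows, a simple sequential loop like this is usually fine; the real work is in wiring run_in_hive to your cluster.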
09-10-2017 06:58 AM
Thanks for your response.
I can sqoop-import the table into Hive. But then I am not sure how to read that table, execute the queries stored in it, and store the output of the Hive queries. I also need to set this up in an Oozie workflow.
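One way to wire this into Oozie is a shell action that runs a driver script doing the read-execute-update work. A minimal workflow.xml sketch; the script name run_queries.sh and the ${jobTracker}/${nameNode}/${wfPath} properties are placeholders, not details from this thread:

```xml
<workflow-app name="customer-queries-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="run-queries"/>
    <action name="run-queries">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>run_queries.sh</exec>
            <file>${wfPath}/run_queries.sh#run_queries.sh</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive query run failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The same workflow could alternatively chain a Hive action (to produce the output files) and a sqoop action (to export them) if you prefer not to drive everything from one script.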
09-10-2017 07:18 AM
If you need to export a complete Hive table, use this command:
sqoop export --connect 'RDBMS jdbc connection' --username 'rdbms user' --password 'rdbms password' --table 'rdbms table' --hcatalog-table 'hive table'
Otherwise, fetch the data file from Beeline using your query, put it on HDFS, and then do a sqoop export from that file:
sqoop export --connect 'jdbc connection' --username ur_user --password ur_password --table your_table_name --export-dir 'Your HDFS Data Directory' --input-fields-terminated-by 'your csv terminator'
There must be a way to import this file into the SQL server table as well.
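The intermediate-file step above can be sketched as follows: build a delimited text file whose column order matches the target SQL Server table and whose field terminator matches --input-fields-terminated-by on the sqoop command line. The sample rows and the comma terminator here are illustrative assumptions:

```python
import csv
import io

# Rows as they might come back from Beeline: one (CUSTOMER, THRESHOLD,
# OUTPUT_FROM_HIVE, HIVE_STATUS) tuple per query. Sample data is made up.
rows = [
    ("acme", 5, "42", "SUCCESS"),
    ("globex", 9, "7", "SUCCESS"),
]

# sqoop export reads plain delimited text; the delimiter here (',')
# must match --input-fields-terminated-by ',' on the sqoop command.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", lineterminator="\n")
writer.writerows(rows)
data = buf.getvalue()
print(data)
```

In practice you would write `data` to a local file, `hdfs dfs -put` it into the export directory, and then run the sqoop export command shown above against that directory.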