Created on 09-21-2017 11:57 PM - edited 08-18-2019 12:14 AM
Hi
I have a .NET application and am exploring ways to upload data from SQL Server to Hive. I understand that my options include Sqoop, NiFi, and SSIS, but I wanted to explore what is possible with the Hive ODBC driver.
My C# code goes something like this:
string hiveConnString = @"DRIVER={Hortonworks Hive ODBC Driver}; Host=192.168.137.129; Port=10000; Schema=default; HiveServerType=hiveserver2;";
var conn = new OdbcConnection(hiveConnString);
OdbcCommand selectCommand = null;
OdbcDataAdapter adapter = null;
conn.Open();
adapter = new OdbcDataAdapter();
selectCommand = new OdbcCommand();
selectCommand.Connection = conn;
selectCommand.CommandText = ".........."; // The INSERT QUERY
int result = selectCommand.ExecuteNonQuery();
I am able to perform SELECT statements and retrieve data. I was also able to create a table in Hive using ExecuteNonQuery.
However, when I try to run an INSERT statement, the ExecuteNonQuery method never returns. Initially I had 190 rows in the source table, and I waited 20 minutes for the method to return. Then I tried with just ONE row and got the same result.
I looked at Hive through Ambari, and the queries were running. I am attaching an image of the running queries. You can see 4 queries running. The Query column shows the INSERT queries that I submitted. The Application ID column has no ID (it says NOT Available).
I have also attached an image of one query details.
I cannot understand why the ODBC driver is not able to execute the statement. The table has a BINARY datatype column (you can see System.Byte[] as the value in the query). Could that be the issue? I would have assumed that a datatype mismatch would return an error.
Also, I do not see any way to kill the queries.
Any help is much appreciated.
regards
Vinay
Created 09-22-2017 03:55 AM
Can you please check the status in YARN and see whether those jobs are running?
SELECT * and creating Hive tables don't need YARN.
Created 09-22-2017 10:24 PM
@nyakkanti Thanks for your response. I am not sure what you are suggesting. My issue was not with SELECT * or CREATE TABLE; those worked fine. The issue is with the INSERT statements, which are still running. The images I attached show the running queries. I don't see anything useful in the YARN service summary or the YARN Queue Manager. Please elaborate on where to check for the jobs that are running those queries.
Created 09-23-2017 04:21 AM
Can you please check the YARN ResourceManager (RM) UI pointed out in the tutorial and see whether those jobs are running on YARN?
Created 09-23-2017 09:11 AM
I do not see any pending jobs in YARN. Only 3 jobs are listed, and their status is finished/succeeded.
In Ambari, I can still see the 4 queries running as mentioned in the original post.
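A note on the System.Byte[] value mentioned in the original post: in C#, concatenating a byte[] into a string calls ToString() on the array, which yields the literal text "System.Byte[]" rather than the bytes themselves. The sketch below shows that effect and one possible way to render the bytes as text for a Hive statement, using Hive's built-in unhex() function. The class and method names (HiveBinaryLiteral, ToUnhexExpression) are hypothetical, and the thread does not establish that this is what causes the INSERT to hang, so treat it as something to rule out rather than a confirmed fix.

```csharp
using System;

public static class HiveBinaryLiteral
{
    // Hypothetical helper (not part of any driver API): renders a byte array
    // as a Hive expression instead of relying on byte[].ToString().
    public static string ToUnhexExpression(byte[] data)
    {
        if (data == null)
        {
            return "NULL";
        }

        // BitConverter.ToString produces "DE-AD-BE-EF"; remove the dashes
        // so that Hive's unhex() receives a plain hex string.
        string hex = BitConverter.ToString(data).Replace("-", "");
        return "unhex('" + hex + "')";
    }

    public static void Main()
    {
        byte[] payload = { 0xDE, 0xAD, 0xBE, 0xEF };

        // Naive concatenation embeds the type name, not the data.
        Console.WriteLine("naive: " + payload);

        // Hex-encoded form that Hive can turn back into BINARY.
        Console.WriteLine("hex:   " + ToUnhexExpression(payload));
    }
}
```

Whether the resulting expression is accepted inside a given INSERT statement depends on the Hive version and statement form in use, so it is worth trying with a single row first.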