Created 03-30-2016 06:04 AM
Hi:
I used Oozie to run a Sqoop action that imports data from MySQL into Hive, but the job was killed.
The error log says:
----------------------------------------------
16313 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 1.0645 KB in 13.3807 seconds (81.4604 bytes/sec)
2016-03-30 10:54:21,743 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(184)) - Transferred 1.0645 KB in 13.3807 seconds (81.4604 bytes/sec)
16315 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 51 records.
2016-03-30 10:54:21,745 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(186)) - Retrieved 51 records.
16323 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `area` AS t LIMIT 1
2016-03-30 10:54:21,753 INFO [main] manager.SqlManager (SqlManager.java:execute(757)) - Executing SQL statement: SELECT t.* FROM `area` AS t LIMIT 1
16328 [main] INFO org.apache.sqoop.hive.HiveImport - Loading uploaded data into Hive
2016-03-30 10:54:21,758 INFO [main] hive.HiveImport (HiveImport.java:importTable(195)) - Loading uploaded data into Hive
Intercepting System.exit(1)
-----------------------------------------------
I searched for this error on the internet; someone said "the hive-site.xml is missing, not referenced in workflow.xml, or not correctly configured."
I have uploaded hive-site.xml to HDFS under /tmp/
and added <file>/tmp/hive-site.xml#hive-site.xml</file> to the action,
but the error still exists.
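Roughly, this is how I pushed the file to HDFS and checked it (just a sketch of the commands, paths as above):
------------------------------------------
# upload hive-site.xml to HDFS (overwrite if it already exists)
hdfs dfs -put -f hive-site.xml /tmp/hive-site.xml
# confirm it is there
hdfs dfs -ls /tmp/hive-site.xml
------------------------------------------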
So what can I do next?
I need someone to help me!
---------------------------------------
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-wf">
    <start to="sqoop-node"/>
    <action name="sqoop-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/area"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <command>import --connect jdbc:mysql://cluster1.new:3306/crmdemo --username root --password xxxxx --table area --hive-import --hive-table default.area</command>
            <file>/tmp/hive-site.xml#hive-site.xml</file>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
---------------------------------------hive-site.xml
Created 03-30-2016 09:02 AM
Is your cluster kerberized?
Created 03-31-2016 01:22 AM
No.
Created 03-30-2016 03:06 PM
Have you verified that the Sqoop command works by itself? Run it manually on the command line outside of Oozie.
Are you running your workflow through Hue or via the Oozie command line? If through Hue, try running it on the Oozie command line to verify it works as well.
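For example, something like this, based on the command in your workflow (just a sketch to run on a gateway node):
------------------------------------------
# run the same import manually, outside of Oozie
sqoop import \
  --connect jdbc:mysql://cluster1.new:3306/crmdemo \
  --username root \
  --password xxxxx \
  --table area \
  --hive-import \
  --hive-table default.area
------------------------------------------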
Created 03-31-2016 01:25 AM
I can run the import successfully through Sqoop directly.
I ran Oozie on the command line:
"oozie job -oozie http://xxxxxx:11000/oozie -config ./job.properties -run"
Created 03-31-2016 02:39 AM
Hi, I got it working!
The cause was that my hive-site.xml contained extra ("dirty") configuration that cannot be shipped along with the workflow.
So only the basic config is needed:
------------------------------------------
<property>
    <name>ambari.hive.db.schema.name</name>
    <value>hive</value>
</property>
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://cluster2.new:9083</value>
</property>
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/apps/hive/warehouse</value>
</property>
<property>
    <name>hive.zookeeper.quorum</name>
    <value>cluster2.new:2181,cluster3.new:2181,cluster1.new:2181,cluster4.new:2181,cluster5.new:2181</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://cluster1.new/hive?createDatabaseIfNotExist=true</value>
</property>
------------------------------------------
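After trimming the file, I re-uploaded it and re-ran the workflow, roughly like this (sketch, same paths as above):
------------------------------------------
# replace the old hive-site.xml in HDFS with the trimmed one
hdfs dfs -put -f hive-site.xml /tmp/hive-site.xml
# re-submit the workflow
oozie job -oozie http://xxxxxx:11000/oozie -config ./job.properties -run
------------------------------------------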
Created 03-31-2016 03:01 AM
The hive-site.xml, workflow.xml, and job.properties files should all be copied to the same application (deployment) folder in HDFS. Supporting JAR files should be placed in a “lib” sub-directory under that same application folder. Looking at your notes above, it is not clear that this is the case. Can you verify?
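For example, the layout could look like this (a sketch only; the application path and the JDBC driver JAR name are placeholders, not values from your setup):
------------------------------------------
# create the application folder and its lib sub-directory in HDFS
hdfs dfs -mkdir -p /user/<user>/apps/sqoop-wf/lib
# copy the workflow, hive-site.xml, and job.properties into the application folder
hdfs dfs -put workflow.xml hive-site.xml job.properties /user/<user>/apps/sqoop-wf/
# supporting JARs (e.g. the MySQL JDBC driver) go into lib/
hdfs dfs -put mysql-connector-java.jar /user/<user>/apps/sqoop-wf/lib/
------------------------------------------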