08-15-2016
12:28 PM
Hi all, I am using the HDP Sandbox 2.3.4. I have created an Oozie job that submits a Spark job on yarn-cluster (--master yarn-cluster). My workflow.xml looks like this:
<workflow-app name="sample" xmlns="uri:oozie:workflow:0.1">
<start to="spark-action" />
<action name="spark-action">
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>${master}</master>
<name>${csvProcessingJobName}</name>
<class>${csvProcessingJobClass}</class>
<jar>${jarName}</jar>
<arg>${csvProcessingArg1}</arg>
<arg>${csvProcessingArg2}</arg>
<arg>${csvProcessingArg3}</arg>
<arg>${csvProcessingArg4}</arg>
</spark>
<ok to="end" />
<error to="end" />
</action>
<end name = "end" />
</workflow-app>
My job.properties:
############ GENERAL HDFS AND ORACLE DB CONNECTION PROPERTIES ############
jobTracker=sandbox.hortonworks.com:8050
nameNode=hdfs://sandbox.hortonworks.com:8020
############ BUNDLE PROPERTIES ############
bundleAppName=bundle
bundleKickOffTime=2016-05-04T07:00Z
oozie.bundle.application.path=${nameNode}/user/root/oozie/config/spark/bundle.xml
oozie.use.system.libpath=true
############ COORDINATOR PROPERTIES ############
coordinatorAppPath=${nameNode}/user/root/oozie/config/spark/coordinator.xml
coordinatorAppName=csv-processing-coordinator
coordinatorStartTime=2016-05-04T01:00Z
coordinatorEndTime=2016-05-05T01:00Z
coordinatorFrequency=1440
coordinatorTimeZone=UTC
############ WORKFLOW PROPERTIES ############
workflowAppPath=${nameNode}/user/root/oozie/config/spark/workflow.xml
workflowAppName=rdbms-to-hadoop-workflow
master=yarn-cluster
jarName=${nameNode}/user/root/oozie/config/spark/processor.jar
csvProcessingJobName=processing-job
csvProcessingJobClass=com.job.SparkJob
csvProcessingArg1=${nameNode}/user/root/abc.csv
csvProcessingArg2=tableName
csvProcessingArg3=/apps/hive/warehouse/tableName
csvProcessingArg4=parquet
But I am getting the error below in my MapReduce job (checking the log in the Job History UI), and I am not sure what the cause is:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Call From sandbox.hortonworks.com/192.168.0.105 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From sandbox.hortonworks.com/192.168.0.105 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1431)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:206)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:501)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:129)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:129)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:62)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:128)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1065)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1125)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:612)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:710)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
... 45 more
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://sandbox.hortonworks.com:8020/user/root/oozie-oozi/0000002-160815115226550-oozie-oozi-W/csv-processing-spark-action--spark/action-data.seq
Oozie Launcher ends
Any idea? Thanks.
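One detail stands out in the trace: the launcher tries to reach the ResourceManager at 0.0.0.0:8032 rather than the sandbox.hortonworks.com:8050 address used elsewhere in the workflow, which suggests the Spark YARN client inside the Oozie launcher is not picking up the cluster's yarn-site.xml. As a hedged sketch only, not a confirmed fix, the ResourceManager address can be passed to the Spark action explicitly through <spark-opts>; the spark.hadoop.* prefix forwards a Hadoop property to the YARN client:

<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>${master}</master>
    <name>${csvProcessingJobName}</name>
    <class>${csvProcessingJobClass}</class>
    <jar>${jarName}</jar>
    <!-- Sketch: point the Spark YARN client at the same ResourceManager the workflow already uses -->
    <spark-opts>--conf spark.hadoop.yarn.resourcemanager.address=sandbox.hortonworks.com:8050</spark-opts>
    <arg>${csvProcessingArg1}</arg>
    <arg>${csvProcessingArg2}</arg>
    <arg>${csvProcessingArg3}</arg>
    <arg>${csvProcessingArg4}</arg>
</spark>

An alternative along the same lines is to add a <job-xml> element pointing at a copy of the cluster's yarn-site.xml, so the launcher's configuration carries the correct yarn.resourcemanager.address.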
Labels:
- Apache Oozie
- Apache Spark
- Apache YARN
05-29-2016
01:02 PM
Thanks, that worked.
05-29-2016
01:02 PM
Thanks, that worked.
05-26-2016
08:41 AM
Hi all,
I am trying to do a Sqoop import with a WHERE condition / free-form query through Oozie, and it is failing.
My Oozie action looks like this:
<action name="sqoop-action">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${deleteHdfsPath}"/>
</prepare>
<configuration>
<property>
<name>oozie.hive.defaults</name>
<value>/usr/hdp/current/hive-client/conf/hive-site.xml</value>
</property>
</configuration>
<command>${command}</command>
</sqoop>
<ok to="ok" />
<error to="kill" />
</action>
In the properties file, the command argument looks like this:
import --connect jdbc:mysql://<>:3306/test --username hive --password hive --query "SELECT * FROM _table WHERE \$CONDITIONS AND _id > 0 AND _id <= 1000000" --split-by _id --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table
I have also tried the following:
import --connect jdbc:mysql://ip-172-31-5-150.ec2.internal:3306/test --username hive --password hive --table _table --where "_id > 0 AND _id <= 1000000" --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table
I get the error below either way; it says "Unrecognized argument".
Sqoop command arguments :
import
--connect
jdbc:mysql://<>:3306/test
--username
hive
--password
hive
--query
SELECT
*
FROM
_table
WHERE
$CONDITIONS
AND
_id
>
0
AND
_id
<=
1000000
--split-by
_id
--fields-terminated-by
|
--target-dir
/apps/hive/warehouse/hive_table
Fetching child yarn jobs
tag id : oozie-fcc0762084b58ea3c408ef7887cc26a7
2016-05-25 23:58:33,814 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /172.31.5.150:8050
Child yarn jobs are found -
=================================================================
>>> Invoking Sqoop command line now >>>
3601 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2016-05-25 23:58:34,126 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3642 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.4.0.0-169
2016-05-25 23:58:34,167 INFO [main] sqoop.Sqoop (Sqoop.java:<init>(97)) - Running Sqoop version: 1.4.6.2.4.0.0-169
3667 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
2016-05-25 23:58:34,192 WARN [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1026)) - Setting your password on the command-line is insecure. Consider using -P instead.
3668 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
2016-05-25 23:58:34,193 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(304)) - Error parsing arguments for import:
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: *
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: FROM
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: FROM
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _product
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _product
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: WHERE
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: WHERE
3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS
2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: $CONDITIONS
3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: AND
2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: >
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: >
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 0
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 1000000
3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: AND
2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND
3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id
2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: <=
2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: <=
3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1000000
2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 2000000
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --split-by
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --split-by
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --fields-terminated-by
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --fields-terminated-by
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: |
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: |
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --target-dir
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --target-dir
3679 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: /apps/hive/warehouse/n_product_2
2016-05-25 23:58:34,204 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: /apps/hive/warehouse/n_product_2
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
The Sqoop command works fine in both forms when run from the shell, but it fails when run through Oozie. A simple Sqoop action (a whole-table fetch without a query) also works fine with Oozie; somehow only the free-form query does not. Let me know if I am missing something.
Thanks in advance,
Ankit
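The "Unrecognized argument" list above mirrors the query split on whitespace, which points at a known limitation of the <command> element in the Oozie Sqoop action: the command string is tokenized on spaces before it reaches Sqoop, so a quoted --query (or --where) clause arrives as many separate arguments. As a hedged sketch, reusing the connection details from the second attempt above, switching from <command> to <arg> elements keeps each argument, including the query, as a single token:

<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <prepare>
        <delete path="${deleteHdfsPath}"/>
    </prepare>
    <arg>import</arg>
    <arg>--connect</arg>
    <arg>jdbc:mysql://ip-172-31-5-150.ec2.internal:3306/test</arg>
    <arg>--username</arg>
    <arg>hive</arg>
    <arg>--password</arg>
    <arg>hive</arg>
    <arg>--query</arg>
    <arg>SELECT * FROM _table WHERE $CONDITIONS AND _id &gt; 0 AND _id &lt;= 1000000</arg>
    <arg>--split-by</arg>
    <arg>_id</arg>
    <arg>--fields-terminated-by</arg>
    <arg>|</arg>
    <arg>--target-dir</arg>
    <arg>/apps/hive/warehouse/hive_table</arg>
</sqoop>

A few notes on the sketch: the shell escapes (\$CONDITIONS, \|) are no longer needed because no shell is involved, the < in the <= comparison has to be XML-escaped as &lt; (with &gt; used for > as well), and bare $CONDITIONS is not an Oozie EL expression, so it passes through to Sqoop unchanged. The 0.2 Sqoop action schema accepts either a single <command> or a list of <arg> elements, so the <command>${command}</command> line is dropped in this form.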
Labels:
- Apache Oozie
- Apache Sqoop
05-09-2016
06:01 PM
1 Kudo
Hi,
I am able to run the Sqoop command successfully from the shell, but not through Oozie. I have read several questions on Hortonworks that say the MySQL JDBC jar needs to be added to oozie/lib; I have added that too, but still no success. The action fails with:
Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
My workflow.xml looks like this:
<action name="sqoop-action">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${deleteHdfsPath}"/>
</prepare>
<configuration>
<property>
<name>oozie.hive.defaults</name>
<value>/usr/hdp/current/hive-client/conf/hive-site.xml</value>
</property>
</configuration>
<command>${command}</command>
<archive>mysql-connector-java.jar#mysql-connector-java.jar</archive>
</sqoop>
<ok to="hive-action" />
<error to="kill" />
</action>
I am attaching my log file (oozielog.zip) as well.
Can anybody help me?
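Since the launcher fails with exit code [1] before Sqoop produces any useful output, one common cause is that the JDBC driver is not visible to the launcher container that actually runs Sqoop on YARN; a jar copied into the Oozie server's local lib directory does not end up there. As a hedged sketch, not a confirmed fix, one option is to ship the connector with the workflow itself, either by placing mysql-connector-java.jar in a lib/ directory next to workflow.xml in HDFS (or in the Oozie sharelib's sqoop directory), or by referencing it with a <file> element rather than <archive>, since the driver is a plain jar and not an archive to be unpacked:

<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <prepare>
        <delete path="${deleteHdfsPath}"/>
    </prepare>
    <command>${command}</command>
    <!-- Sketch: ship the JDBC driver as a file; the relative path assumes the jar sits next to workflow.xml in HDFS -->
    <file>mysql-connector-java.jar#mysql-connector-java.jar</file>
</sqoop>

If it still exits with code [1], the stderr of the launcher's YARN container usually shows the underlying Sqoop error.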
Labels:
- Apache Oozie
- Apache Sqoop