Member since: 11-30-2014
Posts: 3
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 24585 | 12-09-2014 12:54 AM |
12-09-2014
12:54 AM
Hello all, I solved the issue by increasing the number of cores and adding one more DataNode and NodeManager. Thanks.
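(For anyone hitting the same symptom later: a minimal sketch of the kind of YARN-side change this implies, assuming the per-node container capacity was the bottleneck. These are the standard yarn-site.xml properties for NodeManager resources — in Cloudera Manager they correspond to the NodeManager container memory and vcore settings — and the values below are purely illustrative, not the ones actually used.

<!-- yarn-site.xml, per NodeManager: raise what YARN may hand out as containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>   <!-- illustrative value -->
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>      <!-- illustrative value -->
</property>

With a second DataNode/NodeManager added, the Oozie launcher job and the Sqoop child job no longer have to squeeze onto a single node's containers.)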
12-01-2014
02:32 AM
Configuration of yarn and hdfs:

yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>true</value>
</property>
<property>
<name>yarn.admin.acl</name>
<value>*</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cldx-1414-1259:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cldx-1414-1259:8033</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cldx-1414-1259:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cldx-1414-1259:8031</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cldx-1414-1259:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>cldx-1414-1259:8090</value>
</property>
<property>
<name>yarn.resourcemanager.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.resourcemanager.admin.client.thread-count</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.increment-allocation-mb</name>
<value>512</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2475</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.increment-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>4</value>
</property>
<property>
<name>yarn.resourcemanager.amliveliness-monitor.interval-ms</name>
<value>1000</value>
</property>
<property>
<name>yarn.am.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>2</value>
</property>
<property>
<name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
<value>1000</value>
</property>
<property>
<name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
<value>600000</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
<value>50</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.user-as-default-queue</name>
<value>true</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.fair.sizebasedweight</name>
<value>false</value>
</property>
<property>
<name>yarn.scheduler.fair.assignmultiple</name>
<value>false</value>
</property>
<property>
<name>yarn.resourcemanager.max-completed-applications</name>
<value>10000</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value></value>
</property>
</configuration>

mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>mapreduce.job.split.metainfo.maxsize</name>
<value>10000000</value>
</property>
<property>
<name>mapreduce.job.counters.max</name>
<value>120</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>false</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.type</name>
<value>BLOCK</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>zlib.compress.level</name>
<value>DEFAULT_COMPRESSION</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>64</value>
</property>
<property>
<name>mapreduce.map.sort.spill.percent</name>
<value>0.8</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>10</value>
</property>
<property>
<name>mapreduce.task.timeout</name>
<value>600000</value>
</property>
<property>
<name>mapreduce.client.submit.file.replication</name>
<value>1</value>
</property>
<property>
<name>mapreduce.job.reduces</name>
<value>2</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>256</value>
</property>
<property>
<name>mapreduce.map.speculative</name>
<value>false</value>
</property>
<property>
<name>mapreduce.reduce.speculative</name>
<value>false</value>
</property>
<property>
<name>mapreduce.job.reduce.slowstart.completedmaps</name>
<value>0.8</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cldx-1414-1259:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cldx-1414-1259:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.https.address</name>
<value>cldx-1414-1259:19890</value>
</property>
<property>
<name>mapreduce.jobhistory.admin.address</name>
<value>cldx-1414-1259:10033</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
<value>1</value>
</property>
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Djava.net.preferIPv4Stack=true -Xmx825955249</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.map.cpu.vcores</name>
<value>1</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.reduce.cpu.vcores</name>
<value>1</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH</value>
</property>
<property>
<name>mapreduce.admin.user.env</name>
<value>LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native:$JAVA_LIBRARY_PATH</value>
</property>
<property>
<name>mapreduce.shuffle.max.connections</name>
<value>80</value>
</property>
</configuration>

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///dfs/nn</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address</name>
<value>cldx-1414-1259:8022</value>
</property>
<property>
<name>dfs.https.address</name>
<value>cldx-1414-1259:50470</value>
</property>
<property>
<name>dfs.https.port</name>
<value>50470</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>cldx-1414-1259:50070</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>fs.permissions.umask-mode</name>
<value>022</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>false</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hdfs-sockets/dn</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.skip.checksum</name>
<value>false</value>
</property>
<property>
<name>dfs.client.domain.socket.data.traffic</name>
<value>false</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
</configuration>

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cldx-1414-1259:8020</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>simple</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>false</value>
</property>
<property>
<name>hadoop.rpc.protection</name>
<value>authentication</value>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>DEFAULT</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.mapred.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.mapred.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.flume.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.flume.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.HTTP.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.HTTP.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfs.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.security.group.mapping</name>
<value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
<property>
<name>hadoop.security.instrumentation.requires.admin</name>
<value>false</value>
</property>
<property>
<name>net.topology.script.file.name</name>
<value>/etc/hadoop/conf.cloudera.yarn/topology.py</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>65536</value>
</property>
<property>
<name>hadoop.ssl.enabled</name>
<value>false</value>
</property>
<property>
<name>hadoop.ssl.require.client.cert</name>
<value>false</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.keystores.factory.class</name>
<value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.server.conf</name>
<value>ssl-server.xml</value>
<final>true</final>
</property>
<property>
<name>hadoop.ssl.client.conf</name>
<value>ssl-client.xml</value>
<final>true</final>
</property>
</configuration>

ssl-client.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
</property>
<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
</property>
</configuration>

Cluster disk info:
bash-4.1$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 54G 20G 32G 39% /
tmpfs 7.8G 0 7.8G 0% /dev/shm

Cluster RAM info:
 total used free shared buffers cached
Mem: 15 9 5 0 0 3
-/+ buffers/cache: 5 9
Swap: 3 0 3

Cluster CPU info:
bash-4.1$ lsb_release -a
LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.2 (Santiago)
Release: 6.2
Codename: Santiago
bash-4.1$
bash-4.1$ uname -a
Linux cldx-1414-1259 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

Please let me know what changes have to be made (in the configuration or anywhere else).
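(A rough budget worth checking these files against, offered as an interpretation rather than a measurement: with yarn.app.mapreduce.am.resource.mb = 1024 and mapreduce.map.memory.mb = 1024 as set above, an Oozie-triggered Sqoop import needs roughly four containers at the same time on one node:

Oozie launcher job AM     1024 MB, 1 vcore
Oozie launcher map task   1024 MB, 1 vcore
Sqoop child job AM        1024 MB, 1 vcore
Sqoop child job map task  1024 MB, 1 vcore
--------------------------------------------
Total concurrent          4096 MB, 4 vcores

Whether that fits depends on yarn.nodemanager.resource.memory-mb / yarn.nodemanager.resource.cpu-vcores, which are not shown here; if the single NodeManager offers less than this, the launcher map occupies the node while the child job it submitted waits for a container that can never be granted.)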
12-01-2014
02:28 AM
We have a one-node cluster of CDH 5.2 with the following services running on it: HDFS, Hive, Impala, Oozie, Sqoop 1 Client, YARN (MR2 Included), ZooKeeper.

Tasks to perform:
1. Connect to a MySQL DB and fetch the data into HDFS using Sqoop (independently, from the console).
2. Trigger the above task using Oozie.

We were successful with the first task, i.e. running the Sqoop job independently from the console.

Problem statement: The Sqoop action in Oozie prints "Heart beat" in the log for a long time, and the Oozie job / Sqoop MapReduce job stays in the running state for a long time.

State of the cluster:
1. All services are up and running (each service tested individually).
2. No other application running (no MapReduce job in progress).

But with task 2, no output is generated. The Oozie job stays in the running state, and the YARN application summary shows 1 Apps Pending and 1 Apps Running for a long time.

Approach 1:
-->oozie-sqoop_action.properties:
nameNode=hdfs://cldx-1414-1259:8020
jobTracker=cldx-1414-1259:8032
queueName=default
user.name=hdfs
oozie.use.system.libpath=true
outputDirPath=hdfs://cldx-1414-1259:8020/sqoopTest27Nov_clusterRestored/oozieImport/
sqoop_command=import --connect jdbc:mysql://172.25.38.161/test --username root --password root -m 1 --table CONTACTS --target-dir hdfs://cldx-1414-1259:8020/sqoopTest27Nov_clusterRestored/oozieImport/
oozie.wf.application.path=${nameNode}/user/hdfs/oozie-sqoop_action.xml
-->oozie-sqoop_action.xml:
<workflow-app xmlns='uri:oozie:workflow:0.1' name='Sqoop Action XML'>
<start to='SqoopAction' />
<action name="SqoopAction">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${outputDirPath}"/>
</prepare>
<configuration>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
</configuration>
<command>${sqoop_command}</command>
</sqoop>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Sqoop failed, error
message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>
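(One hedged aside on the sqoop_command defined in the properties above: Sqoop itself warns in these logs that putting --password on the command line is insecure. A sketch of the alternative, assuming a password file has been created first — the /user/hdfs/.mysql-password path is hypothetical:

sqoop_command=import --connect jdbc:mysql://172.25.38.161/test --username root --password-file /user/hdfs/.mysql-password -m 1 --table CONTACTS --target-dir hdfs://cldx-1414-1259:8020/sqoopTest27Nov_clusterRestored/oozieImport/

This does not change the heart-beat behaviour; it only keeps the password out of the job configuration and logs.)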
-->log:
Container log:
2014-11-28 23:55:42,345 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1417108420351_0024_000001
2014-11-28 23:55:45,081 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2014-11-28 23:55:45,090 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-11-28 23:55:45,092 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2014-11-28 23:55:45,099 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2014-11-28 23:55:45,109 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2014-11-28 23:55:45,148 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-11-28 23:55:45,670 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2014-11-28 23:55:45,670 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@3adde4f2)
2014-11-28 23:55:46,136 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 172.25.7.91:8032, Ident: (owner=hdfs, renewer=oozie mr token, realUser=oozie, issueDate=1417199124140, maxDate=1417803924140, sequenceNumber=101, masterKeyId=3)
2014-11-28 23:55:46,773 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2014-11-28 23:55:46,779 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-11-28 23:55:46,781 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2014-11-28 23:55:46,783 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2014-11-28 23:55:46,790 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2014-11-28 23:55:46,815 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-11-28 23:55:50,206 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2014-11-28 23:55:50,211 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
2014-11-28 23:55:50,522 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2014-11-28 23:55:50,532 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2014-11-28 23:55:50,547 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2014-11-28 23:55:50,566 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2014-11-28 23:55:50,573 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2014-11-28 23:55:50,588 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2014-11-28 23:55:50,602 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2014-11-28 23:55:50,615 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2014-11-28 23:55:51,853 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2014-11-28 23:55:55,752 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-11-28 23:55:56,563 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-11-28 23:55:56,569 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2014-11-28 23:55:56,733 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1417108420351_0024 to jobTokenSecretManager
2014-11-28 23:55:58,180 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1417108420351_0024 because: not enabled;
2014-11-28 23:55:58,434 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1417108420351_0024 = 0. Number of splits = 1
2014-11-28 23:55:58,435 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1417108420351_0024 = 0
2014-11-28 23:55:58,436 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1417108420351_0024Job Transitioned from NEW to INITED
2014-11-28 23:55:58,447 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1417108420351_0024.
2014-11-28 23:55:59,470 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2014-11-28 23:55:59,759 INFO [Socket Reader #1 for port 38852] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 38852
2014-11-28 23:56:00,011 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2014-11-28 23:56:00,063 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-11-28 23:56:00,070 INFO [IPC Server listener on 38852] org.apache.hadoop.ipc.Server: IPC Server listener on 38852: starting
2014-11-28 23:56:00,089 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at cldx-1414-1259/172.25.7.91:38852
2014-11-28 23:56:01,011 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-11-28 23:56:01,063 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2014-11-28 23:56:01,243 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2014-11-28 23:56:01,299 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2014-11-28 23:56:01,300 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2014-11-28 23:56:01,341 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2014-11-28 23:56:01,344 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2014-11-28 23:56:01,587 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 56301
2014-11-28 23:56:01,588 INFO [main] org.mortbay.log: jetty-6.1.26
2014-11-28 23:56:02,085 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hadoop-yarn-common-2.5.0-cdh5.2.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_56301_mapreduce____.ai0e56/webapp
2014-11-28 23:56:07,915 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:56301
2014-11-28 23:56:07,923 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 56301
2014-11-28 23:56:10,911 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2014-11-28 23:56:10,928 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2014-11-28 23:56:10,953 INFO [Socket Reader #1 for port 40428] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 40428
2014-11-28 23:56:10,970 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-11-28 23:56:10,980 INFO [IPC Server listener on 40428] org.apache.hadoop.ipc.Server: IPC Server listener on 40428: starting
2014-11-28 23:56:11,055 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2014-11-28 23:56:11,055 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2014-11-28 23:56:11,055 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2014-11-28 23:56:11,311 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; Ignoring.
2014-11-28 23:56:11,322 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2014-11-28 23:56:11,323 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.client.conf; Ignoring.
2014-11-28 23:56:11,324 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.keystores.factory.class; Ignoring.
2014-11-28 23:56:11,326 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.ssl.server.conf; Ignoring.
2014-11-28 23:56:11,339 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2014-11-28 23:56:11,371 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at cldx-1414-1259/172.25.7.91:8030
2014-11-28 23:56:11,715 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:2475, vCores:4>
2014-11-28 23:56:11,715 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.hdfs
2014-11-28 23:56:11,735 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2014-11-28 23:56:11,737 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
2014-11-28 23:56:11,810 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1417108420351_0024Job Transitioned from INITED to SETUP
2014-11-28 23:56:11,831 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2014-11-28 23:56:12,017 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1417108420351_0024Job Transitioned from SETUP to RUNNING
2014-11-28 23:56:12,163 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1417108420351_0024_m_000000 Task Transitioned from NEW to SCHEDULED
2014-11-28 23:56:12,208 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417108420351_0024_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-11-28 23:56:12,236 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2014-11-28 23:56:12,650 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1417108420351_0024, File: hdfs://cldx-1414-1259:8020/user/hdfs/.staging/job_1417108420351_0024/job_1417108420351_0024_1.jhist
2014-11-28 23:56:12,739 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2014-11-28 23:56:12,948 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1417108420351_0024: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:1451, vCores:3> knownNMs=1
2014-11-28 23:56:17,112 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2014-11-28 23:56:17,429 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved cldx-1414-1259 to /default
2014-11-28 23:56:17,446 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1417108420351_0024_01_000002 to attempt_1417108420351_0024_m_000000_0
2014-11-28 23:56:17,466 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2014-11-28 23:56:17,920 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved cldx-1414-1259 to /default
2014-11-28 23:56:17,924 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Job jar is not present. Not adding any jar to the list of resources.
2014-11-28 23:56:18,275 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /user/hdfs/.staging/job_1417108420351_0024/job.xml
2014-11-28 23:56:20,314 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/ST4-4.0.4.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/ST4-4.0.4.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:20,336 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/activation-1.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/activation-1.1.jar This will be an error in Hadoop 2.0
org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/snappy-java-1.0.4.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/snappy-java-1.0.4.1.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,288 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/sqoop-1.4.5-cdh5.2.0.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/sqoop-1.4.5-cdh5.2.0.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,294 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/stringtemplate-3.2.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/stringtemplate-3.2.1.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,297 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/xz-1.0.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/sqoop/xz-1.0.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,300 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/json-simple-1.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/oozie/json-simple-1.1.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,302 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/oozie-hadoop-utils-2.5.0-cdh5.2.0.oozie-4.0.0-cdh5.2.0.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/oozie/oozie-hadoop-utils-2.5.0-cdh5.2.0.oozie-4.0.0-cdh5.2.0.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,305 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/hdfs/lib/oozie-sharelib-oozie-4.0.0-cdh5.2.0.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://cldx-1414-1259:8020/user/oozie/share/lib/lib_20141121170802/oozie/oozie-sharelib-oozie-4.0.0-cdh5.2.0.jar This will be an error in Hadoop 2.0
2014-11-28 23:56:21,306 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #1 tokens and #1 secret keys for NM use for launching container
2014-11-28 23:56:21,306 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 2
2014-11-28 23:56:21,306 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData
2014-11-28 23:56:22,565 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417108420351_0024_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2014-11-28 23:56:22,573 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1417108420351_0024: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:427, vCores:2> knownNMs=1
2014-11-28 23:56:22,588 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1417108420351_0024_01_000002 taskAttempt attempt_1417108420351_0024_m_000000_0
2014-11-28 23:56:22,604 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1417108420351_0024_m_000000_0
2014-11-28 23:56:22,610 INFO [ContainerLauncher #0] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : cldx-1414-1259:8041
2014-11-28 23:56:23,297 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1417108420351_0024_m_000000_0 : 13562
2014-11-28 23:56:23,305 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1417108420351_0024_m_000000_0] using containerId: [container_1417108420351_0024_01_000002 on NM: [cldx-1414-1259:8041]
2014-11-28 23:56:23,314 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417108420351_0024_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2014-11-28 23:56:23,323 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1417108420351_0024_m_000000 Task Transitioned from SCHEDULED to RUNNING
2014-11-28 23:56:28,300 INFO [Socket Reader #1 for port 40428] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1417108420351_0024 (auth:SIMPLE)
2014-11-28 23:56:28,341 INFO [IPC Server handler 0 on 40428] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1417108420351_0024_m_000002 asked for a task
2014-11-28 23:56:28,342 INFO [IPC Server handler 0 on 40428] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1417108420351_0024_m_000002 given task: attempt_1417108420351_0024_m_000000_0
2014-11-28 23:56:38,162 INFO [IPC Server handler 1 on 40428] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1417108420351_0024_m_000000_0 is : 1.0
2014-11-28 23:57:05,823 INFO [IPC Server handler 6 on 40428]
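(A hedged reading of the numbers in this log, which matches the fix that eventually worked: the scheduler's headroom drops from resourcelimit=<memory:1451, vCores:3> to <memory:427, vCores:2> once the launcher's AM (1024 MB) and its map task (1024 MB) are placed — roughly 2475 − 1024 − 1024 = 427 MB left on the single node. The Sqoop job that the launcher map then submits needs its own AM container of yarn.app.mapreduce.am.resource.mb = 1024 MB, which no longer fits, so the child job stays pending while the launcher map keeps waiting on it, and only "Heart beat" lines are printed.)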
Approach 2:
-->oozie-sqoop_action.properties:
nameNode=hdfs://cldx-1414-1259:8020
jobTracker=cldx-1414-1259:8032
user.name=oozie
#oozie.use.system.libpath=true
#import-dirSqoop=${nameNode}/sqoop/oozieImport/
sqoop_command=import --connect jdbc:mysql://172.25.38.161/test --username root --password root -m 1 --table CONTACTS --target-dir hdfs://cldx-1414-1259:8020/sqoopOozieTest/oozieImport/
oozie.wf.application.path=${nameNode}/user/oozie/oozie-sqoop_action.xml
-->oozie-sqoop_action.xml:
same as above
-->log:
(showing 4096 bytes of the log)
ad.bytes=4193404
mapreduce.job.ubertask.maxreduces=1
dfs.image.compress=false
mapreduce.shuffle.ssl.enabled=false
yarn.log-aggregation-enable=false
mapreduce.tasktracker.report.address=127.0.0.1:0
mapreduce.tasktracker.http.threads=40
dfs.stream-buffer-size=4096
tfile.fs.output.buffer.size=262144
fs.permissions.umask-mode=022
dfs.client.datanode-restart.timeout=30
yarn.resourcemanager.am.max-attempts=2
ha.failover-controller.graceful-fence.connection.retries=1
hadoop.proxyuser.hdfs.groups=*
dfs.datanode.drop.cache.behind.writes=false
hadoop.proxyuser.HTTP.hosts=*
hadoop.common.configuration.version=0.23.0
mapreduce.job.ubertask.enable=false
yarn.app.mapreduce.am.resource.cpu-vcores=1
dfs.namenode.replication.work.multiplier.per.iteration=2
mapreduce.job.acl-modify-job=
io.seqfile.local.dir=${hadoop.tmp.dir}/io/local
fs.s3.sleepTimeSeconds=10
mapreduce.client.output.filter=FAILED
------------------------
Sqoop command arguments :
import
--connect
jdbc:mysql://172.25.38.161/test
--username
root
--password
root
-m
1
--table
CONTACTS
--target-dir
hdfs://cldx-1414-1259:8020/sqoopOozieTest/oozieImport/
=================================================================
>>> Invoking Sqoop command line now >>>
27345 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
27456 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.5-cdh5.2.0
27558 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
27586 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
27941 [main] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
27941 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
28601 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `CONTACTS` AS t LIMIT 1
28666 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `CONTACTS` AS t LIMIT 1
28670 [main] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-mapreduce
42076 [main] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/61d055b020bc7e036b6f8a30ac787f18/CONTACTS.jar
42088 [main] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
42089 [main] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
42089 [main] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
42089 [main] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
42100 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of CONTACTS
42472 [main] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
45670 [main] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat

Approach 3:
-->oozie-sqoop_action.properties:
nameNode=hdfs://cldx-1414-1259:8020
jobTracker=cldx-1414-1259:8032
user.name=oozie
oozie.use.system.libpath=true
#import-dirSqoop=${nameNode}/sqoop/oozieImport/
sqoop_command=import --connect jdbc:mysql://172.25.38.161/test --username root --password root -m 1 --table CONTACTS --target-dir hdfs://cldx-1414-1259:8020/sqoopOozieTest/oozieImport/
oozie.wf.application.path=${nameNode}/user/oozie/oozie-sqoop_action.xml
-->oozie-sqoop_action.xml:
same as above
-->log:
(showing 4096 bytes of the log)
che-size=10000
mapreduce.job.hdfs-servers=${fs.defaultFS}
yarn.application.classpath=$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
dfs.datanode.hdfs-blocks-metadata.enabled=true
mapreduce.tasktracker.dns.nameserver=default
dfs.datanode.readahead.bytes=4193404
mapreduce.job.ubertask.maxreduces=1
dfs.image.compress=false
mapreduce.shuffle.ssl.enabled=false
yarn.log-aggregation-enable=false
mapreduce.tasktracker.report.address=127.0.0.1:0
mapreduce.tasktracker.http.threads=40
dfs.stream-buffer-size=4096
tfile.fs.output.buffer.size=262144
fs.permissions.umask-mode=022
dfs.client.datanode-restart.timeout=30
yarn.resourcemanager.am.max-attempts=2
ha.failover-controller.graceful-fence.connection.retries=1
hadoop.proxyuser.hdfs.groups=*
dfs.datanode.drop.cache.behind.writes=false
hadoop.proxyuser.HTTP.hosts=*
hadoop.common.configuration.version=0.23.0
mapreduce.job.ubertask.enable=false
yarn.app.mapreduce.am.resource.cpu-vcores=1
dfs.namenode.replication.work.multiplier.per.iteration=2
mapreduce.job.acl-modify-job=
io.seqfile.local.dir=${hadoop.tmp.dir}/io/local
fs.s3.sleepTimeSeconds=10
mapreduce.client.output.filter=FAILED
------------------------
Sqoop command arguments :
import
--connect
jdbc:mysql://172.25.38.161/test
--username
root
--password
root
-m
1
--table
CONTACTS
--target-dir
hdfs://cldx-1414-1259:8020/sqoopOozieTest/oozieImport/
=================================================================
>>> Invoking Sqoop command line now >>>
50298 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
50938 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.5-cdh5.2.0
51098 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
51221 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
53116 [main] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
53117 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
55470 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `CONTACTS` AS t LIMIT 1
55579 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `CONTACTS` AS t LIMIT 1
55589 [main] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-mapreduce
Heart beat
77776 [main] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/1c3dc9b68c0dae79669a15ee8b86da92/CONTACTS.jar
77792 [main] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
77793 [main] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
77793 [main] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
77794 [main] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
77815 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of CONTACTS
77960 [main] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
83892 [main] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat

HDFS directories (browsed): The required jars for Oozie and Sqoop are kept in /user/hdfs, and the share lib has been created as well. The directory structure under /user/hdfs and /user/oozie is the same.
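(A quick, hedged way to see both sides of the duplicate-jar warnings in the Approach 1 log — the same jars sitting under /user/hdfs/lib and under the Oozie share lib:

bash-4.1$ hadoop fs -ls /user/hdfs/lib/
bash-4.1$ hadoop fs -ls /user/oozie/share/lib/lib_20141121170802/sqoop/

With oozie.use.system.libpath=true the share lib is already added to the action's classpath, so a second copy of those jars under /user/hdfs/lib is what produces the "cache file ... conflicts with cache file ..." warnings; that is separate from the heart-beat stall that adding a node later resolved.)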