Member since: 04-03-2019
Posts: 962
Kudos Received: 1743
Solutions: 146
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 11530 | 03-08-2019 06:33 PM
 | 4912 | 02-15-2019 08:47 PM
 | 4171 | 09-26-2018 06:02 PM
 | 10598 | 09-07-2018 10:33 PM
 | 5659 | 04-25-2018 01:55 AM
04-14-2016
07:26 AM
Generally it should be visible on the RM UI once you click on the unhealthy nodes. Or you can go to the unhealthy node (http://<unhealthy-node-manager>:8042/jmx) and check the JMX output.
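For example, you could query the NodeManager's JMX servlet directly and filter for the NodeManager metrics bean (the hostname is a placeholder; the qry parameter is the standard filter supported by Hadoop's JMX servlet):

curl 'http://<unhealthy-node-manager>:8042/jmx?qry=Hadoop:service=NodeManager,name=NodeManagerMetrics'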
04-14-2016
06:48 AM
@Nilesh - yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage is the maximum percentage of disk space utilization allowed, after which a disk is marked as bad. Values can range from 0.0 to 100.0. If the value is greater than or equal to 100, the NodeManager will check for a full disk. This applies to yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.
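As a minimal sketch, this is how the property would be set in yarn-site.xml (the 90.0 threshold is only an example value, not a recommendation from this thread):

<property>
  <!-- example only: mark a local/log dir as bad once its disk is more than 90% full -->
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>90.0</value>
</property>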
04-14-2016
06:28 AM
@Shashank Rai
04-14-2016
06:10 AM
@vpoornalingam
04-14-2016
06:02 AM
10 Kudos
Here is an example of scheduling an Oozie coordinator based on input data events: it starts the Oozie workflow when the input data becomes available. In this example the coordinator starts at 2016-04-10 06:00 GMT and keeps running until 2017-02-26 23:25 GMT (note the start and end times in the XML file):

start="2016-04-10T06:00Z" end="2017-02-26T23:25Z" timezone="GMT"

The frequency is 1 day:

frequency="${coord:days(1)}"

The EL function below resolves to the same value as the coordinator action's nominal time (which, for the first action, is the start time), so the coordinator looks for input data for that date under /user/root/input in YYYYMMDD format:

<instance>${coord:current(0)}</instance>

Below are the working configuration files.

coordinator.xml:

<coordinator-app name="test"
frequency="${coord:days(1)}"
start="2016-04-10T06:00Z" end="2017-02-26T23:25Z" timezone="GMT"
xmlns="uri:oozie:coordinator:0.2">
<datasets>
<dataset name="inputdataset" frequency="${coord:days(1)}"
initial-instance="2016-04-10T06:00Z" timezone="GMT">
<uri-template>${nameNode}/user/root/input/${YEAR}${MONTH}${DAY}</uri-template>
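<!-- empty done-flag: this dataset instance is ready as soon as the directory exists; without a done-flag element, Oozie would wait for a _SUCCESS file -->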
<done-flag></done-flag>
</dataset>
<dataset name="outputdataset" frequency="${coord:days(1)}"
initial-instance="2016-04-10T06:00Z" timezone="GMT">
<uri-template>${nameNode}/user/root/output/${YEAR}${MONTH}${DAY}</uri-template>
<done-flag></done-flag>
</dataset>
</datasets>
<input-events>
<data-in name="inputevent" dataset="inputdataset">
<instance>${coord:current(0)}</instance>
</data-in>
</input-events>
<output-events>
<data-out name="outputevent" dataset="outputdataset">
<instance>${coord:current(0)}</instance>
</data-out>
</output-events>
<action>
<workflow>
<app-path>${workflowAppUri}</app-path>
<configuration>
<property>
<name>inputDir</name>
<value>${coord:dataIn('inputevent')}</value>
</property>
<property>
<name>outputDir</name>
<value>${coord:dataOut('outputevent')}</value>
</property>
</configuration>
</workflow>
</action>
</coordinator-app>

workflow.xml:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
<start to="shell-node"/>
<action name="shell-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${myscript}</exec>
<argument>${inputDir}</argument>
<argument>${outputDir}</argument>
<file>${myscriptPath}</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<kill name="fail-output">
<message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
</kill>
<end name="end"/>
</workflow-app>

job.properties:

nameNode=hdfs://sandbox.hortonworks.com:8020
start=2016-04-12T06:00Z
end=2017-02-26T23:25Z
jobTracker=sandbox.hortonworks.com:8050
queueName=default
examplesRoot=examples
oozie.coord.application.path=${nameNode}/user/root
workflowAppUri=${oozie.coord.application.path}
myscript=myscript.sh
myscriptPath=${oozie.wf.application.path}/myscript.sh

myscript.sh:

#!/bin/bash
echo "I'm receiving input as $1" > /tmp/output
echo "I can store my output at $2" >> /tmp/output How to schedule this? 1. Edit above files as per your environment. 2. Validate your workflow.xml and cordinator.xml files using below command #oozie validate workflow.xml
#oozie validate cordinator.xml 3. Upload your script and these xml files to oozie.coord.application.path and workflowAppUri mentioned in the job.properties 4. Submit coordinator using below command. oozie job -oozie http://<oozie-server>:11000/oozie -config $local/path/job.properties -run
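Once the job is submitted, you can check the coordinator and the state of its actions (the job ID below is a placeholder; use the one printed by the submit command):

oozie job -oozie http://<oozie-server>:11000/oozie -info <coord-job-id>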
Note - You will see that some coordinator actions are in a WAITING state; that's because they are still waiting for input data to become available on HDFS. If you check /var/log/oozie.log and grep for WAITING coordinator actions:

2016-04-14 05:54:05,850 INFO CoordActionInputCheckXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000038-160408193600784-oozie-oozi-C] ACTION[0000038-160408193600784-oozie-oozi-C@3] [0000038-160408193600784-oozie-oozi-C@3]::ActionInputCheck:: In checkListOfPaths: hdfs://sandbox.hortonworks.com:8020/user/root/input/20160412 is Missing.
[..]
2016-04-14 05:54:15,601 INFO CoordActionInputCheckXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000038-160408193600784-oozie-oozi-C] ACTION[0000038-160408193600784-oozie-oozi-C@4] [0000038-160408193600784-oozie-oozi-C@4]::ActionInputCheck:: In checkListOfPaths: hdfs://sandbox.hortonworks.com:8020/user/root/input/20160413 is Missing.

On HDFS:

[root@sandbox coord]# hadoop fs -ls /user/root/input/
Found 3 items
-rw-r--r-- 3 root hdfs 0 2016-04-13 13:16 /user/root/input/20160410
drwxr-xr-x - root hdfs 0 2016-04-13 13:07 /user/root/input/20160411

Output:

[root@sandbox coord]# cat /tmp/output
I'm receiving input as hdfs://sandbox.hortonworks.com:8020/user/root/input/20160411
I can store my output at hdfs://sandbox.hortonworks.com:8020/user/root/output/20160411
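To unblock a WAITING action, create the matching input directory on HDFS (the date here comes from the log excerpt above); since the dataset uses an empty done-flag, the directory's existence is enough, and Oozie's next input check will trigger the workflow:

[root@sandbox coord]# hadoop fs -mkdir -p /user/root/input/20160412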
04-13-2016
07:01 PM
2 Kudos
@Hazarathkumar bobba Please refer to the link below to set up master-slave replication if you are using MySQL. http://www.tecmint.com/how-to-setup-mysql-master-slave-replication-in-rhel-centos-fedora/ You can also write a simple Oozie/cron job that takes a mysqldump and stores it in an archive location; the same script should have logic to purge older dumps.
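As a minimal sketch of such a backup script (the paths, credentials, and 7-day retention below are assumptions for illustration, not part of the original answer):

#!/bin/bash
# Hypothetical nightly mysqldump backup with purge logic; adjust paths and credentials.
BACKUP_DIR=/archive/mysql-backups
mkdir -p "$BACKUP_DIR"
# Take a full dump, timestamped by day.
mysqldump --all-databases -u backup_user -p'secret' > "$BACKUP_DIR/dump-$(date +%Y%m%d).sql"
# Purge dumps older than 7 days (assumed retention period).
find "$BACKUP_DIR" -name 'dump-*.sql' -mtime +7 -delete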
04-13-2016
06:31 PM
3 Kudos
@Amit Tewari
1. Can you please check if the resource manager is listening on 8050? e.g.

[root@sandbox coord]# netstat -tulpn | grep 8050
tcp 0 0 0.0.0.0:8050 0.0.0.0:* LISTEN 8102/jav

2. If yes, can you please check if it is reachable from the hive client? Try a simple ping first, and if ping works, try telnet from the hive client. e.g.

[root@sandbox coord]# telnet sandbox.hortonworks.com 8050
Trying 10.0.2.15...
Connected to sandbox.hortonworks.com.
Escape character is '^]'.

3. If you still get connection refused, try telnet from some other node in the cluster; if that works, then your hive client has connectivity issues with the RM.
4. If it doesn't work from any host, then the RM has some issue; check the RM logs, or check whether a firewall on the RM host is blocking packets to port 8050.
Hope this helps.
04-13-2016
06:23 PM
3 Kudos
@Nilesh There are 2 unhealthy nodes. If you click on "2" under the unhealthy-nodes section, you will see the reason they are unhealthy; it could be a bad disk, etc. Please also check the NodeManager's logs; you will find more information there.
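You can also inspect node health from the command line with the standard YARN CLI (shown here as an illustration):

yarn node -list -all

This prints every node with its state (RUNNING, UNHEALTHY, etc.), and yarn node -status <node-id> includes the node's health report with the reason it was marked unhealthy.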