Member since 09-17-2015
103 Posts
61 Kudos Received
18 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2364 | 06-15-2017 11:58 AM |
 | 2231 | 06-15-2017 09:18 AM |
 | 2944 | 06-09-2017 10:45 AM |
 | 1456 | 06-07-2017 03:52 PM |
 | 3190 | 01-06-2017 09:41 PM |
06-12-2017
11:37 AM
Hi @srinivas s, if you're using rack awareness, you should be able to remove the 50 datanodes by decommissioning them without losing any blocks; otherwise you probably will lose some. Rebalance time depends on your network and cluster utilization. You can adjust a few parameters to make it faster if necessary, basically
hdfs dfsadmin -setBalancerBandwidth <bandwidth (bytes/s)>
or within your HDFS params (example):
dfs.balance.bandwidthPerSec=100000000
dfs.datanode.max.transfer.threads=16384
dfs.datanode.balance.max.concurrent.moves=500
Please check Accept if you're satisfied with my answer.
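As a rough sketch of the first option: `-setBalancerBandwidth` expects a value in bytes per second, so it can help to derive it from a more readable MB/s figure. The helper names and the 100 MB/s value below are only illustrative, not something from the original answer, and the script prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Convert a bandwidth in MB/s to bytes per second (1 MB = 1024 * 1024 bytes).
# Hypothetical helper name, used only for this illustration.
mb_to_bytes_per_sec() {
  echo $(( $1 * 1024 * 1024 ))
}

# Print the dfsadmin command for a given MB/s value instead of running it,
# so the value can be reviewed before it is applied to the cluster.
balancer_bandwidth_cmd() {
  echo "hdfs dfsadmin -setBalancerBandwidth $(mb_to_bytes_per_sec "$1")"
}

# Example: 100 MB/s per datanode (an assumed figure, tune it to your network).
balancer_bandwidth_cmd 100
```

A value set this way applies to the running datanodes only; the HDFS params listed above are what you would change to make it persistent.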
06-09-2017
10:45 AM
1 Kudo
Hi, you have a "Hortonworks Sandbox Archive" just below "Hortonworks Sandbox in the Cloud"
06-07-2017
03:52 PM
1 Kudo
job.properties only contains properties to be propagated to your workflows, so the short answer is yes, you can have a single job.properties for your 20 workflows. Bundles and coordinators require other parameters (like start/endDate), so it's better to have a specific job.properties for them.
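One possible way to use a single job.properties for several workflows is to override only the application path at submit time with the Oozie CLI `-D` option. This is a sketch under assumptions: the Oozie URL, the HDFS application root, and the workflow names are placeholders, and the script only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical values: adjust the Oozie URL and the HDFS application root.
OOZIE_URL="http://sandbox.hortonworks.com:11000/oozie"
APP_ROOT="hdfs://namenode:8020/user/theuser/apps"

# Print the submit command for one workflow directory under APP_ROOT.
# The shared job.properties carries the common properties; only the
# application path is overridden per workflow with -D.
submit_cmd() {
  echo "oozie job -oozie $OOZIE_URL -config job.properties -D oozie.wf.application.path=$APP_ROOT/$1 -run"
}

# Dry run: print the commands for a few example workflow names.
for wf in wf01 wf02 wf03; do
  submit_cmd "$wf"
done
```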
01-18-2017
02:17 PM
On RHEL/CentOS you might encounter an exception when trying to stop or restart Oozie:
resource_management.core.exceptions.Fail: Execution of 'cd /var/tmp/oozie && /usr/hdp/current/oozie-server/bin/oozie-stop.sh' returned 1. -bash: line 0: cd: /var/tmp/oozie: No such file or directory
This is likely caused by the daily cron script /etc/cron.daily/tmpwatch, which deletes files and directories left untouched for 30+ days:
[root@local ~]# cat /etc/cron.daily/tmpwatch
#! /bin/sh
flags=-umc
/usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.XIM-unix \
-x /tmp/.font-unix -x /tmp/.ICE-unix -x /tmp/.Test-unix \
-X '/tmp/hsperfdata_*' 10d /tmp
/usr/sbin/tmpwatch "$flags" 30d /var/tmp
for d in /var/{cache/man,catman}/{cat?,X11R6/cat?,local/cat?}; do
    if [ -d "$d" ]; then
        /usr/sbin/tmpwatch "$flags" -f 30d "$d"
    fi
done
Just recreate the directory and you're good to go:
[root@local ~]# mkdir /var/tmp/oozie
[root@local ~]# chown oozie:hadoop /var/tmp/oozie
[root@local ~]# chmod 755 /var/tmp/oozie
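Since the directory can be cleaned up again later, the same fix can be wrapped in a small idempotent helper. This is only a sketch: the function name `ensure_dir` is made up for the example, and the ownership step needs root.

```shell
#!/usr/bin/env bash
# Recreate a directory with a given mode; safe to run repeatedly.
# Ownership is only changed when an owner is passed (this needs root).
ensure_dir() {
  local dir="$1" mode="$2" owner="${3:-}"
  mkdir -p "$dir"
  chmod "$mode" "$dir"
  if [ -n "$owner" ]; then
    chown "$owner" "$dir"
  fi
}

# On the Oozie server host, as root, the call would be:
#   ensure_dir /var/tmp/oozie 755 oozie:hadoop
```

Another option might be to add `-x /var/tmp/oozie` to the /var/tmp line of the tmpwatch script, the same way it already excludes paths under /tmp.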
01-06-2017
10:27 PM
You should consider running Hadoop streaming with your Python mapper and reducer. Take a look at https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.2.3_Streaming for an example of such a workflow. First try to run your streaming job directly with something like
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input /user/theuser/input.csv -output /user/theuser/out
Then it'll be easier to schedule it with Oozie; worst case, you can use a shell action with that command. Please accept the answer if I answered your question.
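Before going to the cluster at all, the mapper and reducer can usually be checked locally by emulating the map, sort, and reduce stages with plain pipes. This is a simplified sketch, not how Hadoop actually schedules the job; the helper name `local_streaming` is made up for the example, and it assumes both scripts read stdin and write stdout.

```shell
#!/usr/bin/env bash
# Emulate map -> shuffle/sort -> reduce on a local file with plain pipes.
# mapper and reducer are command strings; input is a local file path.
local_streaming() {
  local mapper="$1" reducer="$2" input="$3"
  sh -c "$mapper" < "$input" | sort | sh -c "$reducer"
}

# With the scripts from the question this would be, for example:
#   local_streaming "python mapper.py" "python reducer.py" input.csv
```

If the local run produces the expected lines, the remaining problems are more likely to be about paths, permissions, or the cluster environment than about the scripts themselves.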
01-06-2017
09:41 PM
@justlearning Oozie can't do MapReduce by itself; it's a Hadoop scheduler that launches workflows composed of jobs, which can be MapReduce jobs. Here you want to run a job defined by workflow.xml with parameters in job.properties, so the syntax is
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config job.properties -run
11-24-2016
05:52 PM
1 Kudo
@rama did you try increasing your mappers' memory? And is your query failing with hive.execution.engine=mr as well?
10-23-2016
10:39 AM
@Pierre Villard I got it working with -D mapred.job.name=mySqoopTest
09-21-2016
03:16 PM
Hi @Roberto Sancho, you can put the configuration of 2 agents into the same config file. Basically you'll have:
agent1.sources = source1
agent1.channels = channel1
agent1.sinks = sink1
agent2.sources = source2
agent2.channels = channel2
agent2.sinks = sink2
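Each agent is then started by name against the same file. A minimal sketch, assuming the shared file lives at the placeholder path below and that you launch the agents by hand with `flume-ng` (the script only prints the commands):

```shell
#!/usr/bin/env bash
# Hypothetical paths: adjust to where your Flume config actually lives.
CONF_DIR="/etc/flume/conf"
CONF_FILE="$CONF_DIR/flume.conf"

# Print the start command for one named agent from the shared config file.
agent_cmd() {
  echo "flume-ng agent --conf $CONF_DIR --conf-file $CONF_FILE --name $1"
}

# One process per agent name defined in the file.
agent_cmd agent1
agent_cmd agent2
```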
09-20-2016
05:00 PM
You can update the mount points by overriding them in a new Ambari Config Group (then restart those nodes, of course, but you'll be notified). Since HDFS will recreate the missing blocks on those 3 disks per node, you shouldn't have any loss.