Member since: 08-30-2016
Posts: 11
Kudos Received: 3
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 902 | 03-31-2017 07:11 AM |
| | 10622 | 08-30-2016 11:24 PM |
04-03-2018
10:34 AM
Hopefully you have solved it by now. But, for others: the LLAP daemon size (hive.llap.daemon) must be >= llap_heap_size + llap.io.memory.size + llap_headroom_space, which is not the case in the configuration above.
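As a quick sanity check, here is a minimal sketch of that arithmetic (the numbers are illustrative, and the shell variable names mirror the settings above rather than actual property keys):

# Illustrative values in MB; substitute your own settings.
llap_heap_size=4096          # LLAP daemon heap
llap_io_memory_size=2048     # hive.llap.io.memory.size (in-memory cache)
llap_headroom_space=1024     # off-heap headroom

# The daemon size must be at least the sum of the three.
required=$((llap_heap_size + llap_io_memory_size + llap_headroom_space))
echo "LLAP daemon size must be >= ${required} MB"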
04-01-2017
09:34 AM
So, it means that the ZooKeeper ensemble is not up. How many nodes do you have in the ZooKeeper ensemble? Make sure the server.N-to-IP mappings and the myid files match. Paste your "zoo.cfg" here, along with the output of:
netstat -tlpn | grep 2181
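For reference, a typical zoo.cfg for a three-node ensemble looks like this (hostnames and paths are placeholders):

# Illustrative zoo.cfg for a three-node ensemble
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888

Each node's myid file under dataDir must contain just the number matching its server.N entry.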
03-31-2017
07:11 AM
1 Kudo
FIFO was the default scheduler in Hadoop 1, when you deployed the vanilla Apache Hadoop version. It is not used in production, as only one job can run at a time. But one job can have many containers, and each node in the cluster can have a few of them running.

The size of a container has nothing to do with how many jobs are running on a cluster. It is decided by the map/reduce memory setting that requests a container from YARN, and it should be a multiple of the YARN minimum allocation. There are a lot of details here, but keeping it simple: the size of a container is not related to the number of containers. Yes, as a mathematical shortcut to find how many containers can run on a node, we say that the number equals the total memory available to YARN divided by the container memory. The size of a container is decided based on the request, whether it is a map, reduce, or Spark task container, etc.
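A quick worked example of that calculation (numbers are illustrative):

node_mem_for_yarn=16384      # yarn.nodemanager.resource.memory-mb
container_mem=2048           # e.g. mapreduce.map.memory.mb, rounded up to a
                             # multiple of yarn.scheduler.minimum-allocation-mb
echo $((node_mem_for_yarn / container_mem))   # prints 8: containers of this size that fit on the node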
03-31-2017
07:00 AM
Firstly, verify that the ZooKeeper ensemble is up. A ZooKeeper daemon being up and running does not mean there is an "ensemble". Can you connect to ZooKeeper?
zkCli.sh -server localhost:2181 (change to the address where it runs)
[zk: localhost:2181(CONNECTED) 0] ls /
This will list all znodes. Can you see "rmstore" there? You can delete it with:
rmr /rmstore
Then restart ZooKeeper and the ResourceManager.
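For reference, the /rmstore path is controlled by the RM's ZooKeeper state-store setting in yarn-site.xml, which defaults to /rmstore:

<!-- yarn-site.xml: where the ResourceManager stores its state in ZooKeeper -->
<property>
  <name>yarn.resourcemanager.zk-state-store.parent-path</name>
  <value>/rmstore</value>
</property>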
02-01-2017
07:21 AM
In addition to the above, we can have HBase on S3 instead of HDFS, but for that we must use the EMRFS implementation. Keeping it simple, use EMR release 5.2 or greater. But a NameNode is still mandatory.
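For illustration, HBase on S3 is enabled through an EMR configuration like the following when creating the cluster (the bucket name is a placeholder; check the EMR release docs for your version):

[
  {
    "Classification": "hbase",
    "Properties": { "hbase.emr.storageMode": "s3" }
  },
  {
    "Classification": "hbase-site",
    "Properties": { "hbase.rootdir": "s3://my-bucket/hbase" }
  }
]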
10-11-2016
06:52 AM
@Saurabh Yes, the script I gave was with the "hadoop fs -ls" command, because many people do not understand what it does; they will simply copy the script, run it, and then blame it when they lose data. The problem is that most people call themselves Hadoop admins but have never worked as Linux system admins/engineers 🙂
09-22-2016
12:51 PM
@Saurabh the script takes an argument: the number of days 🙂 So, if you want to look for files older than 10 days, run:
./cleanup.sh 10
08-30-2016
11:24 PM
1 Kudo
You can do:
#!/bin/bash
# Remove HDFS directories under /tmp/ that are older than a given number of days.
usage="Usage: dir_diff.sh [days]"

if [ ! "$1" ]; then
    echo "$usage"
    exit 1
fi

now=$(date +%s)

# "hadoop fs -ls -R" lists recursively; lines starting with "d" are directories.
hadoop fs -ls -R /tmp/ | grep "^d" | while read -r f; do
    # Column 6 of the listing is the modification date.
    dir_date=$(echo "$f" | awk '{print $6}')
    # Age of the directory in whole days.
    difference=$(( (now - $(date -d "$dir_date" +%s)) / (24 * 60 * 60) ))
    if [ "$difference" -gt "$1" ]; then
        # Column 8 is the full path; delete it recursively.
        hadoop fs -rm -r "$(echo "$f" | awk '{print $8}')"
    fi
done
Replace /tmp/ with whatever directories or files you need to clean up.
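For example, to remove directories under /tmp/ older than 10 days:

chmod +x dir_diff.sh
./dir_diff.sh 10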