Member since: 11-20-2014
Posts: 4
Kudos Received: 0
Solutions: 0
02-24-2015 09:19 AM
I meant the distcp command.
02-24-2015 09:18 AM
I found only ways to migrate data between clusters with the dfscopy command.
02-24-2015 09:08 AM
We have a Cloudera 5 installation based on a single node on a single server. Before adding 2 additional nodes to the cluster, we want to increase the size of the partition using a fresh new disk.

We have the following services installed:
- YARN with 1 NodeManager, 1 JobHistory Server and 1 ResourceManager
- HDFS with 1 DataNode, 1 primary NameNode and 1 Secondary NameNode
- HBase with 1 Master and 1 RegionServer
- ZooKeeper with 1 server

All data is currently on a single partition. The amount of data to be collected has increased, so we need another disk on which to store all the information. All the data lives under a partition mounted on the folder /dfs.

The working partition is:

df -h
hadoop-dfs-partition 119G 9.8G 103G 9% /dfs

df -i
hadoop-dfs-partition 7872512 18098 7854414 1% /dfs

The content of this folder is the following:

drwxr-xr-x 11 root root 4096 May 8 2014 dfs
drwx------. 2 root root 16384 May 7 2014 lost+found
drwxr-xr-x 5 root root 4096 May 8 2014 yarn

Under dfs there are these folders:

drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 dn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 nn2
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn1
drwx------ 3 hdfs hadoop 4096 Feb 23 18:14 snn2

Under yarn there are these folders:

drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm1
drwxr-xr-x 9 yarn hadoop 4096 Nov 9 15:46 nm2

How can we achieve this? I found only ways to migrate data between clusters with the distcp command; I didn't find any way to move the raw data. Is stopping all services and shutting down the entire cluster, then performing a cp -Rp /dfs/* /dfs-new/, a viable option? (/dfs-new is the folder where the fresh new ext4 partition of the new disk is mounted.) Is there a better way of doing this? Thank you in advance.
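To make the question concrete, the offline copy I have in mind would look roughly like the following. This is a rough sketch only: the device name /dev/sdb1 is made up, using rsync -a instead of cp -Rp is my own idea, and the CDH services themselves would already be stopped from the Cloudera Manager UI before any of this runs:

# assumption: all CDH services already stopped from the Cloudera Manager UI,
# new ext4 partition (/dev/sdb1, an assumed device name) mounted on /dfs-new
sudo service cloudera-scm-agent stop      # stop the CM agent on this node
sudo service cloudera-scm-server stop     # stop the CM server (same single node here)

sudo rsync -a /dfs/ /dfs-new/             # preserves owners, permissions and timestamps; safe to re-run

sudo umount /dfs-new
sudo umount /dfs
sudo mount /dev/sdb1 /dfs                 # and update /etc/fstab accordingly

sudo service cloudera-scm-server start
sudo service cloudera-scm-agent start     # then start the CDH services from the UI again

Would that preserve everything HDFS, YARN and HBase expect to find under /dfs?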
11-20-2014 01:18 PM
Hi, I'm experiencing problems installing CDH 5.2.0 Express via Cloudera Manager.

In short: the processors on the node from which I started the installation become far too busy right after Step 4 (Cluster Installation) completes.

In detail: 3 VMware nodes (say, namenode, datanode01, datanode02), each of which has:
- 2 processors, 4 GB of RAM, 50 GB of disk reserved,
- a fresh Ubuntu 14.04 LTS Server installation (only OpenSSH as an extra package).

The physical machine is a 4-core box with 16 GB of RAM running Windows 7 Professional. I tend to rule out the number of cores as the problem (4 physical vs. 2 + 2 + 2 virtual plus something for the host), since the processors on the two datanodes are nearly idle.

The network appears to be properly configured: each node reaches the others, and both DNS and reverse DNS queries succeed (the checks I ran are sketched at the end of this description). I didn't disable IPv6, though. Here's an excerpt from my hosts file (there should be no need to populate this file when using DNS, but just in case...):

...
192.168.0.70 namenode.my.domain namenode
192.168.0.71 datanode01.my.domain datanode01
192.168.0.72 datanode02.my.domain datanode02
...

I start the installation on namenode (launching cloudera-manager-installer.bin), then proceed in Cloudera Manager, find the nodes by their DNS names and basically accept the proposed options. At Step 4 (Cluster Installation) the task on namenode completes before the tasks on the datanodes and, as soon as it completes, the processors on that node start to get really busy. They ease off occasionally, but stay essentially busy; even typing a command is a pain (when at all possible). Meanwhile, the tasks for the datanodes complete. Only once did I have the heart to proceed to the next step and, after a few hours, the distribution tasks succeeded only for the datanodes. I think the namenode was too unresponsive to complete the task.
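These are the name-resolution checks I mentioned above. The hostnames come from the hosts excerpt, and the exact commands are simply what I ran by hand (getent resolves through nsswitch, which is roughly what the Java processes see; host queries DNS directly and assumes the bind9-host package is present):

# run on each of the three nodes
for h in namenode datanode01 datanode02; do
  getent hosts ${h}.my.domain      # forward lookup via nsswitch (files + DNS)
  host ${h}.my.domain              # forward lookup via DNS only
done
for ip in 192.168.0.70 192.168.0.71 192.168.0.72; do
  getent hosts ${ip}               # reverse lookup
done
ping -c 1 namenode.my.domain       # basic reachability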
The two processes that tie the processors up are:

/usr/lib/cmf/agent/build/env/bin/python /usr/lib/cmf/agent/src/cmf/agent.py --package_dir /usr/lib/cmf/service --agent_dir /var/run/cloudera-scm-agent --lib_dir /var/lib/cloudera-scm-agent --logfile /var/log/cloudera-scm-agent/cloudera-scm-agent.log

and

/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp .:lib/*:/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar -server -Dlog4j.configuration=file:/etc/cloudera-scm-server/log4j.properties -Dfile.encoding=UTF-8 -Dcmf.root.logger=INFO,LOGFILE -Dcmf.log.dir=/var/log/cloudera-scm-server -Dcmf.log.file=cloudera-scm-server.log -Dcmf.jetty.threshhold=WARN -Dcmf.schema.dir=/usr/share/cmf/schema -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -Dpython.home=/usr/share/cmf/python -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:+UseParNewGC -XX:+HeapDumpOnOutOfMemoryError -Xmx2G -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -XX:OnOutOfMemoryError=kill -9 %p com.cloudera.server.cmf.Main

Outstanding entries in the SCM Agent log:

[20/Nov/2014 21:25:44 +0000] 6020 Monitor-HostMonitor throttling_logger ERROR (298 skipped) Failed to collect NTP metrics
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 39, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 32, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 40, in subprocess_with_timeout
    close_fds=True)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

[20/Nov/2014 21:27:27 +0000] 6020 Monitor-HostMonitor throttling_logger ERROR (145 skipped) Timeout with args ['/usr/lib/jvm/java-7-oracle-cloudera/bin/java', '-classpath', '/usr/share/cmf/lib/agent-5.2.0.jar', 'com.cloudera.cmon.agent.DnsTest']
None

[20/Nov/2014 21:27:28 +0000] 6020 Monitor-HostMonitor throttling_logger ERROR (145 skipped) Failed to collect java-based DNS names
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/dns_names.py", line 64, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._poll_timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/dns_names.py", line 46, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 81, in subprocess_with_timeout
    raise Exception("timeout with args %s" % args)
Exception: timeout with args ['/usr/lib/jvm/java-7-oracle-cloudera/bin/java', '-classpath', '/usr/share/cmf/lib/agent-5.2.0.jar', 'com.cloudera.cmon.agent.DnsTest']

[20/Nov/2014 21:33:34 +0000] 6020 Monitor-HostMonitor throttling_logger ERROR (3 skipped) Kill subprocess exception with args ['/usr/lib/jvm/java-7-oracle-cloudera/bin/java', '-classpath', '/usr/share/cmf/lib/agent-5.2.0.jar', 'com.cloudera.cmon.agent.DnsTest']
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 71, in subprocess_with_timeout
    os.kill(p.pid, signal.SIGTERM)
OSError: [Errno 3] No such process

[20/Nov/2014 21:45:04 +0000] 6020 Monitor-HostMonitor filesystem_map WARNING Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /sys/fs/cgroup,/run,/run/lock,/run/shm,/run/user,/run/cloudera-scm-agent/process
[20/Nov/2014 21:45:04 +0000] 6020 Monitor-HostMonitor filesystem_map WARNING Failed to join worker process collecting filesystem usage. All nodev filesystems will have unknown usage until the worker process is no longer active. Current nodev filesystems: /sys/fs/cgroup,/run,/run/lock,/run/shm,/run/user,/run/cloudera-scm-agent/process

I installed and configured ntpd (I admit I hadn't before), but nothing changed.

Outstanding entries in the SCM Server log:

2014-11-20 22:05:02,769 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1396ms: no GCs detected.

(There are hundreds of these.)

I found a post about a similar issue on an already running system, but it is quite old and discussed a bug that had been recognized and solved (or was soon to be, at the time).

Thank you for your help.

Best regards,
Stefano Altavilla
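P.S. For the NTP errors above, this is what I plan to check next on namenode. I am guessing the agent shells out to one of the standard NTP client utilities and simply could not find it before I installed ntpd; the package and commands below are my own assumptions, not something taken from the Cloudera documentation:

dpkg -l ntp            # confirm the ntp package is installed
which ntpdc ntpq       # the utilities the agent presumably tries to exec
ntpq -p                # at least one peer should be reachable with a small offset
service ntp status     # make sure the daemon is actually running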