Member since: 02-11-2014
Posts: 22
Kudos Received: 0
Solutions: 0
05-21-2014
01:28 PM
Hello. We are looking to gather metrics from our system, and we want to limit the results (if possible) to a two-week period. According to the documentation (http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Diagnostics-Guide/cm5dg_view_charts.html) there should be an export option, but I only see 'Edit' or 'Edit a Copy'. Is there some way to export metrics? Am I missing something? Cloudera Standard 4.8.1 with Postgres. Thanks
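For anyone hitting the same wall, one route that may work is pulling the data through the Cloudera Manager REST API instead of the chart UI. A minimal sketch, assuming your CM release exposes API v4 or later (which added the /timeseries endpoint); the host, credentials, and tsquery below are placeholders, not values from this thread:

```bash
# Hedged sketch: fetch a two-week window of metric data via the CM REST API.
# CM_HOST, admin:admin, and the tsquery "select dfs_capacity" are assumptions.
CM_HOST="cm.example.com"            # hypothetical CM host
FROM="2014-05-07T00:00:00"          # start of the two-week window
TO="2014-05-21T00:00:00"            # end of the window
curl -s -u admin:admin \
  "http://${CM_HOST}:7180/api/v4/timeseries?query=select%20dfs_capacity&from=${FROM}&to=${TO}"
```

The response is JSON, so it can be post-processed or charted outside of Cloudera Manager.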
05-21-2014
01:22 PM
I finally gave up. Then last week we were asked to add new nodes to the cluster. I got the newest version of the CDH distro, and those machines were added to the cluster properly. I can only assume there is some bug in the other version.
02-20-2014
05:27 AM
These are all desktop machines, and all of them were NTP'd before installation of CDH. I will need to look at the datanode logs again; last time I looked I saw nothing unusual. The datanode was talking to the namenode and writing blocks. I will try to look at this today, but it may not be until tomorrow. Thanks -- Brian

a -> Thu Feb 20 08:22:31 EST 2014
o -> Thu Feb 20 08:22:31 EST 2014
at -> Thu Feb 20 08:22:31 EST 2014
av -> Thu Feb 20 08:22:31 EST 2014
b -> Thu Feb 20 08:22:31 EST 2014
m -> Thu Feb 20 08:22:31 EST 2014
be -> Thu Feb 20 08:22:31 EST 2014
ba -> Thu Feb 20 08:22:31 EST 2014
bl -> Thu Feb 20 08:22:31 EST 2014
st -> Thu Feb 20 08:22:31 EST 2014
bla -> Thu Feb 20 08:22:31 EST 2014
bac -> Thu Feb 20 08:22:31 EST 2014
meg -> Thu Feb 20 08:22:31 EST 2014
blas -> Thu Feb 20 08:22:31 EST 2014
br -> Thu Feb 20 08:22:31 EST 2014
bi -> Thu Feb 20 08:22:31 EST 2014
ro -> Thu Feb 20 08:22:31 EST 2014
big -> Thu Feb 20 08:22:31 EST 2014

* note: all hostnames on the left were shortened.
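A clock check like the one pasted above could be produced with a loop along these lines; a minimal sketch, assuming passwordless SSH to each node and a hypothetical hosts.txt listing the hostnames:

```bash
# Hedged reconstruction of the per-host clock check whose output is shown above.
# Assumptions: key-based ssh to every host, and a hosts.txt file (one host per line).
while read -r h; do
  echo "$h -> $(ssh -o BatchMode=yes "$h" date)"
done < hosts.txt
```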
02-19-2014
09:04 AM
I just added a new node, and it shows as dead in dfsadmin -report but in service in Cloudera Manager.
02-19-2014
06:37 AM
I was mistaken that those were the only two machines with the largest disks; there are others. The dead machines are optimusprime and megatron. Though dfsadmin -report says they have 0 capacity, this shows they do have capacity. It also shows them in service, and while the dfsadmin report says "Last contact: Wed Dec 31 19:00:00 EST 1969", according to the chart they seem to have a valid last-contact time. I am going to add another machine to the cluster and see if that comes in as dead as well.
02-18-2014
01:17 PM
I rebuilt all 19 machines (fresh OS and fresh CDH). I still have two dead nodes, and they are the same ones that were dead before. HA is not on. The only real difference is that these machines have much larger disks than the other 17. Brian
02-17-2014
11:45 AM
Well, I ended up doing some things today to try to solve it. At one point I needed to reboot all the servers for an OS change I made. When I did that, Cloudera Manager would not start. It turns out the /etc/cloudera-scm-server directory was completely missing, so I ended up reinstalling Cloudera Manager and re-adding all the nodes. I still have the problem. Now I have a node that says it is both dead and alive, and a new node that says it is dead. I suspect that since the directory was missing, something hokey must have occurred during the initial installation. So I think I am just going to wipe everything and install it like I have many times in the past. Brian
02-17-2014
09:58 AM
To fix this I just downloaded the files from http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4.7.2/RPMS/x86_64/ and then put them in /var/cache/yum/x86_64/6/cloudera-manager/packages/.
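Scripted, that workaround looks roughly like the following; a minimal sketch that only lists the two RPMs named in the 404 errors below -- the full set of packages the installer wants is an assumption, so list the archive directory to get them all:

```bash
# Hedged sketch of the workaround: fetch the CM 4.7.2 RPMs directly from the
# archive and drop them into yum's package cache so the installer finds them
# locally. Only the two filenames seen in the 404 errors are included here.
ARCHIVE="http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4.7.2/RPMS/x86_64"
CACHE="/var/cache/yum/x86_64/6/cloudera-manager/packages"
mkdir -p "$CACHE"
for rpm in \
    cloudera-manager-daemons-4.7.2-1.cm472.p0.135.el6.x86_64.rpm \
    cloudera-manager-server-4.7.2-1.cm472.p0.135.el6.x86_64.rpm; do
  wget -P "$CACHE" "$ARCHIVE/$rpm"   # download into yum's cache directory
done
```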
02-17-2014
08:16 AM
I just downloaded ./cloudera-manager-installer.bin and it fails on step 3. When I go to these URLs with my browser I get the message: "Oops! We're sorry but we couldn't find the page you're looking for."

Total download size: 137 M
Installed size: 179 M
Downloading Packages:
http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4/RPMS/x86_64/cloudera-manager-daemons-4.7.2-1.cm472.p0.135.el6.x86_64.rpm: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found"
Trying other mirror.
http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4/RPMS/x86_64/cloudera-manager-server-4.7.2-1.cm472.p0.135.el6.x86_64.rpm: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found"
Trying other mirror.
- Tags:
- oops
02-14-2014
10:36 AM
It seems like maybe there is a limit to the number of hosts I am allowed to have active. I ended up deleting both of the dead nodes (the one that was still dead and the one that was rebuilt as alive). Once I did this, the newest dead node that appeared today came back as alive. Is there some setting that puts nodes in standby, or some sort of limit to how many nodes can be in a cluster? Thanks, Brian
02-14-2014
09:14 AM
Well, the parcel distribution went well. All my hosts are on the same versions except for my mixed host, which just happens to be my Cloudera Manager host. No change in my dead hosts. How does one go about fixing a mixed host when it is the Cloudera Manager? Thanks, Brian
02-14-2014
09:04 AM
I finally got my parcels to be recognized, and I am distributing them now. Will let you know if that solves the issue.
02-14-2014
08:49 AM
Well, I have good news and bad news. I deleted one of the dead nodes and made a new machine with a new hostname/IP that has not been used in over a year. After I added the new machine, a different node became dead, but the new node is alive. So I still have two dead machines; it's just that one of them is a node that had been alive for a long time. Crazy? Brian
02-13-2014
03:59 PM
The mixed-CDH host is a single node using RPMs. I have not been able to get parcels to download or be recognized when placing them manually. I have used parcels in the past on a different cluster, but this time I'm not having luck. I have restarted the NN, the cluster, and the datanodes, removed and re-added roles, rebooted, and tried any other combo you can think of. Maybe tomorrow I will build a new system with a new hostname and IP, add it to the cluster, and see if that new node has the problem as well. If so, then maybe there is some file that is messed up, or possibly a duplicate hostname/IP on the network (but I think I ruled that out pretty thoroughly on Tuesday). Brian
02-13-2014
03:46 PM
I checked the hosts page. There are no duplicate hostnames or IP addresses, and no errors that I can see from the Host Inspector. There are about 8 machines that I built at the same time (namenode, 2 dead, 6 alive) that have the exact same CDH versions. There are about 7 other machines that have a slightly older version of CDH, and one node that is mixed.

Validations:
Inspector ran on all 17 hosts.
Individual hosts resolved their own hostnames correctly.
No errors were found while looking for conflicting init scripts.
No errors were found while checking /etc/hosts.
All hosts resolved localhost to 127.0.0.1.
All hosts checked resolved each other's hostnames correctly.
Host clocks are approximately in sync (within ten minutes).
Host time zones are consistent across the cluster.
The group oozie is missing on the following hosts:
The group hue is missing on the following hosts:
The user oozie is missing on the following hosts:
The user hue is missing on the following hosts:
No kernel versions that are known to be bad are running.
No performance concerns with Transparent Huge Pages settings.
1 hosts are reporting with MIXED CDH version
There are mismatched versions across the system. See details below for details on which hosts are running what versions of components.
All managed hosts have consistent versions of Java.
All checked Cloudera Management Daemons versions are consistent with the server.
All checked Cloudera Management Agents versions are consistent with the server.
02-13-2014
02:47 PM
2.0.0-cdh4.5.0 (Cloudera Express). I do not think HA is an option with this version. From the NN UI:

Live Nodes: 16 (Decommissioned: 0)
Dead Nodes: 2 (Decommissioned: 0)
02-13-2014
02:13 PM
The hosts file is fine. This is in the namenode log on the namenode (hadoop-cmf-hdfs1-NAMENODE-xxxx.log.out); the entries below are from when I start the HDFS process on 192.168.129.1.

2014-02-13 17:09:47,638 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* registerDatanode: from DatanodeRegistration(192.168.129.1, storageID=DS-463268316-192.168.129.1-50010-1392235735193, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=cluster5;nsid=2147262545;c=0) storage DS-463268316-192.168.129.1-50010-1392235735193
2014-02-13 17:09:47,639 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default/192.168.129.1:50010
2014-02-13 17:09:47,639 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default/192.168.129.1:50010
2014-02-13 17:09:47,684 INFO BlockStateChange: BLOCK* processReport: from DatanodeRegistration(192.168.129.1, storageID=DS-463268316-192.168.129.1-50010-1392235735193, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=cluster5;nsid=2147262545;c=0), blocks: 2, processing time: 0 msecs
02-13-2014
01:46 PM
iptables and SELinux are both off. Let me look into what you said to check, and if anything becomes apparent I will report back. Thanks
02-13-2014
12:41 PM
Well, some more information. I found that the files in /etc/hadoop/conf on these 2 dead nodes were "stock" files; they had no configuration information for my environment. So I ran "Deploy Client Configuration" and the correct conf files were placed in /etc/hadoop/conf. I then restarted HDFS: no change, they still show dead. I then restarted the cluster: no change, they still show dead in dfsadmin -report. But Cloudera Manager says they are healthy. Brian
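A quick way to confirm that the client config a node is actually using points at the real NameNode; a minimal sketch, and the alternatives name "hadoop-conf" is an assumption about how CDH wired up /etc/hadoop/conf on these hosts, so verify it on your system:

```bash
# Hedged sketch: confirm which NameNode the client config resolves to.
# Stock files would yield the default (file:/// or localhost), which would
# explain dfsadmin -report disagreeing with Cloudera Manager.
hdfs getconf -confKey fs.defaultFS    # should print your real NameNode URI
alternatives --display hadoop-conf    # check where /etc/hadoop/conf actually points
```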
02-13-2014
09:01 AM
Hello. I am using the CDH 4 free version on CentOS 6.4, and I have a problem that I find strange. I have rebuilt the OS and re-installed each of the dead nodes in case something funky happened, but the problem remains. Cloudera Manager shows each of my dead nodes as "Good Health, Started, Good." dfsadmin -report shows that they are "dead." However, when I write files to HDFS, my two dead nodes still get data written to them. Any suggestions? Brian

[root@n2.company.com finalized]# pwd
/dfs/dn/current/BP-1978397931-192.168.129.13-1383064468925/current/finalized
[root@n2.company.com finalized]# du -hs .
3.2G .
[root@n2.company.com finalized]# du -hs .
3.4G .

dfsadmin -report (I left out the good nodes):

Datanodes available: 16 (18 total, 2 dead)

Dead datanodes:

Name: 192.168.129.1:50010 (n1.company.com)
Hostname: 192.168.129.1
Decommission Status : Normal
Configured Capacity: 0 (0 B)
DFS Used: 0 (0 B)
Non DFS Used: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
DFS Remaining%: 0.00%
Last contact: Wed Dec 31 19:00:00 EST 1969

Name: 192.168.129.2:50010 (n2.company.com)
Hostname: 192.168.129.2
Decommission Status : Normal
Configured Capacity: 0 (0 B)
DFS Used: 0 (0 B)
Non DFS Used: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
DFS Remaining%: 0.00%
Last contact: Wed Dec 31 19:00:00 EST 1969
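One detail worth noting in the report above: "Wed Dec 31 19:00:00 EST 1969" is the Unix epoch (time 0) rendered in US Eastern time, i.e. the NameNode answering this report has never recorded a heartbeat from those two datanodes at all. A quick sanity check, assuming GNU date:

```bash
# The epoch (time 0) rendered in US Eastern time matches the
# "Last contact" value shown for the dead nodes above.
TZ=America/New_York date -d @0    # -> Wed Dec 31 19:00:00 EST 1969
```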
- Tags:
- dead node
02-13-2014
08:46 AM
Thanks, Philip. I found this out about 3 hours after I posted this. I had used that same option in the past, but it appears I have a very short-term memory. Thanks again. Brian
02-11-2014
10:38 AM
Hello: We are using the Cloudera free edition of CDH4 on CentOS 6. I created a local repo of CDH and want to use that rather than pulling packages over our internet connection. However, it seems that SCM generates new yum repo files each time it starts up and overwrites my repo files in /etc/yum.repos.d. I see the repo files below in this directory, and they seem to be what overwrites /etc/yum.repos.d:

/tmp/scm_prepare_node.wKXFKq70/repos/rhel6
cloudera-cdh4.repo
cloudera-manager.repo

Is there any way to force SCM to use my repo files? Thanks, bg
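For reference, the local-mirror version of the repo file that SCM keeps clobbering would look something like this; a minimal sketch with a hypothetical internal mirror URL, and note that writing it does not by itself stop scm_prepare_node from rewriting the directory:

```bash
# Hedged sketch: a local-mirror variant of the cloudera-cdh4.repo file that
# SCM overwrites. The baseurl is a hypothetical internal mirror -- substitute
# your own local repo address.
cat > /etc/yum.repos.d/cloudera-cdh4.repo <<'EOF'
[cloudera-cdh4]
name=Cloudera CDH4 (local mirror)
baseurl=http://repo.internal.example.com/cdh/4/
gpgcheck=0
enabled=1
EOF
```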