Member since: 09-23-2013
Posts: 238
Kudos Received: 72
Solutions: 28

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1806 | 08-13-2019 09:10 AM |
| | 3171 | 07-22-2015 08:51 AM |
| | 7087 | 06-29-2015 06:43 AM |
| | 4936 | 06-02-2015 05:17 PM |
| | 20732 | 05-23-2015 04:48 AM |
10-28-2013
05:57 PM
#3 - You were presenting a config based on using the SNN instead of NN HA with ZK and JNs, hence the comment. The SNN is not present in an NN HA config; the NameNodes become NN (active) and NN (standby). Secondary NN is the older integration pattern, and it's a horrible name given what it actually provided in the way of protecting the cluster from failure (which was nothing). It was just a place to offload checkpoint work to.

#2 - You would need to evaluate the workload on the NN/TT nodes to decide whether you could get by with that.

#1 - Yes, the 3rd node in the list would have the missing JN.

Understand there are implementations with Hive + HBase out there; you just want to avoid co-locating the NN with HBase services. Todd
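For anyone wiring this up by hand, here is a minimal sketch of the hdfs-site.xml pieces involved in NN HA with a JournalNode quorum. The nameservice ID "mycluster" and all hostnames below are hypothetical placeholders, not values from this thread:

<!-- illustrative NN HA sketch; IDs and hostnames are made up -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode01.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode02.example.com:8020</value>
</property>
<!-- the 3-node JournalNode quorum discussed above -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>

If you deploy through Cloudera Manager, its HA workflow manages the equivalent settings for you; the sketch is only to show where the JN quorum appears in the config.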
10-27-2013
03:58 PM
1 Kudo
Is this for proof of concept/discovery? This would be a tightly packed cluster you are proposing for most production environments, IMHO. Our account team provides Solutions Engineering / Solutions Architecture guidance for things like this as you begin to scope out revenue-generating activity with a cluster and want enterprise support. Review our blog's discussion of hardware sizing here: http://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/

As a general rule of thumb, try to minimize the services co-located with the NameNodes (NN). JNs and ZK nodes are OK to co-locate with them; we recommend the JNs be co-located with the NNs plus one additional node. You have an over-loaded mix of services here in the combination of HBase and MapReduce services on the nodes (there will be battles for memory if you are under-sized). If this is a dev or non-prod footprint, you can get by with what I'm proposing below. HBase can take a lot of memory, so you want to monitor it; MR jobs are variable based on the complexity and size of what you are doing.

Secondary NameNode (SNN) as an architecture is less safe than using NameNode High Availability (NN HA). SNN does not do what you think it does; read the Hadoop operations guide book (3rd edition) to get a better sense of this. Once you enable NN HA you end up deploying 3 ZooKeeper instances and journal nodes, so based on what you are presenting, you are saddling up for a future outage / loss of data if this ends up in prod this way, unless you are really, really careful (and even then you could get hit).

This footprint's viability really depends on your workload, so you might end up relocating things once you start observing activity. What I'm proposing below is, at best, a playpen configuration so you can kick the tires and check stuff out.

You are using 3 separate DB implementations with Impala (fast SQL performance), Hive (slower SQL but more standard SQL support), and HBase (column-based DB). Does your design really require all 3? Research them a little bit more, and add them if it makes sense after the initial deployment. Hue is usually in the mix to facilitate end-user web-based access too. Realize you can move services after deployment; note, though, that decommissioning a DataNode takes a while because its data blocks must be replicated off to the rest of the nodes (a sketch of that procedure follows this post). Read up on Hive and HiveServer2 (Hive2); HiveServer2 is for more secure implementations (plus other stuff).

NN (active): Journal Node, ZooKeeper, JobTracker
NN (standby): Journal Node, ZK, JobTracker (if using JT HA?)
HBase Master: ZK, all Hive + Metastore, Sqoop, Impala StateStore daemon
DataNode01: TaskTracker, Impala daemon, HBase RS
DataNode02: TaskTracker, Impala daemon, HBase RS
DataNode03: TaskTracker, Impala daemon, HBase RS
DataNode04: TaskTracker, Impala daemon, HBase RS
Admin01 (VM): Cloudera Manager, Cloudera Mgmt Services (monitors), DB for monitoring. Note that using a VM for this will make it a "heavy" VM with regard to network I/O and disk I/O as cluster activity scales.
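On the decommissioning point, a rough sketch of the usual exclude-file procedure from this era (the exclude-file path is an assumption for illustration; use whatever your dfs.hosts.exclude property points at):

# Add the host being retired to the NN's exclude file
# (path is a hypothetical example)
echo "datanode04.example.com" >> /etc/hadoop/conf/dfs.hosts.exclude

# Tell the NameNode to re-read its include/exclude lists
sudo -u hdfs hadoop dfsadmin -refreshNodes

# Watch the node move through "Decommission in progress" until it
# reads "Decommissioned" -- replicating blocks off the node takes a while
sudo -u hdfs hadoop dfsadmin -report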
10-27-2013
08:20 AM
Yes, there are dependencies. Add your second DVD to the repo, as there are older releases of the cyrus-sasl libs, pulled in by the 1.6.0_31 JDK, that are on DVD2 (a sketch of wiring the DVD up as a local repo is below). However, not having RHEL repo and patch access is not recommended for production environments. Todd
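For illustration, one way to expose the second DVD as a local yum repo (the mount point and repo ID here are made-up examples, and this assumes the DVD carries its own repodata, as RHEL install media normally does):

# Mount the second install DVD (mount point is an example)
mkdir -p /media/rhel-dvd2
mount /dev/cdrom /media/rhel-dvd2

# Point yum at it with a file:// repo definition
cat > /etc/yum.repos.d/rhel-dvd2.repo <<'EOF'
[rhel-dvd2]
name=RHEL DVD2 (local)
baseurl=file:///media/rhel-dvd2
enabled=1
gpgcheck=0
EOF

yum clean all && yum repolist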
10-16-2013
01:08 PM
1 Kudo
Per the e-mail thread discussion, the 1.7 JDK was manually added before the cluster upgrade and was present within the custom local yum repo at the site. CM/CDH will pick the later JDK present to use. Per Phillp L: Yes, we ask yum to install the JDK, and if yum decides that it's appropriate to install a newer version of the JDK than is already installed, that is what it will do. If you want to keep a package on a particular version, even if a newer version is available in a repo, there is a way to make yum do that (see the sketch below): http://blog.serverbuddies.com/how-can-i-get-yum-to-keep-package-at-a-certain-version/
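One common way to do that is the yum versionlock plugin (the "jdk*" glob below is an example; match it to the JDK package actually installed on your hosts):

# Install the versionlock plugin
yum install yum-plugin-versionlock

# Pin the currently installed JDK at its present version
yum versionlock 'jdk*'

# Show what is locked
yum versionlock list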
10-04-2013
10:27 AM
1 Kudo
Nothing is IPv6-only at this point, so you don't have to worry. It's going to take a while before we (the internets) can do IPv6-only for things.
10-04-2013
09:37 AM
From what it looks like, you have 32-bit Ubuntu installed; we only support 64-bit installations at this point (a quick way to check is shown below). http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/cmig_cm_requirements.html
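A couple of stock commands to confirm the installed architecture (nothing Cloudera-specific here):

# x86_64 means 64-bit; i386/i686 means 32-bit
uname -m

# prints 32 or 64
getconf LONG_BIT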
10-04-2013
09:36 AM
(text from the attached file "vmstat-test.txt" - from the mail thread)

[root@cehd3 ~]# for i in {1..12}; do date ; echo "sudo -u hdfs hadoop fs -mkdir /foo$i";sudo -u hdfs hadoop fs -mkdir /foo$i; done
Fri Oct 4 09:57:24 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo1
Fri Oct 4 09:57:26 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo2
Fri Oct 4 09:57:27 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo3
Fri Oct 4 09:57:29 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo4
Fri Oct 4 09:57:31 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo5
Fri Oct 4 09:57:33 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo6
Fri Oct 4 09:57:35 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo7
Fri Oct 4 09:57:36 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo8
Fri Oct 4 09:57:38 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo9
Fri Oct 4 09:57:40 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo10
Fri Oct 4 09:57:42 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo11
Fri Oct 4 09:57:44 MDT 2013
sudo -u hdfs hadoop fs -mkdir /foo12

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ ---timestamp---
 r  b   swpd     free   buff  cache   si   so    bi    bo    in   cs us sy id wa st
 0  0      0 10875508 261608 560260    0    0     0     5    17    9  0  0 99  0  0  2013-10-04 09:57:22 MDT
 1  0      0 10857096 261608 560260    0    0     0    42   299  519  8  1 91  0  0  2013-10-04 09:57:24 MDT
 2  0      0 10842448 261608 560292    0    0     0     0  1150  845 94  5  1  0  0  2013-10-04 09:57:26 MDT
 1  0      0 10835436 261608 560292    0    0     0     0  1051  838 96  4  0  0  0  2013-10-04 09:57:28 MDT
 3  0      0 10826628 261608 560292    0    0     0    44  1105  874 92  7  1  0  0  2013-10-04 09:57:30 MDT
 3  0      0 10820764 261608 560296    0    0     0     0  1058  858 96  4  0  0  0  2013-10-04 09:57:32 MDT
 2  0      0 10815676 261608 560332    0    0     0     2  1118  899 94  6  1  0  0  2013-10-04 09:57:34 MDT
 3  0      0 10794528 261608 560300    0    0     0    34  1118  838 95  5  0  0  0  2013-10-04 09:57:36 MDT
 1  0      0 10777340 261608 560300    0    0     0    22  1086  823 95  4  1  0  0  2013-10-04 09:57:38 MDT
 4  0      0 10864748 261608 560300    0    0     0    20  1123  964 93  7  1  0  0  2013-10-04 09:57:40 MDT
 1  0      0 10849984 261608 560300    0    0     0    28  1027  791 95  5  0  0  0  2013-10-04 09:57:42 MDT
 3  0      0 10829008 261608 560300    0    0     0     0  1037  946 93  6  1  0  0  2013-10-04 09:57:44 MDT
10-04-2013
09:35 AM
2 Kudos
In your hosts file, do not comment out the loopback interface (127.0.0.1); just let that keep its normal values. You can allow the IPv6 value to be set as well; it is not necessary to comment either of those out. From your command line in the shell, do a "getent hosts cdh1.jsnewland.com" and "getent hosts 192.168.125.135" to verify name resolution is doing what you want. If it comes back with unexpected values, verify in your VMs that /etc/nsswitch.conf is set for "hosts: files dns" in that order, rather than "hosts: dns files".

What is the host OS you are using for the VM? If you had an 8GB system, you would be much better off running a single 3 to 4 GB VM. You need to realize the parent OS (especially if a GUI desktop is in use) is going to need memory, including the overhead to run the actual VM servers and instances. At this scale of physical system (6GB RAM), attempting to emulate a cluster of 3 x 2GB nodes is going to get in the way of your attempt to use Hadoop. Take a look at our example VM that is available for download; it's set up to run in a laptop/desktop configuration and uses 4GB as its base memory configuration.

For the vmstat information you provided, here is the breakdown of what it is telling you. The attached file vmstat-test.txt is a test of making a path 12 times on a VM with 12GB RAM and 6GB swap configured, on a physical host with 128GB RAM. Note the differences from your output. In the explanation of the vmstat column titles below, my "<---" tags indicate what you should focus on when evaluating vmstat output. Compare your vmstat to the test I did in the attached file.

You are heavily swapping. It's not a question of being out of swap (it would crash at that point); it's the volume of paging activity back and forth that is literally choking the VM. Below is your vmstat re-pasted:

# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd  free  buff cache   si   so   bi   bo  in  cs us sy id wa st
 2 21 913364 52192  640 20384  212  411  673  441 266 521  8  3 67 22  0
 1 23 912360 60864  496 19840 1398  572 1912  576 356 578  8  3  0 90  0
 0 17 911292 57136  504 19892 2200  304 2256  304 340 621  4  3  0 93  0
 1 17 909904 50180  536 22032 1530   18 2600   18 376 630  7  5  0 88  0
 1 15 908268 49372  536 23460 1906   22 2614   22 341 643  3  3  0 95  0
 2 19 906812 49084  544 25304 1838    0 3014    0 328 778  3  5  0 92  0
 0 15 906036 49152  532 26032 1582  220 2500  220 297 540  3  5  0 92  0
 3 16 908180 62844  536 23092 1460 2220 2286 2220 477 591 18  8  0 74  0
 2 12 906608 58860  536 25644 1830    0 3120    0 440 603 10 11  0 79  0
 3 21 904808 53412  536 26244 2370    0 2668    0 578 767  5  9  0 86  0

Now, to understand how to read the vmstat output:

Procs
  r: The number of processes waiting for run time.
  b: The number of processes in uninterruptible sleep. <<---

Memory
  swpd: the amount of virtual memory used. <<----
  free: the amount of idle memory.
  buff: the amount of memory used as buffers.
  cache: the amount of memory used as cache.
  inact: the amount of inactive memory. (-a option)
  active: the amount of active memory. (-a option)

Swap
  si: Amount of memory swapped in from disk (/s). <<<----
  so: Amount of memory swapped to disk (/s). <<<----

IO
  bi: Blocks received from a block device (blocks/s).
  bo: Blocks sent to a block device (blocks/s).

System
  in: The number of interrupts per second, including the clock. <----
  cs: The number of context switches per second. <----

CPU (these are percentages of total CPU time)
  us: Time spent running non-kernel code. (user time, including nice time)
  sy: Time spent running kernel code. (system time)
  id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle. <<<----
  st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
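As a quick way to keep an eye on the swap pressure described above, a few stock Linux checks (nothing here is CDH-specific):

# Overall memory/swap picture, in MB
free -m

# Ten samples, two seconds apart; watch the si/so columns
vmstat 2 10

# Which processes are holding the most resident memory
ps aux --sort=-rss | head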
10-04-2013
09:34 AM
1 Kudo
(pasted from mail thread discussion) Thanks for your reply. When the high disk I/O fault occurs, there is still some memory remaining on my VM. The following is the information the "vmstat 2" command returns when the fault occurs:

# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd  free  buff cache   si   so   bi   bo  in  cs us sy id wa st
 2 21 913364 52192  640 20384  212  411  673  441 266 521  8  3 67 22  0
 1 23 912360 60864  496 19840 1398  572 1912  576 356 578  8  3  0 90  0
 0 17 911292 57136  504 19892 2200  304 2256  304 340 621  4  3  0 93  0
 1 17 909904 50180  536 22032 1530   18 2600   18 376 630  7  5  0 88  0
 1 15 908268 49372  536 23460 1906   22 2614   22 341 643  3  3  0 95  0
 2 19 906812 49084  544 25304 1838    0 3014    0 328 778  3  5  0 92  0
 0 15 906036 49152  532 26032 1582  220 2500  220 297 540  3  5  0 92  0
 3 16 908180 62844  536 23092 1460 2220 2286 2220 477 591 18  8  0 74  0
 2 12 906608 58860  536 25644 1830    0 3120    0 440 603 10 11  0 79  0
 3 21 904808 53412  536 26244 2370    0 2668    0 578 767  5  9  0 86  0

The physical machine has 6GB memory. My CDH cluster has three hosts; they are all running on my physical machine. Cloudera Manager is installed on cdh1.

$ hostname
cdh1

$ ifconfig -a
eth0      Link encap:Ethernet  inet addr:192.168.125.135  Bcast:192.168.125.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fec3:dda3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:97 errors:0 dropped:0 overruns:0 frame:0
          TX packets:144 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:67253 (65.6 KiB)  TX bytes:13328 (13.0 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:15359 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15359 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:17372886 (16.5 MiB)  TX bytes:17372886 (16.5 MiB)

$ cat /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.125.135 cdh1.jsnewland.com cdh1
192.168.125.136 cdh2.jsnewland.com cdh2
192.168.125.137 cdh3.jsnewland.com cdh3
10-04-2013
09:33 AM
(pasted from mail thread discussion) With 2GB it is going to be tough to prevent the VM from swapping back and forth between disk and RAM. How much physical RAM is available on the machine you are running the VM on? We run with 4GB in the demo VM (it might be worth downloading and using that to check things out). Also, what do the following commands show in your VM?

# hostname
# ifconfig -a
# cat /etc/hosts