Member since: 09-17-2018
Posts: 8
Kudos Received: 0
Solutions: 0
09-06-2019
07:06 AM
Yes, we had the same dilemma when creating a fall-back queue, as it doesn't respect our model either! We observed the "compress files in HDFS" Oozie job being allocated 1 container with 2 GiB of memory and 1 VCore in YARN. We use a 5 VCore / 10 GiB resource queue, and the largest amount of data we've compressed is 100 GiB. The YARN resource allocation doesn't seem to change based on the amount of data being compressed, so I don't think the YARN queue will be the limiting factor.
As discussed earlier in the thread, the architecture of the "compress files in HDFS" feature doesn't appear to be very scalable:
1. All the data being compressed is first localized (copied) to a YARN NodeManager's local cache (one directory is chosen from yarn.nodemanager.local-dirs). This requires enough local disk space on the partition where that directory resides.
2. The zip shell command is run locally on the same YARN node and uses one CPU core; the default zip compression is quite slow.
3. Enough space is required in local /tmp to hold a copy of the completed zip file before it is copied up to HDFS.
Without any documentation on the "compress files in HDFS" feature, this is just my opinion based on observations in our environment and reverse engineering.
Kind regards,
Julian
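P.S. To make the three steps above concrete, here is a minimal shell sketch of what the job appears to do. The HDFS path, local cache path, and archive name are assumptions for illustration only; I have not confirmed them against the Oozie job itself.

# 1. Localize the HDFS files into the NodeManager's local cache
#    (one of the yarn.nodemanager.local-dirs directories; path assumed)
hdfs dfs -get /user/julian/data/*.log /data/yarn/nm/usercache/julian/appcache/app_0001/
# 2. Compress locally with the stock zip command (single-threaded, hence 1 VCore)
cd /data/yarn/nm/usercache/julian/appcache/app_0001/
zip archive.zip *.log
# 3. Stage the finished archive in local /tmp, then upload it to HDFS
cp archive.zip /tmp/archive.zip
hdfs dfs -put /tmp/archive.zip /user/julian/data/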
08-30-2019
07:42 AM
I've not found a way to specify the resource queue for the compression Oozie job. We created a root.user queue with small maximum resources to catch cases where the queue cannot be specified. Not ideal, but it works around the problem.
To update my original message:
1. The oozie user permission requirement has been fixed in CDH 6.3.0 with OOZIE-3478.
2. The zip shell command was installed on all nodes.
3. I have not found a way to move the local tmp directory used by the Oozie job.
Regards,
Julian
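P.S. For anyone setting up the same catch-all queue, here is a minimal Fair Scheduler sketch of what we mean. The limits shown are illustrative examples, not our production values:

<!-- fair-scheduler.xml: a small root.user queue to catch jobs
     that cannot specify a queue (limits are examples only) -->
<allocations>
  <queue name="user">
    <maxResources>10240 mb,5 vcores</maxResources>
    <maxRunningApps>2</maxRunningApps>
  </queue>
</allocations>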
04-29-2019
02:49 AM
Hue offers the ability to compress files in HDFS as follows:
1. Select one or more HDFS files in the Hue File Browser.
2. On the Actions menu select the Compress option.
Is there any documentation about how to configure the cluster to support this?
So far it appears the following is required:
1. The oozie user must have execute permission on the HDFS directory tree.
2. The 'zip' shell command must be available on all HDFS data nodes?
3. Sufficient space in local (not HDFS) /tmp on all data nodes to hold the resulting compressed file.
Using local /tmp renders this feature unusable for large HDFS files. Can the local temp directory for this be changed?
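For anyone verifying the same requirements, a rough way to spot-check them from the shell (the HDFS path is just an example):

# 1. Check the oozie user can traverse and read the directory tree
sudo -u oozie hdfs dfs -ls /user/julian/data
# 2. Check zip is installed on each data node
which zip || sudo yum install -y zip
# 3. Check free space in local /tmp on each data node
df -h /tmp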
Labels:
- Apache Oozie
- Cloudera Hue
- HDFS
01-17-2019
02:03 AM
Are HBase Quotas supported in CDH 6.x?
The hbase.quota.enabled property defaults to false, but it is not listed under the HBase service configuration in Cloudera Manager 6.
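For context, this is the kind of setup I am asking about. The property belongs in hbase-site.xml (in Cloudera Manager presumably via the hbase-site.xml safety valve, since it is not exposed directly):

<property>
  <name>hbase.quota.enabled</name>
  <value>true</value>
</property>

After restarting HBase, a throttle quota would then be set from the HBase shell; the user name and limit here are just examples:

set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'
list_quotas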
Labels:
- Apache HBase
- Cloudera Manager
01-17-2019
01:05 AM
Thanks for the clarification; the Java requirements page is clear with tested and recommended versions. Would it make sense to include "1.8.0-<update_version>" on the upgrading-the-JDK page? I installed OpenJDK 1.8.0 on RedHat/CentOS 7 as follows:
su -c 'yum install java-1.8.0-openjdk-1.8.0.181 java-1.8.0-openjdk-devel-1.8.0.181 java-1.8.0-openjdk-headless-1.8.0.181'
Then I added exclude=java-1.8.0-openjdk* to /etc/yum.conf.
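P.S. To confirm the pin holds, I verify it like this (package names as installed above):

# Verify the installed OpenJDK packages and the active version
rpm -qa 'java-1.8.0-openjdk*'
java -version
# Confirm yum is set to skip OpenJDK updates
grep '^exclude' /etc/yum.conf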
01-16-2019
05:08 AM
Hello,
The documentation for upgrading/migrating the JDK shows installing OpenJDK 1.8 using the package manager, and therefore from the OS repositories, which for RedHat/CentOS 7 is:
su -c 'yum install java-1.8.0-openjdk-devel'
Does this mean that whichever version is current (1.8.0.191 at the time of writing), and all future versions of OpenJDK 1.8, are supported for Cloudera Manager and CDH 6.x? The above command is taken from the documentation here: https://www.cloudera.com/documentation/enterprise/upgrade/topics/ug_jdk8.html
Thanks, JG
Labels:
- Cloudera Manager