Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1975 | 07-09-2019 12:53 AM |
| | 11893 | 06-23-2019 08:37 PM |
| | 9159 | 06-18-2019 11:28 PM |
| | 10150 | 05-23-2019 08:46 PM |
| | 4587 | 05-20-2019 01:14 AM |
08-26-2017
09:46 PM
1 Kudo
You may only use the -Dname=value form if your main class implements the Tool interface and is invoked via the ToolRunner utility. Check the Tool javadoc example and model your implementation around it: http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/org/apache/hadoop/util/Tool.html
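A minimal sketch of the Tool/ToolRunner pattern the javadoc describes; the class name and the property name below are placeholders, not values from the original question:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyTool extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already folded any -Dname=value generic options
        // into the Configuration available via getConf().
        Configuration conf = getConf();
        System.out.println("my.custom.prop = " + conf.get("my.custom.prop"));
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options (-D, -files, -libjars, ...)
        // and passes only the remaining arguments to run().
        System.exit(ToolRunner.run(new Configuration(), new MyTool(), args));
    }
}
```

Invoked as `hadoop jar mytool.jar MyTool -D my.custom.prop=42`, the property lands in the job Configuration rather than being silently dropped.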
07-30-2017
09:52 PM
Could you share your drive formatting options? The overall inode capacity seems to be very low - did you format with some special options for "fewer, larger files" perhaps?
07-26-2017
10:01 AM
1 Kudo
Cloudera supports Apache Spark, for which an Apache Beam runner exists; I assume this is what you meant to ask about. Apache Beam by itself is not a service that needs installation and management (such as via Cloudera Manager); rather, it is a programming model that supports various execution backends, one of which is Apache Spark. You should be able to follow the tutorials at https://beam.apache.org/get-started/quickstart-java/ and https://beam.apache.org/documentation/runners/spark/ without trouble; just be sure to use the CDH version of Apache Spark when configuring your Java application's pom.xml. Cloudera offers no direct support for the Apache Beam SDKs at present, but I see no reason for it not to work on top of your CDH cluster.
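As an illustration, the dependencies section of such a pom.xml could look like the sketch below; the version properties are placeholders that you would pin to your CDH release and chosen Beam release, not values from the original post:

```xml
<!-- Hedged sketch: version values are placeholders, to be matched
     to your CDH release and the Beam release you target. -->
<dependencies>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-core</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-spark</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <!-- CDH build of Spark, resolved from the Cloudera Maven repository -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>${cdh.spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

Marking the Spark dependency as `provided` keeps the cluster's own CDH Spark jars in charge at runtime instead of bundling a conflicting copy.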
07-22-2017
10:24 AM
HBase is required to perform a log split if a RegionServer (RS) goes down uncleanly. To find out why your RSs went down uncleanly, you'd need to check for FATAL messages in the individual RS logs, as the reason is not in the Master log snippet posted above. The dead server appears to have been hostnamedn02.com. As for why the log splitting fails: since the Master performs a distributed log split, the reason for the failure will also be found in the logs of the live RSs that tried to assist with the split. In the snippet posted above, these hosts were hostnamedn01.com and hostnamedn05.com.
07-22-2017
10:20 AM
This is the same question as http://community.cloudera.com/t5/Storage-Random-Access-HDFS/How-to-connect-to-remote-Hbase-using-JAV..., where a reply is available.
07-22-2017
10:19 AM
This is the same question as http://community.cloudera.com/t5/Storage-Random-Access-HDFS/How-to-connect-to-remote-Hbase-using-JAVA-API/m-p/57731#M3059, where a reply is available.
07-22-2017
10:18 AM
HBase API calls involve connecting, from the host you are executing on, to every HBase service role host on the cluster. This requires working hostname resolution for all RegionServer and Master hostnames. In your case, your client host is able to resolve the passed ZK hostname of "en01com", but it must also be able to resolve every Master/RS host, such as dn03.com. If you do not rely on a DNS backend to do this for you, your /etc/hosts file must carry an entry for every cluster host in the form: IP FQDN OptionalShortName
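For example, a client-side /etc/hosts might look like the sketch below; the IP addresses are illustrative placeholders, and the hostnames are patterned on the ones from the question:

```
# /etc/hosts on the client host: one line per cluster host,
# in the form "IP FQDN OptionalShortName"
10.0.0.11  en01.com  en01
10.0.0.23  dn03.com  dn03
10.0.0.24  dn04.com  dn04
```

Every Master and RegionServer host needs such a line, not just the ZooKeeper quorum hosts passed to the client.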
07-16-2017
11:45 PM
While your idea is correct in trying a different tmp path that allows execution and loading of libraries (your current /tmp may be mounted with 'noexec' applied, see output of 'mount' command), try specifying the alternative tmp path like this:

~> export HBASE_OPTS='-Djava.io.tmpdir=/ngs12/tmp'
~> hbase shell
07-10-2017
11:38 PM
> We have the superuser group defined as 'supergroup' in our configuration. However, this group does not exist in any of the nodes.

This is intentional. The default is set to a name (supergroup) that typically shouldn't exist after install, to protect against unintentional super-users right out of the box. You are free to modify the supergroup name via the HDFS -> Configuration -> "Superuser Group" field.

> If I have to set up this group and start adding a couple of other accounts to have super user access to HDFS, where should this Linux group be created? Should it be created on all nodes in the cluster? Or is it sufficient to create the Linux group on the NameNode hosts only?

When you use no centralized user/group management software (such as an AD via LDAP, etc.), the general and bulletproof approach to adding local Linux groups and usernames is always "all hosts". The reason is that host assignments are not static over the life of the cluster: while doing the group additions on the NameNode(s) alone will work immediately, you will face puzzling authorization issues in the future when a NameNode host needs to be migrated or replaced. Likewise, if security is turned on in the future, it will require local accounts on the worker hosts.
07-09-2017
06:50 AM
1 Kudo
This occurs because the actions inherit the YARN NM configs, which are not pre-configured for MR2. Since MR2 is an app-side concept in YARN rather than an inbuilt/server-side one, your action environment does not find the adequate configs by referencing the NM ones. This was improved via https://issues.apache.org/jira/browse/OOZIE-2343 in CDH 5.5.0+, which ships configs that include MR2 specifics along with the shell scripts. For your older CDH version, however, you can try the below:

Step 1: Ensure all your hosts have a YARN/MR2 Gateway role added, and that client configuration is deployed on all hosts at /etc/hadoop/conf/*.

Step 2: Add the env-var 'HADOOP_CONF_DIR=/etc/hadoop/conf' to all shell actions, either via the shell action's configuration for passing environment variables, or via manual edits at the top of the shell scripts.
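In workflow XML, Step 2 can be expressed with the shell action's env-var element; the sketch below is a hedged example where the action name and script name are placeholders, not from the original post:

```xml
<action name="my-shell-action">
  <shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>myscript.sh</exec>
    <!-- Point the action at the deployed client configs from Step 1 -->
    <env-var>HADOOP_CONF_DIR=/etc/hadoop/conf</env-var>
    <file>myscript.sh</file>
  </shell>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

The env-var route keeps the scripts themselves unchanged, which is preferable when many scripts share the same workflow.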