Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1969 | 07-09-2019 12:53 AM |
| | 11881 | 06-23-2019 08:37 PM |
| | 9146 | 06-18-2019 11:28 PM |
| | 10133 | 05-23-2019 08:46 PM |
| | 4580 | 05-20-2019 01:14 AM |
03-16-2018
10:38 PM
It appears that your HMaster is crashing during startup. Take a look at the HMaster log file under /var/log/hbase/ to find out why. If the configured ZooKeeper quorum is running properly, also check whether the /hbase znode exists on it; a quick sketch follows below.
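For example, something along these lines (the log filename pattern and the ZooKeeper host:port are placeholders; adjust them to your install):

```
# Look for the abort reason in the HMaster log (filename varies by install):
grep -iE 'ERROR|FATAL' /var/log/hbase/*MASTER*.log*

# If ZooKeeper is up, confirm the /hbase znode exists
# (replace zkhost.example.com:2181 with your quorum address):
zookeeper-client -server zkhost.example.com:2181 ls /hbase
```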
03-16-2018
10:00 PM
1 Kudo
The two exams (CCA and CCP) are independent of each other. As for dates and times, you can select a date from the shown range to fit what works for you.
03-15-2018
08:20 AM
A few checks:
- Does the host where you invoke spark-submit carry a valid Spark Gateway role, with deployed configs under /etc/spark/conf/? There is also a classpath file under that location; check whether it includes all the HDFS and YARN jars.
- Do you bundle any HDFS/YARN project jars in your Spark application jar (such as a fat-jar assembly)? If so, verify that their versions match what is on the cluster classpath.
- Are there any global environment variables (run 'env' to check) that end in or carry 'CLASSPATH' in their name? Try unsetting these and retrying; see the sketch after this list.
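A quick sketch of the last two checks (the classpath file name and the app/class names below are illustrative and may differ on your cluster):

```
# List any environment variables that look like classpath overrides:
env | grep -i CLASSPATH

# Inspect the Gateway-deployed classpath file, if present:
cat /etc/spark/conf/classpath.txt

# Unset suspicious variables for this shell, then retry the submit:
unset SPARK_CLASSPATH HADOOP_CLASSPATH
spark-submit --class com.example.YourApp your-app.jar
```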
03-01-2018
01:45 AM
You've mentioned the RAM of the machine your DataNode is assigned to run on, but what is your configured DataNode JVM heap size? You could try raising it by 1 GB from its current value to resolve this. Also, what is the entire Out of Memory message? An "unable to create a new native thread" error implies something entirely different from "Java heap space" (an nproc/thread-limit issue vs. actual heap memory exhaustion).
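A rough way to tell the two apart (commands are illustrative):

```
# "unable to create new native thread" usually points at process/thread limits:
ulimit -u                 # max user processes for the current user
ps -eLf | wc -l           # approximate count of threads on the host

# "Java heap space" points at the configured heap; check the running DataNode:
ps aux | grep -i '[d]atanode' | grep -o -e '-Xmx[^ ]*'
```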
02-12-2018
10:13 PM
What CDH version is this, and could you attach/pastebin the full stack trace that the log produces? I'd also look out for a FATAL message in the logs; a NameNode self-abort should always carry one.
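For example (the log path and filename pattern are typical CM defaults; adjust to your install):

```
# Pull the FATAL message plus some surrounding context from the NameNode log:
grep -B 2 -A 10 'FATAL' /var/log/hadoop-hdfs/*NAMENODE*.log.out
```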
02-02-2018
02:25 AM
1 Kudo
Yes, that is precisely correct: it balances by average utilization percentage per node rather than by average byte count. For example, a 10 TB node holding 5 TB and a 1 TB node holding 0.5 TB are both at 50% utilization and are therefore considered balanced, even though their byte counts differ widely.
01-30-2018
04:19 AM
To change the whole log directory you'll currently need to pass --logdir to the agent as an argument, rather than via a config flag. Edit your agent environment config file at /etc/default/cloudera-scm-agent and ensure that the CMF_AGENT_ARGS env-var inside it carries the following:

--logdir=/your/custom/cloudera-scm/user/writable/directory/

Save, then restart the agent service (see the sketch below).

P.S. Using symlinks will also work.
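A minimal sketch, assuming a placeholder target directory that is writable by the agent user:

```
# In /etc/default/cloudera-scm-agent, set (directory is a placeholder):
#   CMF_AGENT_ARGS="--logdir=/data/cloudera-scm-agent/logs"

# Then restart the agent so the new argument takes effect:
sudo service cloudera-scm-agent restart
```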
01-07-2018
07:09 PM
You will need to specify your custom endpoint URL too, besides the credentials, just as is done on page 12 of the document you've referenced, but with the property 'fs.s3a.endpoint' instead (for s3a). See http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-common/core-default.xml#fs.s3a.endpoint Without a custom endpoint URL, the requests will go to Amazon's S3 servers by default.
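One way to try this out from the command line (the endpoint URL, keys, and bucket name below are placeholders):

```
# Pass the s3a endpoint and credentials as generic -D options:
hadoop fs \
  -D fs.s3a.endpoint=https://s3.my-object-store.example.com \
  -D fs.s3a.access.key=YOUR_ACCESS_KEY \
  -D fs.s3a.secret.key=YOUR_SECRET_KEY \
  -ls s3a://mybucket/
```

Setting the same properties in core-site.xml makes them apply to every client instead of a single command.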
12-11-2017
05:19 PM
1 Kudo
The subdirs carry actual block data; deleting these would be fatal for your actual HDFS data. If you have a space problem, clear out files on HDFS by issuing regular deletes (fs -rm, etc.), not by manipulating the internal storage layout on individual DataNodes. Be sure to also check whether you have stale HDFS snapshots retaining older files. The reason DNs use a subdirectory structure is mostly to avoid hitting the underlying filesystem's (ext4, xfs, etc.) per-directory limits, and to make certain scanning operations (such as block reports) more efficient.
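For example, to reclaim space the supported way (paths are illustrative):

```
# Delete unneeded files through HDFS itself:
hdfs dfs -rm -r -skipTrash /tmp/large-old-dataset

# Check for snapshottable directories and snapshots still retaining data:
hdfs lsSnapshottableDir
hdfs dfs -ls /data/.snapshot
```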
11-20-2017
05:18 PM
1 Kudo
The default behaviour of Hadoop is to run things locally when it finds no YARN cluster configuration. In CM-managed clusters, cluster configuration for client programs is deployed by means of a Gateway role. Your edge host is missing a gateway role and, consequently, the config files required to discover and use the cluster daemons. Do these two steps:
1. Visit the YARN -> Instances page in CM, then click 'Add Role Instances' and, under the Gateway type in the modal dialog, find and add your edge hostname (this edge host should already be running a CM agent for it to show up here).
2. Deploy cluster-wide client configs, following this: https://www.youtube.com/watch?v=4S9H3wftM_0
Retry your commands after this completes. Also verify that your edge host now has a proper /etc/hadoop/conf symlink, with the directory contents carrying info about the cluster; a few sanity checks are sketched below.
P.S. Having HDFS Gateways is insufficient to connect to YARN; you will need a YARN Gateway for that.
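A few sanity checks on the edge host (paths are typical CM defaults; adjust for your install):

```
# The client config directory should resolve via the alternatives system:
ls -l /etc/hadoop/conf

# yarn-site.xml should now point at your ResourceManager:
grep -A 1 'yarn.resourcemanager' /etc/hadoop/conf/yarn-site.xml

# Basic connectivity checks against the cluster:
hadoop fs -ls /        # HDFS
yarn node -list        # YARN (requires the YARN Gateway configs)
```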