Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2244 | 07-09-2019 12:53 AM |
| | 12773 | 06-23-2019 08:37 PM |
| | 9841 | 06-18-2019 11:28 PM |
| | 10828 | 05-23-2019 08:46 PM |
| | 5097 | 05-20-2019 01:14 AM |
05-15-2019 06:52 PM · 1 Kudo
The Disk Balancer sub-system is local to each DataNode and can be triggered on distinct hosts in parallel. The only time you should receive that exception is if the targeted DN's hdfs-site.xml does not carry the property that enables disk balancer, or when the DataNode is mid-shutdown/restart. How have you configured disk balancer for your cluster? Did you follow the configuration approach presented at https://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/? What is your CDH and CM version?
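For reference, a minimal sketch of the hdfs-site.xml entry in question, assuming the stock `dfs.disk.balancer.enabled` property name from upstream Hadoop:

```xml
<!-- hdfs-site.xml on the targeted DataNode; this property defaults to
     false, which produces a "disk balancer is not enabled" exception -->
<property>
  <name>dfs.disk.balancer.enabled</name>
  <value>true</value>
</property>
```

Once the property is in place and the DataNode restarted, `hdfs diskbalancer -plan <datanode-host>` followed by `hdfs diskbalancer -execute <plan-file>` runs the balancing on that host.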
05-09-2019 02:39 AM · 1 Kudo
Spark running on YARN uses the temporary storage presented to it by the NodeManagers where the containers run. These directory path lists are configured via Cloudera Manager -> YARN -> Configuration -> "NodeManager Local Directories" and "NodeManager Log Directories". You can change their values to point to your new, larger volume, and YARN will stop using your root partition. FWIW, the same applies to HDFS if you use it. Also see: https://www.cloudera.com/documentation/enterprise/release-notes/topics/hardware_requirements_guide.html
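For illustration, the yarn-site.xml properties behind those two Cloudera Manager fields (the `/data1` paths are placeholders for your new volume):

```xml
<!-- "NodeManager Local Directories" -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data1/yarn/nm</value>
</property>
<!-- "NodeManager Log Directories" -->
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/data1/yarn/container-logs</value>
</property>
```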
05-09-2019 02:09 AM
Quoting the documentation on using Avro files, at https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_avro_usage.html#topic_26_2:

"Hive (…) To enable Snappy compression on output [avro] files, run the following before writing to the table:
SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;"

Please try this out. You're missing only the second property mentioned here, which appears specific to Avro serialization in Hive. Avro's default compression codec is deflate, which explains the behaviour you observe without it.
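Put together, a sketch of the Hive session (the table names are hypothetical):

```sql
-- Enable compressed output and pick Snappy as the Avro codec,
-- then write to the Avro-backed table
SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;

INSERT OVERWRITE TABLE events_avro
SELECT * FROM events_staging;
```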
05-07-2019 09:58 PM
That depends on what you mean by 'storage locations'. If you mean "can other apps use HDFS?" then the answer is yes, as HDFS is an independent system unrelated to YARN, with its own access and control mechanisms not governed by a YARN scheduler. If you mean "can other apps use the scratch space on NM nodes?" then the answer is no, as only local containers get to use that space. If you're looking to strictly split both storage and compute, as opposed to just some form of compute isolation, then it may be better to divide the cluster entirely.
05-07-2019 05:24 PM
The simplest way is through Cloudera Hue. See http://gethue.com/new-apache-oozie-workflow-coordinator-bundle-editors/ That said, if you've attempted something and have run into issues, please add more details so the community can help you on specific topics.
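For orientation, a minimal workflow.xml of the kind the Hue editor generates for you (the app name, action name, and path are placeholders):

```xml
<workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="prepare-dir"/>
  <action name="prepare-dir">
    <fs>
      <mkdir path="${nameNode}/tmp/example"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```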
05-07-2019 05:21 PM
It would help if you add some description of what you have found or attempted, instead of just a broad question. What load balancer are you choosing to use? We have some sample HAProxy configs at https://www.cloudera.com/documentation/enterprise/latest/topics/impala_proxy.html#tut_proxy for Impala that can be repurposed for other components. Hue also offers its own pre-optimized Load Balancer as roles in Cloudera Manager that you can add and have set up automatically: https://www.cloudera.com/documentation/enterprise/latest/topics/hue_perf_tuning.html
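As a starting point, a minimal haproxy.cfg fragment in the style of the Impala sample linked above (the host names are placeholders; 21000 is the impala-shell port):

```
listen impala-shell
    bind 0.0.0.0:21000
    mode tcp
    balance leastconn
    server impalad1 impalad1.example.com:21000 check
    server impalad2 impalad2.example.com:21000 check
```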
04-13-2019 01:40 AM
Harsh J: Thanks for the help on the previous issue. We finally resolved it; it was due to an undocumented port required for the CDH 6.2 to CDH 6.2 distcp. Now we are migrating the task over to Oozie and having some trouble. Could you elaborate a bit more, or give us some links or pointers? Thanks. We could not find "mapreduce.job.hdfs-servers". Where is that?
04-10-2019 12:31 AM · 1 Kudo
One possibility could be the fetch size (combined with some unexpectedly wide rows). Does lowering the result fetch size help? From http://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#idp774390917888:

--fetch-size: Number of entries to read from database at once.

Also, do you always see it fail with the YARN memory kill (due to pmem exhaustion), or do you also observe an actual java.lang.OutOfMemoryError occasionally? If it is always the former, then another suspect would be some off-heap memory use by the JDBC driver, although I've not come across such a problem.
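As an illustration, assuming a hypothetical MySQL source, the option slots into the import like so:

```shell
# --fetch-size caps how many rows the JDBC driver reads per round trip;
# the connection details and paths below are all placeholders
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table wide_rows \
  --fetch-size 100 \
  --target-dir /user/etl/wide_rows
```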
03-07-2019 08:30 AM
Thanks a ton!!