Member since: 09-22-2016
Posts: 33
Kudos Received: 3
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4888 | 04-19-2017 12:19 PM
 | 1036 | 02-22-2017 05:37 PM
 | 6185 | 02-21-2017 02:25 PM
02-06-2023 05:52 AM
Hello ozzielabrat, I am also facing the same issue. What is the solution for it? Thanks
02-07-2019 07:59 PM
Hi, we are running a Spark Streaming job on a cluster managed by CM 6. After the Spark Streaming job has run for 4-5 days, the Spark UI for that particular job no longer opens. My nohup driver output file logs lines like this: servlet.ServletHandler: Error for /streaming/ java.lang.OutOfMemoryError: Java heap space. These lines are logged many times in a continuous series, but the job itself keeps running fine. It is just that I am not able to open the UI by clicking the Application Master link when I open the job from the YARN Running Applications UI.
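Not from the original post, but one commonly suggested mitigation for driver-side heap pressure in long-running streaming jobs (an assumption here, not a confirmed fix for this case) is to cap how much UI history the driver retains, since job, stage, and batch records accumulate on the heap over days of uptime. These are standard Spark properties; the values below are illustrative, not recommendations:

```
# spark-defaults.conf (or pass each as --conf to spark-submit)
# Values are illustrative; defaults shown in comments.
spark.ui.retainedJobs                500     # default 1000
spark.ui.retainedStages              500     # default 1000
spark.ui.retainedTasks               10000   # default 100000
spark.streaming.ui.retainedBatches   200     # default 1000
```

Lowering these trades UI history depth for a smaller, more stable driver heap.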
04-19-2017 12:19 PM
You need to run this command as the kafka user.
04-14-2017 10:38 AM
It's true that you can aggregate logs to HDFS while the job is still running; however, the minimum log-upload interval (yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds) you can set is 3600 seconds, which is 1 hour. The design is meant to protect the NameNode from being spammed. You may have to use an external service to do the log aggregation: either write your own or find another tool. Below is the proof from yarn-default.xml in the hadoop-common source code (cdh5-2.6.0_5.7.1):

```xml
<property>
  <description>Defines how often NMs wake up to upload log files.
  The default value is -1. By default, the logs will be uploaded when
  the application is finished. By setting this configure, logs can
  be uploaded periodically when the application is running.
  The minimum rolling-interval-seconds can be set is 3600.
  </description>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>-1</value>
</property>
```
02-22-2017 05:37 PM
You can achieve it by setting an appropriate value for yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds in yarn-site.xml. Then YARN will aggregate the logs for running jobs too. https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml Suri
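As a sketch of what that setting looks like in yarn-site.xml (the 3600-second value is chosen for illustration; it is the minimum rolling interval the property accepts):

```xml
<!-- Upload aggregated logs for running applications every hour.
     3600 seconds is the minimum allowed rolling interval. -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```

NodeManagers must be restarted for the change to take effect.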
10-05-2016 12:49 AM
Cloudera offers Backup and Disaster Recovery (BDR) features as part of its enterprise offering that can do HDFS replication to other clusters, Hive metadata and data replication to other clusters, and also HBase snapshot backups to S3. This is documented in detail at https://www.cloudera.com/documentation/enterprise/latest/topics/cm_bdr_about.html. Outside of this, you can try to use DistCp for HDFS replication, but for Hive replication you will need to manually propagate DDL-associated metadata.
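As a sketch of the DistCp route (the NameNode hostnames and paths below are hypothetical placeholders, not from the post):

```shell
# Copy a warehouse directory from the source cluster to a backup cluster.
# -update skips files whose size/checksum already match on the target;
# -p preserves permissions, ownership, and timestamps.
hadoop distcp -update -p \
  hdfs://source-nn:8020/user/hive/warehouse \
  hdfs://backup-nn:8020/user/hive/warehouse
```

DistCp runs as a MapReduce job, so schedule it (e.g. via cron or Oozie) to approximate the periodic replication that BDR provides out of the box.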