Member since: 10-06-2021
Posts: 4
Kudos Received: 0
Solutions: 0
02-01-2023 03:48 AM
We are running a Spark Streaming job that reads data from Kafka and writes to an RDBMS. We don't want this job to fail easily because of minor fluctuations in cluster health. Right now the job is configured to retry 5 times before the whole job fails, but all of these retries happen in quick succession, one after another. Is there a way to put a delay/sleep between retry attempts for this job?
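To make the request concrete, here is a minimal sketch of the behaviour being asked for, done at the application level rather than through a Spark setting. It assumes the RDBMS write can be wrapped in a function. The helper name `retry_with_backoff` and its parameters are hypothetical and not part of Spark.

```python
import time


def retry_with_backoff(action, max_attempts=5, base_delay_s=10.0,
                       max_delay_s=300.0, sleep=time.sleep):
    """Call action(); on failure wait, then retry, doubling the wait each time.

    Hypothetical helper for illustration only, not a Spark API.
    """
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the failure reach the job
            sleep(min(delay, max_delay_s))  # pause before the next attempt
            delay *= 2  # exponential backoff, capped by max_delay_s
```

If the job uses Structured Streaming with `foreachBatch`, the write could be wrapped as `retry_with_backoff(lambda: batch_df.write.jdbc(url, table, mode="append", properties=props))`, where `url`, `table`, and `props` are placeholders. A retried append may write rows twice if the failed attempt had already committed part of the batch, so this only makes sense when the write is idempotent or transactional.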
Labels: Apache Spark
10-06-2021 10:00 PM
I have a new Cloudera cluster running the following services: HDFS, Spark, YARN, Hive, Kafka, Sentry, Oozie, ZooKeeper, Streams Replication Manager, and Streams Messaging Manager. I am working on putting processes in place for OS patching of cluster nodes: how to plan for it, when service restarts are required, and the sequence of steps to follow. Is there any documentation from Cloudera on this that I can refer to? Any hints, please.
10-06-2021 09:52 PM
In Cloudera Manager, I always see this alert in the Health Tests section: "NameNode summary: xyz.abc.net (Availability: Unknown, Health: Good), pqr.abc.net (Availability: Unknown, Health: Good). This health test is bad because the Service Monitor did not find an active NameNode." I do not see any service disruption, and both servers are up and running. I am looking for help on what needs to be checked and why this health test alert persists.
Labels: Apache Hadoop, Cloudera Manager, HDFS