I want to implement fault tolerance at task level via checkpointing and via replication in Hadoop?

I want to implement two techniques in Hadoop.

Fault tolerance of tasks via checkpointing

Fault tolerance of tasks via replication.

::::Checkpointing::::

With checkpointing, if a task fails, it should restart from its last checkpoint.

::::Replication::::

With replication, a failed task should restart on another node where a replica is stored.

@tspann

Re: I want to implement fault tolerance at task level via checkpointing and via replication in Hadoop?

@Hassan Asghar

Most of it depends on what you are trying to do exactly.

There is checkpointing available in Spark:

https://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing
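
To make the "restart from the last checkpoint" idea concrete, here is a minimal sketch in plain Python. It is not Spark's or Hadoop's API: the function `run_with_checkpoint`, the JSON checkpoint file, and the `fail_at` parameter (which only simulates a crash) are all hypothetical names made up for illustration. It assumes the task processes records in order and that its state fits in a small file.

```python
import json
import os


def run_with_checkpoint(records, checkpoint_path, process, fail_at=None):
    """Process records in order, saving progress after each record.

    Hypothetical helper for illustration; not part of Hadoop or Spark.
    """
    # Resume from the saved position if a checkpoint file already exists.
    state = {"next_index": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    for i in range(state["next_index"], len(records)):
        # fail_at only simulates a task crash for the demo.
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated task failure at record {i}")
        state["total"] += process(records[i])
        state["next_index"] = i + 1
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["total"]
```

Calling it once with `fail_at=3` and then again without `fail_at` on the same checkpoint path should pick up at record 3 instead of reprocessing records 0 to 2. Checkpointing after every record is the simplest choice here; a real job would likely checkpoint less often to reduce I/O.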

HDFS itself is replicated and fault tolerant.
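
Because each block has replicas on several nodes, a failed task can in principle be rerun wherever another copy of its input lives. The sketch below shows that idea in plain Python under stated assumptions: `run_on_replicas`, the `block_locations` mapping, and the `failed_nodes` set (which only simulates unreachable nodes) are hypothetical and not part of any Hadoop API.

```python
def run_on_replicas(task, block_id, block_locations, failed_nodes=()):
    """Try the task on each node that holds a replica of the block, in order.

    Hypothetical helper for illustration; not part of Hadoop.
    Returns (node, result) from the first node where the task succeeds.
    """
    errors = []
    for node in block_locations[block_id]:
        try:
            # failed_nodes only simulates node failures for the demo.
            if node in failed_nodes:
                raise ConnectionError(f"{node} is unreachable")
            return node, task(block_id, node)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next replica.
            errors.append(str(exc))
    raise RuntimeError(f"all replicas failed for {block_id}: {errors}")
```

With a block stored on three nodes and the first one marked as failed, the task runs on the second node. Only when every replica holder fails does the call give up, which is the same trade-off replication makes in general: more copies cost storage but leave more places to retry.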

Fault tolerance is also available at the component level, but the implementation varies by component. For example:

These are some examples. You can search how fault tolerance/replication is implemented in other components.
