Is it recommended to use the same set of JournalNodes for two HDFS instances (namespace 1 and namespace 2), each with a different shared edits directory, in a production environment?

Namespace1:

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster1</value>
</property>

Namespace2:

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster2</value>
</property>

Please advise.

1 ACCEPTED SOLUTION

It is definitely possible to do that, but I would not recommend it, especially in a production environment. JournalNode (JN) processes are lightweight daemons, so you can run a separate set for each cluster on the same nodes as that cluster's other master services. Sharing one quorum across multiple clusters increases the risk that a problem in one cluster affects the health and stability of all the attached clusters. For example, if Cluster A brings down your JN quorum (for whatever reason), the NameNodes of Cluster B cannot synchronize their state and will eventually shut down because the quorum is not available, with an error like this:

2016-02-16 22:55:55,550 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [XXXXX:8485, XXXXXX:8485, xXXXX:8485], stream=QuorumOutputStream starting at txid 51260))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
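
To illustrate the separation recommended above, here is a minimal sketch of what the two configurations might look like with one dedicated JN quorum per cluster. The hostnames jn-a1 to jn-a3 and jn-b1 to jn-b3 are hypothetical placeholders, not taken from the question; only the property name and the journal IDs (mycluster1, mycluster2) come from the original post.

```xml
<!-- hdfs-site.xml on mycluster1: its own JournalNode quorum (hostnames are hypothetical) -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn-a1.example.com:8485;jn-a2.example.com:8485;jn-a3.example.com:8485/mycluster1</value>
</property>

<!-- hdfs-site.xml on mycluster2: a different set of JournalNodes (hostnames are hypothetical) -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn-b1.example.com:8485;jn-b2.example.com:8485;jn-b3.example.com:8485/mycluster2</value>
</property>
```

With this layout, an outage of the first quorum would leave the NameNodes of mycluster2 unaffected, since they only write to their own JournalNodes.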

2 REPLIES

+1 Another consideration is upgrades. Sharing the same set of JournalNodes across multiple clusters would complicate upgrade plans, because an upgrade of software on those JournalNodes potentially impacts every cluster served by those JournalNodes.