
Spark on YARN with NameNode HA

Contributor

Hi HWX,

I have HDP 2.3.4 with NameNode HA. Spark jobs were submitting properly until one NameNode went down; since then, no Spark jobs start at all.

It looks like the HDFS client is not failing over properly to the second NameNode:

$ hdfs dfs -ls /tmp
... working fine...
$ spark-shell --master yarn-client
...snip...
16/01/12 22:57:53 INFO ui.SparkUI: Started SparkUI at http://10.10.10.3:7884
16/01/12 22:57:53 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/01/12 22:57:54 INFO impl.TimelineClientImpl: Timeline service address: http://daplab-wn-12.fri.lan:8188/ws/v1/timeline/
16/01/12 22:53:16 INFO ipc.Client: Retrying connect to server: daplab-rt-11.fri.lan/10.10.10.111:8020. Already tried 0 time(s); retry policy is RetryPolicy[MultipleLinearRandomRetry[500x2000ms], TryOnceThenFail]
16/01/12 22:53:20 INFO ipc.Client: Retrying connect to server: daplab-rt-11.fri.lan/10.10.10.111:8020. Already tried 1 time(s); retry policy is RetryPolicy[MultipleLinearRandomRetry[500x2000ms], TryOnceThenFail]

...and so on, until I lost my patience.

If I change the entry in /etc/hosts so that the dead NameNode's hostname points to the active NameNode's IP address, then things move forward.
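
Concretely, the change is along these lines (10.10.10.112 is a placeholder here for the active NameNode's address):

# /etc/hosts on the client -- a temporary hack, not a fix:
# map the dead NameNode's hostname to the active NameNode's IP
10.10.10.112   daplab-rt-11.fri.lan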

As I said, it's a fresh HDP 2.3.4 install, without anything fancy.

Thanks

Benoit

1 ACCEPTED SOLUTION

java.net.NoRouteToHostException is considered a failure that can be recovered from in any deployment with floating IP addresses, so the client keeps retrying the same address instead of failing over. This was essentially the sole form of failover in Hadoop pre NN-HA (HADOOP-6667 added the check). I think we ought to revisit that decision.

10 REPLIES

Contributor

How to reproduce: HDP 2.3.4 with NameNode HA; kill the NameNode listed first in the configuration, then try to launch spark-shell in YARN mode.

Master Mentor

I think you have bigger issues than your Spark application not running. You need to fix the HA failover before you can do anything else, as it's the core of your environment. Go to the /var/log/hadoop/hdfs/ directory on the second node, where the NameNode is not coming up, review the logs, and paste them here; we may be able to help you out. @Benoit Perroud

Contributor

Well, Hive and everything else works fine in this degraded HA mode; the only component having problems is Spark.

The active NameNode logs (with the BlockStateChange noise stripped out) are given below. There is no stack trace or anything else validating the hypothesis of a problem in the HA setup.

I reproduced the problem in several environments as follows (a scripted sketch of these steps is given after the list):

  1. Set up a fresh cluster with NameNode HA
  2. spark-shell --master yarn-client works fine
  3. Figure out which NameNode is referred to as nn1 in hdfs-site.xml
  4. Shut down the NameNode referred to as nn1
  5. spark-shell --master yarn-client now fails, waiting indefinitely on nn1:8020
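
A scripted sketch of those steps (the nameservice name "mycluster" and the NameNode IDs nn1/nn2 are placeholders; adjust them to your cluster, and note that the hadoop-daemon.sh location varies between installs):

$ hdfs getconf -confKey dfs.ha.namenodes.mycluster               # e.g. "nn1,nn2"
$ hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn1   # host:port of nn1
$ hdfs haadmin -getServiceState nn1                              # "active" or "standby"
# on the nn1 host, stop the NameNode process:
$ sudo -u hdfs hadoop-daemon.sh stop namenode
# back on the client, this now hangs retrying nn1:8020:
$ spark-shell --master yarn-client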

If you shut down the second NameNode in the dfs.ha.namenodes.<nameservice> list instead, Spark keeps working fine.

This makes me think that the HDFS client used by Spark, as surprising as that may be, is not failing over to the other NameNode.

After turning on more logging, it looks like by default the client tries 500 times before falling back to the other NameNode.

There are some tickets discussing whether dfs.client.retry.policy.enabled should be turned on or off (https://issues.apache.org/jira/browse/AMBARI-11192 and https://issues.apache.org/jira/browse/AMBARI-12138, as well as https://issues.apache.org/jira/browse/HADOOP-12078), but in the case of Spark this does not seem to have any impact.
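
For reference, this is how that property can be checked and overridden on the client (even though, as said, toggling it did not seem to change Spark's behavior here). To see the effective value:

$ hdfs getconf -confKey dfs.client.retry.policy.enabled

and, if it prints "true", the override to test in the client-side hdfs-site.xml would be:

<property>
  <name>dfs.client.retry.policy.enabled</name>
  <value>false</value>
</property>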

My reading of the logs tells me that, for some reason, java.net.NoRouteToHostException is not considered a fatal failure, so the client keeps retrying the same NameNode, while what I expect is for it to try the next one in the list:

16/01/13 08:30:21 DEBUG RetryUtils: RETRY 0) policy=MultipleLinearRandomRetry, exception=java.net.NoRouteToHostException: No route to host
16/01/13 08:30:23 INFO Client: Retrying connect to server: daplab-rt-11.fri.lan/10.10.10.111:8020. Already tried 0 time(s); retry policy is RetryPolicy[MultipleLinearRandomRetry[500x2000ms], TryOnceThenFail]
16/01/13 08:30:24 DEBUG RetryUtils: RETRY 1) policy=MultipleLinearRandomRetry, exception=java.net.NoRouteToHostException: No route to host
16/01/13 08:30:25 INFO Client: Retrying connect to server: daplab-rt-11.fri.lan/10.10.10.111:8020. Already tried 1 time(s); retry policy is RetryPolicy[MultipleLinearRandomRetry[500x2000ms], TryOnceThenFail]

Active NameNode logs:

2016-01-13 07:30:42,090 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888940_148127{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-913cec22-84a9-435b-9941-3839a5870b6f:NORMAL:10.10.10.15:50010|RBW], ReplicaUC[[DISK]DS-c982ff94-8e74-4d03-a74c-4b81b48840b5:NORMAL:10.10.10.25:50010|RBW], ReplicaUC[[DISK]DS-34bc38b0-7eec-4521-b756-bfcaaf528f42:NORMAL:10.10.10.23:50010|RBW]]} for /shared/zefix/company/2014/08/18/company_2014-08-18.avro
2016-01-13 07:30:56,522 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36246 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30797 SyncTimes(ms): 311448 238317
2016-01-13 07:32:54,882 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36253 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30804 SyncTimes(ms): 311526 238382
2016-01-13 07:33:54,901 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36259 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30810 SyncTimes(ms): 311593 238438
2016-01-13 07:34:34,561 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888941_148128{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-0112cfbf-aaca-4294-ae0d-2456935db0b6:NORMAL:10.10.10.26:50010|RBW], ReplicaUC[[DISK]DS-8170fc01-c5dc-4221-9ea2-1f4a3b343b49:NORMAL:10.10.10.32:50010|RBW], ReplicaUC[[DISK]DS-55ab7112-fc8a-428c-a948-cb6ee4efaefe:NORMAL:10.10.10.167:50010|RBW]]} for /user/daplab_prod/.sparkStaging/application_1452430297721_0098/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar
2016-01-13 07:34:35,894 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888942_148129{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-c739e4e5-1545-40b9-a6ef-28989c6cc8eb:NORMAL:10.10.10.32:50010|RBW], ReplicaUC[[DISK]DS-25feb56a-94b7-4520-92dc-c46939414f9a:NORMAL:10.10.10.15:50010|RBW], ReplicaUC[[DISK]DS-31fb3172-c085-43ef-8ccd-f38e72492ad7:NORMAL:10.10.10.23:50010|RBW]]} for /user/daplab_prod/.sparkStaging/application_1452430297721_0098/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar
2016-01-13 07:34:36,910 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /user/daplab_prod/.sparkStaging/application_1452430297721_0098/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar is closed by DFSClient_NONMAPREDUCE_186954515_1
2016-01-13 07:34:40,078 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888943_148130{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-0daa5887-d09e-4e96-888f-757e9c413581:NORMAL:10.10.10.23:50010|RBW], ReplicaUC[[DISK]DS-d3cb3d03-df7d-4211-8b9f-fe95b0e5954f:NORMAL:10.10.10.30:50010|RBW], ReplicaUC[[DISK]DS-09829b26-21f2-4c8b-9936-d69d9a0469a9:NORMAL:10.10.10.32:50010|RBW]]} for /user/daplab_prod/.sparkStaging/application_1452430297721_0098/__spark_conf__8474002130224342241.zip
2016-01-13 07:34:40,118 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /user/daplab_prod/.sparkStaging/application_1452430297721_0098/__spark_conf__8474002130224342241.zip is closed by DFSClient_NONMAPREDUCE_186954515_1
2016-01-13 07:34:54,937 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36286 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30831 SyncTimes(ms): 311815 238646
2016-01-13 07:35:54,963 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36292 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30837 SyncTimes(ms): 311862 238700
2016-01-13 07:36:55,010 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36298 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30843 SyncTimes(ms): 311908 238750
2016-01-13 07:37:22,743 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /shared/zefix/company/2014/08/18/company_2014-08-18.avro is closed by DFSClient_NONMAPREDUCE_-1471796874_1
2016-01-13 07:37:54,939 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888944_148131{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-36be5f99-94df-4539-b8e4-c6fcd07cca53:NORMAL:10.10.10.14:50010|RBW], ReplicaUC[[DISK]DS-49d7ed47-bbaf-477c-956e-54cc5dd5d165:NORMAL:10.10.10.24:50010|RBW], ReplicaUC[[DISK]DS-b0436f14-fe85-492d-999e-3f2fe44dd638:NORMAL:10.10.10.32:50010|RBW]]} for /shared/zefix/company/2014/02/14/company_2014-02-14.avro
2016-01-13 07:37:55,010 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36311 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30854 SyncTimes(ms): 312082 238850
2016-01-13 07:38:04,840 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888945_148132{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-ea1ff4e9-53e5-4e8e-8c02-4a6a5993247a:NORMAL:10.10.10.31:50010|RBW], ReplicaUC[[DISK]DS-7c91c6c5-a0b5-46a6-a963-097150f29878:NORMAL:10.10.10.30:50010|RBW], ReplicaUC[[DISK]DS-d4e2030b-c29d-4e0c-a3b6-3e8b795d3b2e:NORMAL:10.10.10.13:50010|RBW]]} for /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-31.fri.lan_45454.tmp
2016-01-13 07:38:04,882 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-31.fri.lan_45454.tmp is closed by DFSClient_NONMAPREDUCE_-829750558_12726
2016-01-13 07:38:06,441 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888946_148133{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-d4e2030b-c29d-4e0c-a3b6-3e8b795d3b2e:NORMAL:10.10.10.13:50010|RBW], ReplicaUC[[DISK]DS-8ed05f08-aa49-43e8-b1ac-ca60d5cd6196:NORMAL:10.10.10.23:50010|RBW], ReplicaUC[[DISK]DS-dd4bdbf0-dc93-4e42-8009-dc13fe580385:NORMAL:10.10.10.31:50010|RBW]]} for /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-13.fri.lan_45454.tmp
2016-01-13 07:38:06,490 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-13.fri.lan_45454.tmp is closed by DFSClient_NONMAPREDUCE_1188492760_4267
2016-01-13 07:38:06,827 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888947_148134{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-4c7faccb-83ab-4dbd-a763-f35690bd3767:NORMAL:10.10.10.32:50010|RBW], ReplicaUC[[DISK]DS-0b3b0051-e971-44a4-914c-14ef84528e05:NORMAL:10.10.10.30:50010|RBW], ReplicaUC[[DISK]DS-c26f498e-77dc-4d02-8abc-b1ebc9d25946:NORMAL:10.10.10.23:50010|RBW]]} for /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-32.fri.lan_45454.tmp
2016-01-13 07:38:06,865 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /app-logs/daplab_prod/logs/application_1452430297721_0098/daplab-wn-32.fri.lan_45454.tmp is closed by DFSClient_NONMAPREDUCE_-1454565051_8117
2016-01-13 07:38:55,024 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36336 Total time for transactions(ms): 790 Number of transactions batched in Syncs: 568 Number of syncs: 30873 SyncTimes(ms): 312254 239031
2016-01-13 07:39:55,046 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36342 Total time for transactions(ms): 791 Number of transactions batched in Syncs: 568 Number of syncs: 30879 SyncTimes(ms): 312313 239091
2016-01-13 07:40:55,061 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36349 Total time for transactions(ms): 792 Number of transactions batched in Syncs: 568 Number of syncs: 30886 SyncTimes(ms): 312389 239147
2016-01-13 07:41:55,078 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36356 Total time for transactions(ms): 792 Number of transactions batched in Syncs: 568 Number of syncs: 30893 SyncTimes(ms): 312468 239203
2016-01-13 07:42:55,092 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36362 Total time for transactions(ms): 792 Number of transactions batched in Syncs: 568 Number of syncs: 30899 SyncTimes(ms): 312544 239259
2016-01-13 07:43:56,485 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36370 Total time for transactions(ms): 792 Number of transactions batched in Syncs: 568 Number of syncs: 30907 SyncTimes(ms): 312629 239343
2016-01-13 07:44:56,565 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36376 Total time for transactions(ms): 794 Number of transactions batched in Syncs: 568 Number of syncs: 30913 SyncTimes(ms): 312690 239390
2016-01-13 07:45:42,857 INFO  hdfs.StateChange (FSNamesystem.java:completeFile(3503)) - DIR* completeFile: /shared/zefix/company/2014/02/14/company_2014-02-14.avro is closed by DFSClient_NONMAPREDUCE_-1471796874_1
2016-01-13 07:45:56,612 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36384 Total time for transactions(ms): 794 Number of transactions batched in Syncs: 568 Number of syncs: 30921 SyncTimes(ms): 312771 239458
2016-01-13 07:46:05,715 INFO  hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate blk_1073888948_148135{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-58bd0cf4-a027-4e4c-aa86-fbf49d8ea6f6:NORMAL:10.10.10.24:50010|RBW], ReplicaUC[[DISK]DS-f93cd04c-c780-44fe-91b7-9fbb214c9724:NORMAL:10.10.10.32:50010|RBW], ReplicaUC[[DISK]DS-5809f6ca-32bb-44d1-8367-44ce5a9b6fff:NORMAL:10.10.10.33:50010|RBW]]} for /shared/zefix/company/2014/09/04/company_2014-09-04.avro
2016-01-13 07:47:54,928 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36394 Total time for transactions(ms): 794 Number of transactions batched in Syncs: 568 Number of syncs: 30929 SyncTimes(ms): 312853 239532
2016-01-13 07:48:54,949 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36401 Total time for transactions(ms): 794 Number of transactions batched in Syncs: 568 Number of syncs: 30936 SyncTimes(ms): 312950 239591
2016-01-13 07:49:54,979 INFO  namenode.FSEditLog (FSEditLog.java:printStatistics(699)) - Number of transactions: 36407 Total time for transactions(ms): 794 Number of transactions batched in Syncs: 568 Number of syncs: 30942 SyncTimes(ms): 313009 239649

Contributor

@Artem Ervits (I'm new to the community forum; I'm not sure if you get a notification for every answer.)

Contributor

@stevel what is your advice to work around this behavior?

Super Collaborator

What are the contents of your hdfs-site.xml and core-site.xml on the machine running spark-shell?
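
In particular, for client-side failover to work, the HA settings must be present on that machine. A correct client configuration looks roughly like the following (the nameservice name "mycluster" is a placeholder):

<!-- core-site.xml: fs.defaultFS must point at the nameservice, not a single host -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>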

Master Mentor

@bikas this question is closed. Are you having the same problem? If so, please open a new thread.

New Contributor

@Benoit Perroud did you ever find a solution? I am having the same issue.

Thanks, Dirk