I have installed HDP-3.0.1 on a new Ubuntu 16.04 AWS EC2 cluster. In Ambari I set up a basic cluster (HDFS, ZooKeeper, YARN, MapReduce, Hive, Oozie, Ambari Metrics, Infra, Spark, Zeppelin, Tez). All services start correctly with no issues.

I then enabled HDFS HA mode (nameservice1) and also set YARN to HA mode. After restarting the cluster all is fine: one NameNode is active, the other is in standby as expected, and both ZKFailoverControllers are running.

In Ambari I then added a custom core-site.xml property, hadoop.security.credential.provider.path, with the value jceks://hdfs@nameservice1/user/common/hdfs.jceks (see screenshot). I have also tried jceks://hdfs/user/common/hdfs.jceks, but this results in the same issue. It makes no difference whether the file exists on HDFS or not.

I restarted HDFS for the property to take effect. The service starts, but usually both NameNodes are stuck in standby and the ZKFailoverControllers eventually stop (see screenshot). Very occasionally HDFS does start correctly, but on restart it usually fails. I have reproduced this issue with HDP-3.0.0 on Ubuntu 16.04 and RHEL 7 too.

If I remove this property and restart HDFS, the service starts correctly. If I enter a dummy property instead, e.g. prop1 = foo, the service also starts correctly.

For background, this property is set on two of our HDP-2.6.5 HA clusters and works correctly there. (It's used for securing credentials: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html)

Why is it not working on HDP-3.0.0/3.0.1? There are no obvious errors in the HDFS logs (attached).
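For reference, this is how the custom property should end up looking in the generated core-site.xml once Ambari applies it (the nameservice and jceks path are the ones from my setup above):

```xml
<!-- Custom core-site.xml property added via Ambari -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs@nameservice1/user/common/hdfs.jceks</value>
</property>
```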