
Ambari HDP 2.6 multinode Installation fails at last step

New Contributor

I'm trying to install HDP 2.6 on 16 nodes (2 master nodes and 14 slaves) using the Ambari wizard. After many challenges the installation completes, but not fully successfully: every node turns orange with "Warnings encountered". All slave nodes show only a single warning, related to NodeManager Start.

I'd highly appreciate any help on fixing this issue.

Here are some outputs and error messages from different nodes:

==== MasterNode1

--- Check HDFS

stdout: /var/lib/ambari-agent/data/output-189.txt (last notice)

2017-12-12 04:10:41,598 - HdfsResource[None] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/2.6.3.0-235/hadoop/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://nnode.cedar.cluster.ada:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/2.6.3.0-235/hadoop/conf', 'immutable_paths': [u'/mr-history/done', u'/app-logs', u'/tmp']} Command completed successfully!

--- Grafana Start

Errors and Output files empty

==== MasterNode2

--- Metrics Collector Start

stderr: /var/lib/ambari-agent/data/errors-174.txt (last notice)

File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 120, in action_create raise Fail("Applying %s failed, parent directory %s doesn't exist" % (self.resource, dirname)) resource_management.core.exceptions.Fail: Applying File['/usr/lib/ams-hbase/bin/hadoop'] failed, parent directory /usr/lib/ams-hbase/bin doesn't exist
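The stack trace says the `File['/usr/lib/ams-hbase/bin/hadoop']` resource fails because its parent directory is missing, not because of permissions. A minimal diagnostic sketch to run on MasterNode2 (the path is taken from the error above; the `check_parent` helper name is mine):

```shell
# Diagnostic sketch: does the parent directory the Ambari File resource
# needs actually exist? (/usr/lib/ams-hbase/bin is from the stack trace.)
check_parent() {
  if [ -d "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}
check_parent /usr/lib/ams-hbase/bin
```

If it prints "missing", reinstalling the Ambari Metrics packages on that host (e.g. `yum reinstall ambari-metrics-collector` on RHEL/CentOS) typically restores the directory and its contents; creating `/usr/lib/ams-hbase/bin` by hand only silences this step and does not supply the scripts the collector needs.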

--- Activity Analyzer Start

Errors and Output files empty

--- Activity Explorer Start

Errors and Output files empty

--- Check MapReduce2

Errors and Output files empty

==== All DataNodes - same warning on all (14) of them

--- NodeManager Start

stderr: /var/lib/ambari-agent/data/errors-181.txt

Command aborted. Reason: 'Server considered task failed and automatically aborted it'

stdout: /var/lib/ambari-agent/data/output-181.txt (last notice)

2017-12-12 04:10:41,326 - Execute['ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/2.6.3.0-235/hadoop/libexec && /usr/hdp/2.6.3.0-235/hadoop-yarn/sbin/yarn-daemon.sh --config /usr/hdp/2.6.3.0-235/hadoop/conf start nodemanager'] {'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid', 'user': 'yarn'}

Command aborted. Reason: 'Server considered task failed and automatically aborted it'
Command failed after 1 tries
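"Server considered task failed and automatically aborted it" is Ambari's timeout message and hides the real cause; the NodeManager's own log on the DataNode usually shows it. A sketch, assuming the default HDP log directory `/var/log/hadoop-yarn/yarn` (the `show_recent_log` helper is mine, added so the directory is a parameter):

```shell
# Print the tail of the newest NodeManager log under the given directory,
# to see why the start actually failed on a DataNode.
show_recent_log() {
  ls -t "$1"/yarn-yarn-nodemanager-*.log 2>/dev/null | head -1 | xargs -r tail -n 50
}
# Default HDP YARN log dir; adjust if yarn.log.dir is customized.
show_recent_log /var/log/hadoop-yarn/yarn
```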


Mentor

@Abzetdin Adamov

Can you manually start the NodeManager?

su -l yarn -c "/usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh start nodemanager"
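If the manual start succeeds, you can confirm the daemon stayed up using the same pid-file check that appears in the `not_if` guard in the stdout above (the `nm_running` wrapper is mine):

```shell
# Succeeds if the pid file exists and the recorded pid is a live process,
# i.e. the same test Ambari's not_if guard runs before starting the daemon.
nm_running() {
  test -f "$1" && pgrep -F "$1" > /dev/null
}
# Pid file path from the Ambari-generated command on the DataNodes.
nm_running /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid \
  && echo "NodeManager up" \
  || echo "NodeManager not running"
```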

Can you also check whether this directory exists?

/usr/lib/ams-hbase/

New Contributor

Thank you for the response.

1) The NodeManager is running.

2) This folder does not exist, neither on the master nodes nor on the slaves.

Do you think it will help if I create it?

Additionally, the Ambari UI shows this alert:

Metrics Collector Process: Connection failed: [Errno 111] Connection refused to....
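[Errno 111] means nothing is accepting TCP connections on the collector's port, which is consistent with the Metrics Collector failing to start because of the missing `/usr/lib/ams-hbase` directory. A quick probe, assuming the default AMS collector port 6188 (the `port_open` helper uses bash's `/dev/tcp` redirection and is my own sketch):

```shell
# Succeeds if something accepts TCP connections on host:port.
# Requires bash (/dev/tcp is a bash feature, not POSIX sh).
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}
# Run on the Metrics Collector host, or replace localhost with that host;
# 6188 is the default AMS collector port.
port_open localhost 6188 \
  && echo "collector reachable" \
  || echo "connection refused - fix the collector start failure first"
```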