Services are not starting after installation of HDP AWS Ubuntu Cluster

Explorer

Background: this is my first time installing Hadoop on a cluster on my own. The cluster is 3 AWS instances running Ubuntu 16 (1 XL node and 2 L nodes), each with 40 GB of storage. I am using HDP 2.6.4, the latest version. Installation succeeded, Ambari started successfully, and I logged into the web admin portal ... I followed all the steps in the documentation.

Problem: most services failed to start ("heartbeat lost"), and there are 39 alerts. I tried starting the services manually via the web portal and restarting the Ambari server; neither helped. Below are some of the alerts:

HDFS NameNode Web UI:

Connection failed to http://ec2-18-217-xxxx.us-east-2.compute.amazonaws.com:50070 (<urlopen error [Errno 111] Connection refused>)

Yarn App Timeline Web UI:

Connection failed to http://ec2-18-218-xxxx.us-east-2.compute.amazonaws.com:8188/ws/v1/timeline (<urlopen error [Errno 111] Connection refused>) CRIT

MapReduce2 History Server Process

Connection failed: [Errno 111] Connection refused to ec2-18-218-xxxx.us-east-2.compute.amazonaws.com:19888

... and many more

Mentor

@Michael O

This link should help you with HDP connectivity on AWS:

https://community.hortonworks.com/answers/105662/view.html

Hope that helps

Explorer

@Geoffrey Shelton Okot

Thanks, taking a look.

Explorer

@Geoffrey Shelton Okot Reviewed the link. Thank you for taking a look at this.

The best answer in that thread only links to general AWS documentation on Elastic IPs and VPC DNS. I found neither a cause nor a solution there for the issue I described here.

I assigned Elastic IPs to my cluster before installing Ambari. The installation completed with no issues. Services are not starting because of failed connections, some of which are detailed in my original post. The same Elastic IPs have been assigned to the cluster since its launch.

Is there a way to troubleshoot these connections? Why did the setup not configure these connections automatically?

Are there steps to verify that everything on the cluster is properly configured for the above connections to work?
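
One way to start narrowing this down is to test whether anything is listening on the ports named in the alerts (50070, 8188, 19888), both from the node itself and from another node. Below is a minimal sketch using only the Python standard library. The function names `port_is_open` and `check_endpoints` are made up for illustration and are not part of Ambari or HDP.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(host, port): port_is_open(host, port, timeout)
            for host, port in endpoints}
```

For example, calling `check_endpoints([("localhost", 50070)])` on the NameNode host and then repeating it from another node with that host's FQDN shows whether the refusal is local or only remote. If the port is refused even from the node itself, the service is probably not running at all, which would point at the service logs rather than at AWS networking.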

Mentor

@Michael O

What are the contents of your /etc/hosts?

Did you configure passwordless SSH between your Ambari server and all the hosts in the cluster?

Explorer

@Geoffrey Shelton Okot thanks for responding and giving the pointers.

I did configure and test the passwordless connect between the hosts in the cluster. I can ssh to any host, from any host on the cluster without entering the password.

I did follow the procedure from the installation guide for setting up /etc/hosts, and I will post the contents of /etc/hosts later today.

Mentor

@Michael O

Just to rule out a few more possibilities: did you install the ambari-agents using Ambari or manually? If the latter, can you check that ambari-agent.ini has the correct entry for the Ambari server (FQDN)?
In your /etc/hosts, are you using the private or the public IPs? Can the IPs/hostnames be resolved by DNS?
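
Both questions can be checked with a few lines of Python. This is a rough sketch using only the standard library. The helper names `describe_address` and `resolve` are hypothetical, and the addresses in the usage note are illustrative, not taken from the cluster.

```python
import ipaddress
import socket


def describe_address(ip_text):
    """Classify an IP address string as 'loopback', 'private', or 'public'."""
    ip = ipaddress.ip_address(ip_text)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"


def resolve(hostname):
    """Return the sorted addresses the local resolver gives for hostname.

    An empty list means the name could not be resolved on this machine.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

Here `describe_address("172.31.0.10")` returns "private", while an address such as "18.217.0.1" is classified as "public". Running `resolve()` on each node's FQDN from every host in the cluster shows whether all hosts agree on the name-to-address mapping.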

Explorer

@Geoffrey Shelton Okot

I installed everything (the agents too, I assume) using Ambari. To my knowledge, I have not done any manual setup. Below are the contents of /etc/hosts and /etc/ambari-agent/conf/ambari-agent.ini.

Contents of /etc/hosts (I've masked the last two numbers here with x for security):

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

18.217.xxx.xx ec2-18-217-xxx-xx.us-east-2.compute.amazonaws.com
18.218.xxx.xx ec2-18-218-xxx-xx.us-east-2.compute.amazonaws.com
52.15.xxx.xx ec2-52-15-xxx-xx.us-east-2.compute.amazonaws.com

========================================

Contents of /etc/ambari-agent/conf/ambari-agent.ini. The hostname below is set to the AWS private address of the instance that the first Elastic IP in /etc/hosts above (18.217.xxx.xx) points to. I have not set it anywhere myself, so the Ambari setup process must have determined it automatically. What do you recommend? Should I manually change it to the Elastic IP from /etc/hosts on each node? Are there any other files to check?

[server]
hostname=ip-172-31-10-118.us-east-2.compute.internal
url_port=8440
secured_url_port=8441
connect_retry_delay=10
max_reconnect_retry_delay=30

[agent]
logdir=/var/log/ambari-agent
piddir=/var/run/ambari-agent
prefix=/var/lib/ambari-agent/data
;loglevel=(DEBUG/INFO)
loglevel=INFO
data_cleanup_interval=86400
data_cleanup_max_age=2592000
data_cleanup_max_size_MB =100
ping_port=8670
cache_dir=/var/lib/ambari-agent/cache
tolerate_download_failures=true
run_as_user=root
parallel_execution=0
alert_grace_period=5
status_command_timeout=5
alert_kinit_timeout=14400000
system_resource_overrides=/etc/resource_overrides
; memory_threshold_soft_mb=400
; memory_threshold_hard_mb=1000
; ignore_mount_points=/mnt/custom1,/mnt/custom2

[security]
keysdir=/var/lib/ambari-agent/keys
server_crt=ca.crt
passphrase_env_var_name=AMBARI_PASSPHRASE
ssl_verify_cert=0
credential_lib_dir=/var/lib/ambari-agent/cred/lib
credential_conf_dir=/var/lib/ambari-agent/cred/conf
credential_shell_cmd=org.apache.hadoop.security.alias.CredentialShell

[network]
; this option apply only for Agent communication
use_system_proxy_settings=true

[services]
pidLookupPath=/var/run/

[heartbeat]
state_interval_seconds=60
dirs=/etc/hadoop,/etc/hadoop/conf,/etc/hbase,/etc/hcatalog,/etc/hive,/etc/oozie,
  /etc/sqoop,
  /var/run/hadoop,/var/run/zookeeper,/var/run/hbase,/var/run/templeton,/var/run/oozie,
  /var/log/hadoop,/var/log/zookeeper,/var/log/hbase,/var/run/templeton,/var/log/hive
; 0 - unlimited
log_lines_count=300
idle_interval_min=1
idle_interval_max=10

[logging]
syslog_enabled=0
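
To compare what each agent is actually pointed at, the [server] section of this file can be read programmatically on every node. The sketch below uses Python's standard `configparser`. The function name `server_endpoint` and the `SAMPLE` text are illustrative only, and `allow_no_value=True` is an assumption made so that unindented continuation lines, such as the `dirs` entries, do not abort parsing.

```python
import configparser


def server_endpoint(ini_text):
    """Return (hostname, url_port, secured_url_port) from the [server] section."""
    # allow_no_value=True tolerates lines that carry no "key=value" pair.
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(ini_text)
    server = parser["server"]
    return (
        server["hostname"],
        server.getint("url_port"),
        server.getint("secured_url_port"),
    )


# Sample text copied from the [server] section shown above.
SAMPLE = """\
[server]
hostname=ip-172-31-10-118.us-east-2.compute.internal
url_port=8440
secured_url_port=8441
"""
```

On a real node the text would come from reading /etc/ambari-agent/conf/ambari-agent.ini. Every agent should report the same server hostname, and that name should resolve from the agent's host with both ports reachable.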

Explorer

I changed ambari-agent.ini on all the nodes, replacing the private address of the master node with the Elastic IP of the master node, then stopped and started the Ambari server. This did not help; the same issue continues.

Any other checks/trouble-shooting options?