ambari cluster not working, error in history server

Rising Star

I'm using the repository at http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list

I selected Spark2 and all its required dependencies.

The following services have an error:

  • History Server - Connection failed: [Errno 111] Connection refused to ambari-agent1
  • Hive Metastore
  • HiveServer2

I receive the following error when manually starting the History Server:

INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,681 Hardware.py:176 - Some mount points were ignored: /dev, /run, /, /dev/shm, /run/lock, /sys/fs/cgroup, /boot, /home, /run/user/108, /run/user/1007, /run/user/1005, /run/user/1010, /run/user/1011, /run/user/1012, /run/user/1001
INFO 2017-10-10 04:57:10,682 Controller.py:320 - Sending Heartbeat (id = 4066)
INFO 2017-10-10 04:57:10,688 Controller.py:333 - Heartbeat response received (id = 4067)
INFO 2017-10-10 04:57:10,688 Controller.py:342 - Heartbeat interval is 1 seconds
INFO 2017-10-10 04:57:10,688 Controller.py:380 - Updating configurations from heartbeat
INFO 2017-10-10 04:57:10,688 Controller.py:389 - Adding cancel/execution commands
INFO 2017-10-10 04:57:10,688 Controller.py:475 - Waiting 0.9 for next heartbeat
INFO 2017-10-10 04:57:11,589 Controller.py:482 - Wait for next heartbeat over
WARNING 2017-10-10 04:57:22,205 base_alert.py:138 - [Alert][namenode_hdfs_capacity_utilization] Unable to execute alert. division by zero
INFO 2017-10-10 04:57:27,060 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-10 04:57:27,071 Controller.py:249 - Adding 1 commands. Heartbeat id = 4085
INFO 2017-10-10 04:57:27,071 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role SPARK2_JOBHISTORYSERVER for service SPARK2 of cluster vqcluster to the queue.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:238 - Executing command with id = 68-0, taskId = 307 for role = SPARK2_JOBHISTORYSERVER of cluster vqcluster.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:279 - Command execution metadata - taskId = 307, retry enabled = False, max retry duration (sec) = 0, log_output = True
WARNING 2017-10-10 04:57:27,083 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-307.txt'
INFO 2017-10-10 04:57:32,563 PythonExecutor.py:130 - Command ['/usr/bin/python',
 u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py',
 u'START',
 '/var/lib/ambari-agent/data/command-307.json',
 u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package',
 '/var/lib/ambari-agent/data/structured-out-307.json',
 'INFO',
 '/var/lib/ambari-agent/tmp',
 'PROTOCOL_TLSv1',
 ''] failed with exitcode=1
INFO 2017-10-10 04:57:32,577 log_process_information.py:40 - Command 'export COLUMNS=9999 ; ps faux' returned 0. USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1 ACCEPTED SOLUTION

Master Mentor

@ilia kheifets

Sorry to hear you are encountering all these problems. Could you tell me the HDP, Ambari, and OS type and version you are trying to install?

I will try to guide you.


32 REPLIES

Rising Star

I am doing a fresh install as the root user.
I'll give an update as soon as I finish.

Master Mentor

@ilia kheifets

It's unfortunate we didn't drill down to resolve the issue, but always remember to also install the ambari-agent on the Ambari server!

Keep me posted so I won't need to install an Ubuntu 14 to resolve your problem.

Rising Star

I just made a fresh install.

Ubuntu 16.04

  4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 

server:

HDP-2.6.2.0
  • apt-get install ambari-server
  • apt-get install ambari-agent
  • ambari-server setup -s

agent (2 agents based on ambari ssh auto install)

  • one agent with only services
  • one agent only with workers

The issue is the same as above:

INFO 2017-10-15 09:50:42,758 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-15 09:50:42,769 Controller.py:249 - Adding 1 commands. Heartbeat id = 1930
INFO 2017-10-15 09:50:42,769 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role HISTORYSERVER for service MAPREDUCE2 of cluster vqcluster to the queue.
INFO 2017-10-15 09:50:42,805 ActionQueue.py:238 - Executing command with id = 23-0, taskId = 106 for role = HISTORYSERVER of cluster vqcluster.
INFO 2017-10-15 09:50:42,806 ActionQueue.py:279 - Command execution metadata - taskId = 106, retry enabled = False, max retry duration (sec) = 0, log_output = True
WARNING 2017-10-15 09:50:42,806 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-106.txt'
INFO 2017-10-15 09:50:43,873 PythonExecutor.py:130 - Command ['/usr/bin/python',
 u'/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py',
 u'START',
 '/var/lib/ambari-agent/data/command-106.json',
 u'/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package',
 '/var/lib/ambari-agent/data/structured-out-106.json',
 'INFO',
 '/var/lib/ambari-agent/tmp',
 'PROTOCOL_TLSv1',
 ''] failed with exitcode=1

The server and client were installed and started as root.

Master Mentor

@ilia kheifets

I don't see your updated thread, though I have just received your update email. So you are still encountering the same problem.

Which document did you follow? If you have one, can you share it?

Did you run the command below on all 3 nodes?

# apt-get install ambari-agent

Can you confirm that this path exists?

/var/lib/ambari-agent/data/output-106.txt

Please revert
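
One way to run that check is sketched below. The data directory and the output-<taskId>.txt name come from the agent log above. The companion errors-<taskId>.txt file is an assumption about what the agent writes next to it, and show_task_files is a hypothetical helper name, not an Ambari command.

```shell
# Sketch: report which per-task files the Ambari agent left behind.
# show_task_files is a hypothetical helper, not part of Ambari.
# $1 = agent data directory, $2 = task id taken from the agent log
show_task_files() {
    data_dir=$1
    task_id=$2
    # errors-<id>.txt is assumed to sit next to output-<id>.txt
    for kind in output errors; do
        file="$data_dir/$kind-$task_id.txt"
        if [ -f "$file" ]; then
            echo "present: $file"
        else
            echo "missing: $file"
        fi
    done
}

# Example, using the task id from the log above:
# show_task_files /var/lib/ambari-agent/data 106
```

If the errors file is present, it is usually the quickest place to look for the reason behind "failed with exitcode=1".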

Rising Star
# apt-get install ambari-agent

I ran it only on the server. I think the user must install the agent on each host and register it manually only when using a non-root install (not the automatic install via SSH).

/var/lib/ambari-agent/data/output-106.txt

The folder exists, but the file does not.

What did you mean by revert?

I tried multiple guides (for default settings all of them are exactly the same), but I just did a simple next->next install with all default settings.

for instance

Master Mentor

@ilia kheifets

"Revert" just means get back to me. I have just downloaded Ubuntu 16.04; let me try it out.

The ambari-agent has to be installed manually not only when using a non-root install, but also when you don't have passwordless SSH configured between the nodes.

# apt-get install ambari-agent

Give me some time

Master Mentor

@ilia kheifets

As promised, I tried to recreate your environment on a 1-node cluster running Ubuntu 16.04 in a VM with 12 GB RAM. Below are the high-level steps I performed.

The cluster installs without any issues.

Generate public and private SSH keys on the Ambari Server host.

root@ubuntu17:~# ssh-keygen 

I accepted the default values. Copy the SSH public key (id_rsa.pub) to the root account on your target hosts if the cluster is multi-node.

.ssh/id_rsa

.ssh/id_rsa.pub

Add the SSH Public Key to the authorized_keys file on your target hosts.

root@ubuntu17:~# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
root@ubuntu17:~# chmod 700 ~/.ssh 
root@ubuntu17:~# chmod 600 ~/.ssh/authorized_keys 
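
The key installation above can be wrapped so that running it twice does not duplicate the key. This is only a sketch: add_authorized_key is a hypothetical helper name, and the default ~/.ssh location is assumed from the steps above.

```shell
# Sketch: append a public key to authorized_keys only if it is not already
# there, then apply the permissions used in the steps above.
# add_authorized_key is a hypothetical helper name.
# $1 = public key file, $2 = target .ssh directory (default: ~/.ssh)
add_authorized_key() {
    pub_key=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file="$ssh_dir/authorized_keys"

    [ -f "$pub_key" ] || { echo "no such key file: $pub_key" >&2; return 1; }
    mkdir -p "$ssh_dir" || return 1
    touch "$auth_file" || return 1

    # -F fixed string, -x whole-line match, -q quiet
    if ! grep -qxF -- "$(cat "$pub_key")" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Example on a target host, after copying id_rsa.pub there:
# add_authorized_key ~/id_rsa.pub
```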

Set NTPD

root@ubuntu17:~# apt-get install ntp 
root@ubuntu17:~# update-rc.d ntp defaults 

Set the host FQDN (in my case, for the Ambari server) in /etc/hosts, using the format IP, FQDN, alias:

 192.168.0.172 ubuntu17.kenya.com ubuntu17 

Edit host name

root@ubuntu17:~# vi /etc/hostname 
ubuntu17.kenya.com
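
To double-check the hosts entry before continuing, something like the following can be used. It is a minimal sketch: hosts_maps_fqdn is a hypothetical helper name, and the IP and FQDN are the example values from this post.

```shell
# Sketch: succeed only if the hosts file has a line "<ip> <fqdn> ...",
# that is, the FQDN is the first name listed after the IP address.
# hosts_maps_fqdn is a hypothetical helper name.
# $1 = hosts file, $2 = IP address, $3 = FQDN
hosts_maps_fqdn() {
    awk -v ip="$2" -v fqdn="$3" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        $1 == ip && $2 == fqdn { found = 1 }   # IP followed directly by the FQDN
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example with the values used above:
# hosts_maps_fqdn /etc/hosts 192.168.0.172 ubuntu17.kenya.com && echo "hosts entry OK"
```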

Firewall settings

root@ubuntu17:~# sudo ufw disable 
The program 'setenforce' was not installed, so I installed it:

root@ubuntu17:~# apt install selinux-utils 

Disable SELinux

root@ubuntu17:~# setenforce 0 

Set UMASK

root@ubuntu17:~# umask 0022 
root@ubuntu17:~# echo umask 0022 >> /etc/profile 

Download Ambari repo files

root@ubuntu17:~# wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list 

Grab the key

root@ubuntu17:~# apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD 

Update

root@ubuntu17:~# apt-get update 
root@ubuntu17:~# apt-cache showpkg ambari-server 
root@ubuntu17:~# apt-cache showpkg ambari-agent 
root@ubuntu17:~# apt-cache showpkg ambari-metrics-assembly 

Install Java (OpenJDK comes with JCE)

root@ubuntu17:~# sudo apt-get install openjdk-8-jdk 

Install the Ambari server (here I used the root user)

root@ubuntu17:~# apt-get install ambari-server 
root@ubuntu17:~# ambari-server setup 

I ran the setup as the root user with the default PostgreSQL database

root@ubuntu17:~# ambari-server start 
Ambari Server 'start' completed successfully. 

Get the FQDN for the ambari server

root@ubuntu17:~# hostname -f 
ubuntu17.kenya.com 

So my Ambari URL will be http://ubuntu17.kenya.com:8080

See the attached screenshot. I got an error with THP, so I had to disable it.

root@ubuntu17:~/.ssh# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
root@ubuntu17:~/.ssh# echo never > /sys/kernel/mm/transparent_hugepage/enabled
root@ubuntu17:~/.ssh# echo never > /sys/kernel/mm/transparent_hugepage/defrag
root@ubuntu17:~/.ssh# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
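
The active THP setting is the value shown in square brackets. A small sketch for reading it in a script follows; thp_mode is a hypothetical helper name, and the sysfs paths are the ones used above.

```shell
# Sketch: print the active transparent hugepage mode, i.e. the value the
# kernel shows in square brackets, e.g. "always madvise [never]" -> never.
# thp_mode is a hypothetical helper name.
# $1 = sysfs file to read
thp_mode() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Example:
# for f in enabled defrag; do
#     echo "$f: $(thp_mode /sys/kernel/mm/transparent_hugepage/$f)"
# done
```

Note that values written with echo to these sysfs files last only until the next reboot, so the setting has to be reapplied at boot if it should stay disabled.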

Set up a MySQL database for Hive, Oozie, etc.

root@ubuntu17:~# sudo apt-get update 
root@ubuntu17:~# sudo apt-get install mysql-server 
root@ubuntu17:~# sudo mysql_secure_installation 
root@ubuntu17:~# apt-get install -y libpostgresql-jdbc-java 
root@ubuntu17:~# apt-get install libmysql-java 

Check that the mysql connector is present at that location

root@ubuntu17:~# ls /usr/share/java/mysql-connector-java.jar

Create the Hive, Oozie, and Ranger databases in MySQL with the correct privileges

created database for hive successfully

created database for oozie successfully

created database for ranger successfully
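
The exact statements used here are not shown in this post. As a rough sketch only, the SQL for one service database could be generated like this. print_db_sql is a hypothetical helper, the user names and the CHANGE_ME password are placeholders, and granting from any host ('%') is an assumption you may want to tighten.

```shell
# Sketch: print SQL that creates a database and a user with full rights on it.
# print_db_sql is a hypothetical helper; names and password are placeholders.
# $1 = database name, $2 = database user, $3 = password
print_db_sql() {
    cat <<EOF
CREATE DATABASE IF NOT EXISTS $1;
CREATE USER '$2'@'%' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'%';
FLUSH PRIVILEGES;
EOF
}

# Example (prompts for the MySQL root password each time):
# for svc in hive oozie ranger; do
#     print_db_sql "$svc" "$svc" "CHANGE_ME" | mysql -u root -p
# done
```

Printing the statements first, rather than piping them straight into mysql, makes it easy to review them before anything is changed.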

Run this command

# ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

Proceeded with the cluster installation

http://ubuntu17.kenya.com:8080

See attached screenshots


capture4.jpg, capture3.jpg, capture1.jpg, capture-thp-5.jpg, capture2.jpg

Master Mentor

Subsequent screenshots:


capture10.jpg, capture7.jpg, capture6.jpg, capture9.jpg, capture8.jpg

Rising Star

I am waiting to see the outcome, because most of the steps I took are exactly the same.

Exceptions:

  • Firewall settings: I did not disable the firewall because none is installed on the VM.
  • Java and SQL: I kept the defaults, so ambari-server setup should use its default Java and its own SQL database.
  • THP: I had the error, but it only affects performance, so I did not fix it for my use case; I wanted a working system first.
  • I used fewer services. I selected only Spark2 and, when pressing Next, enabled all its dependencies and required services.
  • I used 2 nodes: one for all the services and one only for clients/workers.

Rising Star

While one of my posts is being moderated, I'll ask another question:

Why did you set up a custom SQL database for Hive, Oozie, etc.?