Created 10-10-2017 09:49 AM
I'm using http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list
I selected Spark2 and all its required dependencies.
The following services have an error:
I receive the following error when manually starting the History Server:
INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,681 Hardware.py:176 - Some mount points were ignored: /dev, /run, /, /dev/shm, /run/lock, /sys/fs/cgroup, /boot, /home, /run/user/108, /run/user/1007, /run/user/1005, /run/user/1010, /run/user/1011, /run/user/1012, /run/user/1001
INFO 2017-10-10 04:57:10,682 Controller.py:320 - Sending Heartbeat (id = 4066)
INFO 2017-10-10 04:57:10,688 Controller.py:333 - Heartbeat response received (id = 4067)
INFO 2017-10-10 04:57:10,688 Controller.py:342 - Heartbeat interval is 1 seconds
INFO 2017-10-10 04:57:10,688 Controller.py:380 - Updating configurations from heartbeat
INFO 2017-10-10 04:57:10,688 Controller.py:389 - Adding cancel/execution commands
INFO 2017-10-10 04:57:10,688 Controller.py:475 - Waiting 0.9 for next heartbeat
INFO 2017-10-10 04:57:11,589 Controller.py:482 - Wait for next heartbeat over
WARNING 2017-10-10 04:57:22,205 base_alert.py:138 - [Alert][namenode_hdfs_capacity_utilization] Unable to execute alert. division by zero
INFO 2017-10-10 04:57:27,060 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-10 04:57:27,071 Controller.py:249 - Adding 1 commands. Heartbeat id = 4085
INFO 2017-10-10 04:57:27,071 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role SPARK2_JOBHISTORYSERVER for service SPARK2 of cluster vqcluster to the queue.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:238 - Executing command with id = 68-0, taskId = 307 for role = SPARK2_JOBHISTORYSERVER of cluster vqcluster.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:279 - Command execution metadata - taskId = 307, retry enabled = False, max retry duration (sec) = 0, log_output = True
WARNING 2017-10-10 04:57:27,083 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-307.txt'
INFO 2017-10-10 04:57:32,563 PythonExecutor.py:130 - Command ['/usr/bin/python', u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py', u'START', '/var/lib/ambari-agent/data/command-307.json', u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package', '/var/lib/ambari-agent/data/structured-out-307.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', ''] failed with exitcode=1
INFO 2017-10-10 04:57:32,577 log_process_information.py:40 - Command 'export COLUMNS=9999 ; ps faux' returned 0.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
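To pull the failing task out of a long ambari-agent log like the one above, a small filter can help. This is only a minimal sketch, not an Ambari tool: failed_task_ids is a hypothetical helper name, the log path is whatever you pass as the first argument, and it assumes that failure lines contain "failed with exitcode" together with a command-<taskId>.json path, as they do in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari): print the task IDs of the
# commands that an ambari-agent log reports as "failed with exitcode".
# $1 = path to the agent log file, supplied by the caller.
failed_task_ids() {
    # Keep only failure lines, then extract the number from the
    # command-<taskId>.json argument that appears on each of them.
    grep 'failed with exitcode' "$1" \
        | sed -n 's/.*command-\([0-9][0-9]*\)\.json.*/\1/p'
}
```

The task ID it prints matches the numbered files the log refers to under /var/lib/ambari-agent/data, which is the next place to look for the actual error text.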
Created 10-15-2017 07:03 AM
Sorry to hear you are encountering all these problems. Could you tell me the
HDP, Ambari, and OS type and version you are trying to install?
I will try to guide you.
Created 10-15-2017 10:07 AM
I am doing a fresh install as the root user.
I'll give an update as soon as I finish.
Created 10-15-2017 10:35 AM
It's unfortunate we didn't drill down to resolve the issue, but always remember to also install the ambari-agent on the Ambari server!
Keep me posted, so that I don't have to install an Ubuntu 14 to resolve your problem.
Created 10-15-2017 01:55 PM
I just made a fresh install.
ubuntu 16.04
4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
server:
HDP-2.6.2.0
agent (2 agents, installed via Ambari's automatic SSH install)
The issue is the same as above:
INFO 2017-10-15 09:50:42,758 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-15 09:50:42,769 Controller.py:249 - Adding 1 commands. Heartbeat id = 1930
INFO 2017-10-15 09:50:42,769 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role HISTORYSERVER for service MAPREDUCE2 of cluster vqcluster to the queue.
INFO 2017-10-15 09:50:42,805 ActionQueue.py:238 - Executing command with id = 23-0, taskId = 106 for role = HISTORYSERVER of cluster vqcluster.
INFO 2017-10-15 09:50:42,806 ActionQueue.py:279 - Command execution metadata - taskId = 106, retry enabled = False, max retry duration (sec) = 0, log_output = True
WARNING 2017-10-15 09:50:42,806 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-106.txt'
INFO 2017-10-15 09:50:43,873 PythonExecutor.py:130 - Command ['/usr/bin/python', u'/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py', u'START', '/var/lib/ambari-agent/data/command-106.json', u'/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package', '/var/lib/ambari-agent/data/structured-out-106.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', ''] failed with exitcode=1
The server and client were installed and started as root.
Created 10-15-2017 02:14 PM
I don't see your updated thread, though I have just received your update email. So you are still encountering the same problem.
What document did you use? If you have one, can you share it?
Did you run the command below on all 3 nodes?
# apt-get install ambari-agent
Can you confirm that this directory exists?
/var/lib/ambari-agent/data/output-106.txt
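Since the path above is named like a file but sits inside a data directory, it helps to check both levels. The following is a minimal sketch; path_kind is a hypothetical helper name, not an Ambari command, and it only reports what kind of object a path is.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is a directory, a regular
# file, or missing, so the parent folder and the file can be told apart.
path_kind() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ]; then
        echo "file"
    else
        echo "missing"
    fi
}
```

Running path_kind on /var/lib/ambari-agent/data and then on /var/lib/ambari-agent/data/output-106.txt shows whether only the folder exists or the file as well.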
Please revert
Created 10-15-2017 02:25 PM
# apt-get install ambari-agent
Only on the server. I think the user must install it on each agent and register it only when using a non-root install (not the automatic install via SSH).
/var/lib/ambari-agent/data/output-106.txt
The folder exists, but the file does not.
What did you mean by "revert"?
I tried multiple guides (for default settings all of them are exactly the same), but I just did a simple next -> next install with all default settings.
for instance
Created 10-15-2017 02:57 PM
"Revert" just means get back to me. I have just downloaded Ubuntu 16.04; let me try it out.
The ambari-agent has to be installed manually not only when using a non-root install, but also if you don't have passwordless SSH configured between the nodes.
# apt-get install ambari-agent
Give me some time
Created 10-15-2017 09:03 PM
As promised, I tried to recreate your environment on a 1-node cluster running Ubuntu 16.04 in a VM with 12 GB RAM. Below are the high-level steps I performed.
The cluster is installing without any issues.
Generate public and private SSH keys on the Ambari Server host.
root@ubuntu17:~# ssh-keygen
I accepted the default values. Copy the SSH public key (id_rsa.pub) to the root account on your target hosts if the cluster is multi-node.
.ssh/id_rsa
.ssh/id_rsa.pub
Add the SSH Public Key to the authorized_keys file on your target hosts.
root@ubuntu17:~# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
root@ubuntu17:~# chmod 700 ~/.ssh
root@ubuntu17:~# chmod 600 ~/.ssh/authorized_keys
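If the key step above is repeated on several hosts, appending with >> can leave duplicate entries in authorized_keys. The following is a minimal sketch of an idempotent variant; authorize_key is a hypothetical helper name, and both the key file and the .ssh directory are passed in by the caller rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to authorized_keys only if
# that exact line is not already there, then tighten the permissions.
# $1 = public key file, $2 = target .ssh directory (caller-supplied).
authorize_key() {
    pub_key=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -e "$(cat "$pub_key")" "$ssh_dir/authorized_keys"; then
        cat "$pub_key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

On the Ambari server itself this would be called as authorize_key ~/.ssh/id_rsa.pub ~/.ssh; on a target host the public key has to be copied over first.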
Set NTPD
root@ubuntu17:~# apt-get install ntp
root@ubuntu17:~# update-rc.d ntp defaults
Set the host FQDN (in my case, for the Ambari server) with vi /etc/hosts, using the format IP, FQDN, alias:
192.168.0.172 ubuntu17.kenya.com ubuntu17
Edit host name
root@ubuntu17:~# vi /etc/hostname
ubuntu17.kenya.com
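A quick way to confirm that the hosts file really maps the IP to the FQDN is sketched below. hosts_maps_fqdn is a hypothetical helper name; it takes the hosts file, the IP, and the FQDN as arguments and only reads the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given hosts file
# has a line starting with the IP that also lists the FQDN.
# $1 = hosts file, $2 = IP address, $3 = FQDN (all caller-supplied).
hosts_maps_fqdn() {
    awk -v ip="$2" -v fqdn="$3" '
        # Only look at lines whose first field is the IP; commented
        # lines start with "#" and therefore never match.
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == fqdn) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

With the values from this post, the check would be hosts_maps_fqdn /etc/hosts 192.168.0.172 ubuntu17.kenya.com && echo ok.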
Firewall settings
root@ubuntu17:~# sudo ufw disable
The program 'setenforce' is currently not installed, so I installed it:
root@ubuntu17:~# apt install selinux-utils
Disable SELinux
root@ubuntu17:~# setenforce 0
Set UMASK
root@ubuntu17:~# umask 0022
root@ubuntu17:~# echo umask 0022 >> /etc/profile
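Running the echo above more than once adds the same line to /etc/profile each time. A minimal sketch of a repeat-safe version follows; ensure_umask_line is a hypothetical helper name and the profile path is passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper: add "umask 0022" to a profile file only if the
# exact line is not already present.
# $1 = profile file to update (caller-supplied, e.g. /etc/profile).
ensure_umask_line() {
    profile=$1
    line='umask 0022'
    # grep fails if the file is missing or the line is absent; in
    # either case the line is appended (creating the file if needed).
    grep -qxF -e "$line" "$profile" 2>/dev/null || echo "$line" >> "$profile"
}
```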
Download the Ambari repository file
root@ubuntu17:~# wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list
Grab the key
root@ubuntu17:~# apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
Update
root@ubuntu17:~# apt-get update
root@ubuntu17:~# apt-cache showpkg ambari-server
root@ubuntu17:~# apt-cache showpkg ambari-agent
root@ubuntu17:~# apt-cache showpkg ambari-metrics-assembly
Install Java (OpenJDK comes with JCE)
root@ubuntu17:~# sudo apt-get install openjdk-8-jdk
Install the Ambari server (here I used the root user)
root@ubuntu17:~# apt-get install ambari-server
root@ubuntu17:~# ambari-server setup
Set up as the root user with the default PostgreSQL database
root@ubuntu17:~# ambari-server start
Ambari Server 'start' completed successfully.
Get the FQDN for the Ambari server
root@ubuntu17:~# hostname -f
ubuntu17.kenya.com
So my Ambari URL will be http://ubuntu17.kenya.com:8080
See the attached screenshot. I got an error about THP (transparent huge pages), so I had to disable it:
root@ubuntu17:~/.ssh# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
root@ubuntu17:~/.ssh# echo never > /sys/kernel/mm/transparent_hugepage/enabled
root@ubuntu17:~/.ssh# echo never > /sys/kernel/mm/transparent_hugepage/defrag
root@ubuntu17:~/.ssh# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
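In the output above, the active THP mode is the word in square brackets. A minimal sketch for reading it in a script follows; thp_active is a hypothetical helper name that takes the status line as its argument. Note that a value written with echo lasts only until the next reboot, so the setting has to be reapplied at boot if it should persist.

```shell
#!/bin/sh
# Hypothetical helper: print the active transparent hugepage mode,
# i.e. the word enclosed in [brackets] in the kernel's status line.
# $1 = contents of /sys/kernel/mm/transparent_hugepage/enabled
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}
```

On a live host it would be called as thp_active "$(cat /sys/kernel/mm/transparent_hugepage/enabled)".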
Set up the MySQL database for Hive, Oozie, etc.
root@ubuntu17:~# sudo apt-get update
root@ubuntu17:~# sudo apt-get install mysql-server
root@ubuntu17:~# sudo mysql_secure_installation
root@ubuntu17:~# apt-get install -y libpostgresql-jdbc-java
root@ubuntu17:~# apt-get install libmysql-java
root@ubuntu17:~# ls /usr/share/java/mysql-connector-java.jar
Check that the MySQL connector is present at that location
root@ubuntu17:~# ls /usr/share/java/mysql-connector-java.jar
Create the Hive, Oozie, and Ranger databases in MySQL with the correct privileges
created database for hive successfully
created database for oozie successfully
created database for ranger successfully
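The exact statements used for the three databases are not shown in this post. As a rough sketch only, the usual pattern is one database plus one user with full privileges on it. db_setup_sql is a hypothetical helper name, and the database name, user name, and password are placeholders supplied by the caller, not values from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print MySQL statements that create a database and
# a user with full privileges on it. It only prints SQL; nothing is run.
# $1 = database name, $2 = user name, $3 = password (all placeholders).
db_setup_sql() {
    db=$1
    user=$2
    pass=$3
    cat <<EOF
CREATE DATABASE ${db};
CREATE USER '${user}'@'%' IDENTIFIED BY '${pass}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%';
FLUSH PRIVILEGES;
EOF
}
```

The output can be reviewed first and then piped into the client, for example db_setup_sql hive hive 'choose-a-password' | mysql -u root -p, and repeated for oozie and ranger.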
Run this command
# ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
Proceeded with the cluster installation
http://ubuntu17.kenya.com:8080
See attached screenshots
Created 10-15-2017 09:06 PM
subsequent screenshots
Created 10-16-2017 06:07 AM
I am waiting to see the outcome, because most of the steps I took are exactly the same.
Exceptions:
Created 10-17-2017 09:10 AM
While one of my posts is being moderated, I'll ask another question:
Why did you set up a custom SQL database for Hive, Oozie, etc.?