Created 10-10-2017 09:49 AM
I'm basing my install on http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list
I selected Spark2 and all of its required dependencies.
The following services fail with an error.
I receive the following error when manually starting the History Server:
INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,565 logger.py:75 - Testing the JVM's JCE policy to see it if supports an unlimited key length.
INFO 2017-10-10 04:57:10,681 Hardware.py:176 - Some mount points were ignored: /dev, /run, /, /dev/shm, /run/lock, /sys/fs/cgroup, /boot, /home, /run/user/108, /run/user/1007, /run/user/1005, /run/user/1010, /run/user/1011, /run/user/1012, /run/user/1001
INFO 2017-10-10 04:57:10,682 Controller.py:320 - Sending Heartbeat (id = 4066)
INFO 2017-10-10 04:57:10,688 Controller.py:333 - Heartbeat response received (id = 4067)
INFO 2017-10-10 04:57:10,688 Controller.py:342 - Heartbeat interval is 1 seconds
INFO 2017-10-10 04:57:10,688 Controller.py:380 - Updating configurations from heartbeat
INFO 2017-10-10 04:57:10,688 Controller.py:389 - Adding cancel/execution commands
INFO 2017-10-10 04:57:10,688 Controller.py:475 - Waiting 0.9 for next heartbeat
INFO 2017-10-10 04:57:11,589 Controller.py:482 - Wait for next heartbeat over
WARNING 2017-10-10 04:57:22,205 base_alert.py:138 - [Alert][namenode_hdfs_capacity_utilization] Unable to execute alert. division by zero
INFO 2017-10-10 04:57:27,060 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-10 04:57:27,071 Controller.py:249 - Adding 1 commands. Heartbeat id = 4085
INFO 2017-10-10 04:57:27,071 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role SPARK2_JOBHISTORYSERVER for service SPARK2 of cluster vqcluster to the queue.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:238 - Executing command with id = 68-0, taskId = 307 for role = SPARK2_JOBHISTORYSERVER of cluster vqcluster.
INFO 2017-10-10 04:57:27,081 ActionQueue.py:279 - Command execution metadata - taskId = 307, retry enabled = False, max retry duration (sec) = 0, log_output = True
WARNING 2017-10-10 04:57:27,083 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-307.txt'
INFO 2017-10-10 04:57:32,563 PythonExecutor.py:130 - Command ['/usr/bin/python', u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py', u'START', '/var/lib/ambari-agent/data/command-307.json', u'/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package', '/var/lib/ambari-agent/data/structured-out-307.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', ''] failed with exitcode=1
INFO 2017-10-10 04:57:32,577 log_process_information.py:40 - Command 'export COLUMNS=9999 ; ps faux' returned 0.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
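The agent log only shows that the task exited with code 1; the actual stack trace usually lands in the per-task files the agent writes (taskId 307 above) and in the Spark2 daemon log. A quick sketch of where to look; the /var/log/spark2 location is an assumption based on HDP defaults:

# per-task stdout/stderr captured by the Ambari agent (307 is the taskId from the log above)
cat /var/lib/ambari-agent/data/output-307.txt
cat /var/lib/ambari-agent/data/errors-307.txt
# Spark2 History Server daemon log (assumed default HDP location)
ls -lt /var/log/spark2/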
Created 10-15-2017 07:03 AM
Sorry to hear you are encountering all these problems. Could you tell me the
HDP, Ambari, and OS type and version you are trying to install?
I will try to guide you.
Created 10-16-2017 09:06 AM
Here we go: all the services are up and running without major issues!
Did you install Java with the JCE following the standard procedure? I have seen your comments, but at times we overlook the obvious, well-documented steps. I followed the usual steps and everything is up and running.
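One quick way to verify the unlimited-strength JCE policy is actually active (a sketch; assumes the HDP-installed JDK at /usr/jdk64/jdk1.8.0_112, and a result of 2147483647 means unlimited):

# prints the max allowed AES key length; 2147483647 indicates the unlimited policy is in place
/usr/jdk64/jdk1.8.0_112/bin/jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"));'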
As I have demonstrated that the procedure is valid, this should answer your question.
Created 10-16-2017 09:21 AM
I'll try again following your steps exactly, and hope for the best.
Created 10-16-2017 10:38 AM
I am positive it will work! And once it does, don't forget to accept my answer; that way other HCC users can quickly find the solution when they encounter the same issue.
Please let me know.
Created on 10-17-2017 12:26 PM - edited 08-17-2019 07:54 PM
I installed all services on one node and hit a few issues.
After finishing the install stage,
the Hive Metastore fails to start with the same issue as before:
INFO 2017-10-17 01:55:04,537 RecoveryManager.py:255 - SPARK2_THRIFTSERVER needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:04,537 RecoveryManager.py:255 - SPARK_THRIFTSERVER needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:04,538 RecoveryManager.py:255 - HIVE_METASTORE needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:09,219 ClusterConfiguration.py:119 - Updating cached configurations for cluster vqcluster
INFO 2017-10-17 01:55:09,252 RecoveryManager.py:717 - Received EXECUTION_COMMAND (START), desired state of HIVE_METASTORE to STARTED
INFO 2017-10-17 01:55:09,253 Controller.py:249 - Adding 1 commands. Heartbeat id = 62015
INFO 2017-10-17 01:55:09,253 ActionQueue.py:113 - Adding EXECUTION_COMMAND for role HIVE_METASTORE for service HIVE of cluster vqcluster to the queue.
INFO 2017-10-17 01:55:09,288 ActionQueue.py:238 - Executing command with id = 43-0, taskId = 276 for role = HIVE_METASTORE of cluster vqcluster.
INFO 2017-10-17 01:55:09,288 ActionQueue.py:279 - Command execution metadata - taskId = 276, retry enabled = False, max retry duration (sec) = 0, log_output = True
INFO 2017-10-17 01:55:09,289 CustomServiceOrchestrator.py:265 - Generating the JCEKS file: roleCommand=START and taskId = 276
INFO 2017-10-17 01:55:09,289 CustomServiceOrchestrator.py:243 - Identifying config hive-site for CS:
INFO 2017-10-17 01:55:09,289 CustomServiceOrchestrator.py:288 - provider_path=jceks://file/var/lib/ambari-agent/cred/conf/hive/hive-site.jceks
INFO 2017-10-17 01:55:09,289 CustomServiceOrchestrator.py:295 - ('/usr/jdk64/jdk1.8.0_112/bin/java', '-cp', '/var/lib/ambari-agent/cred/lib/*', 'org.apache.hadoop.security.alias.CredentialShell', 'create', u'javax.jdo.option.ConnectionPassword', '-value', [PROTECTED], '-provider', 'jceks://file/var/lib/ambari-agent/cred/conf/hive/hive-site.jceks')
WARNING 2017-10-17 01:55:09,318 CommandStatusDict.py:128 - [Errno 2] No such file or directory: '/var/lib/ambari-agent/data/output-276.txt'
INFO 2017-10-17 01:55:09,694 CustomServiceOrchestrator.py:297 - cmd_result = 0
INFO 2017-10-17 01:55:14,801 RecoveryManager.py:255 - SPARK2_THRIFTSERVER needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:14,801 RecoveryManager.py:255 - SPARK_THRIFTSERVER needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:14,801 RecoveryManager.py:255 - HIVE_METASTORE needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2017-10-17 01:55:15,375 PythonExecutor.py:130 - Command ['/usr/bin/python', u'/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_metastore.py', u'START', '/var/lib/ambari-agent/data/command-276.json', u'/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package', '/var/lib/ambari-agent/data/structured-out-276.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', ''] failed with exitcode=1
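As with the History Server, the agent only reports exitcode=1 here; the real cause is usually in the per-task error file (taskId 276 above) or the metastore's own log. A sketch of where to look; /var/log/hive is an assumed HDP default:

cat /var/lib/ambari-agent/data/errors-276.txt
tail -n 100 /var/log/hive/hivemetastore.log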
The Spark2 Thrift Server and Spark (1) Thrift Server can start but keep going down:
ERROR 2017-10-17 01:57:36,029 script_alert.py:123 - [Alert][spark_thriftserver_status] Failed with result CRITICAL: ['Connection failed on host ambari-master.test.com:10015 (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/SPARK/1.2.1/package/scripts/alerts/alert_spark_thrift_port.py", line 143, in execute
    Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
ExecutionFailed: Execution of '! beeline -u 'jdbc:hive2://ambari-master.test.com:10015/default' transportMode=binary -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL'' returned 1.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://ambari-master.test.com:10015/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
Error: Could not open client transport with JDBC Uri: jdbc:hive2://ambari-master.test.com:10015/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0) )']
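The alert simply probes the Thrift port with beeline, so the same check can be run by hand to separate "process not running" from "port blocked" (a sketch; the host and port are taken from the log above, and beeline is assumed to be on the PATH):

# is anything listening on the Spark Thrift port?
netstat -tlnp | grep 10015
# the same probe the alert runs
beeline -u 'jdbc:hive2://ambari-master.test.com:10015/default' -e 'show databases;'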
Our main requirement is Spark2, so I'll try another clean install with only Spark2 and its dependencies, hoping not to hit any more issues.
Created 10-17-2017 09:33 AM
Hive needs a metastore database to store structural information about the various tables and partitions in the warehouse.
Oozie stores its workflow/scheduler details in a relational database.
You could use Postgres instead of MySQL, and the creation is even easier; see below.
Hive database
In the example below, the database and user are both "hive".
# su - postgres
postgres@ubuntu17:~$ psql
psql (9.5.9)
Type "help" for help.
postgres=# DROP DATABASE if exists hive;
postgres=# CREATE USER hive PASSWORD 'hive';
postgres=# CREATE DATABASE hive OWNER hive;
postgres=# grant all privileges on database hive to hive;
postgres=# \q
Oozie database
In the example below, the database and user are both "oozie".
postgres=# DROP DATABASE if exists oozie;
postgres=# CREATE USER oozie PASSWORD 'oozie';
postgres=# CREATE DATABASE oozie OWNER oozie;
postgres=# grant all privileges on database oozie to oozie;
postgres=# \q
After the above has succeeded, use the hive and oozie details during the Ambari UI setup to configure these components. Can you let me know if the installation succeeds?
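Before going back to the Ambari UI, it is worth confirming that the new accounts can actually log in (a sketch; assumes Postgres accepts password logins on localhost, which may require an md5 entry in pg_hba.conf):

psql -h localhost -U hive -d hive -c '\conninfo'
psql -h localhost -U oozie -d oozie -c '\conninfo'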
Created 10-17-2017 10:44 AM
The installation partially succeeded; I have a large response waiting for moderation (probably due to size or an attached image).
I'll ask the question again: why is ambari-server setup -s not enough, and why is it required to configure the SQL database manually?
Created 10-17-2017 02:36 PM
Is this the correct URL to your Hive database? jdbc:hive2://ambari-master.test.com:10015/default
I see the error "Connection refused ... Invalid URL ... returned 1" in your log.
Can you walk me through the setup of the databases for hive and oozie? Was it with MySQL or Postgres?
ambari-server setup -s (-s = silent install) should work with the embedded Postgres, so the reason it didn't work in your case could be one of the standard OS preparations that was skipped.
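As a quick sanity check on the silent setup, you can confirm the server and its embedded database are up (a sketch; "ambari"/"bigdata" are the documented defaults for the embedded Postgres, assuming you haven't changed them):

ambari-server status
# the embedded setup stores its data in the "ambari" database (default password "bigdata")
PGPASSWORD=bigdata psql -h localhost -U ambari -d ambari -c '\conninfo'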
Created 10-17-2017 02:44 PM
The URL is correct (I changed the hostname in the comment, but the actual URL is valid, since all the other systems are working).
Do I have to make some special configuration to the SQL DB?
I have done:
ambari-server setup -s
sudo apt-get update
sudo apt-get install mysql-server -y
sudo mysql_secure_installation
apt-get install libpostgresql-jdbc-java -y
apt-get install libmysql-java -y
ls /usr/share/java/mysql-connector-java.jar
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
And then I tried to create the cluster via the web UI; everything goes without error until starting Hive (the last stage).
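For reference, the steps above install MySQL but never create a Hive metastore database or user in it; if Hive is pointed at an "Existing MySQL" database, that database has to exist first. A minimal sketch of the missing step, mirroring the example names and password from the Postgres instructions earlier in the thread (not known values from this cluster):

mysql -u root -p <<'SQL'
-- example database/user/password, matching the earlier Postgres example
CREATE DATABASE hive;
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
FLUSH PRIVILEGES;
SQL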
Created 10-17-2017 02:51 PM
Can you enlighten me? I see a mixture of MySQL and Postgres commands in your new posting. Let's resolve your issue with ONLY a Postgres installation, because the mixture looks confusing.
It won't affect any service except Hive and Oozie, or Ranger if you intend to add Ranger for authorization, authentication, and administration of security policies.
Created 10-17-2017 03:01 PM
I tried multiple setups; the only one that (partially) worked was installing all available services, still with the Metastore issue (but Spark2 worked: I successfully added more nodes and tested with spark-submit).
That's why I might have a mixture of MySQL and Postgres; I tried every combination I could think of.
Then I tried to install only Spark2 with its dependencies, using the following steps:
apt-get install ntp -y
update-rc.d ntp defaults
sudo ufw disable
apt install selinux-utils -y
setenforce 0
umask 0022
echo umask 0022 >> /etc/profile
wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.5.2.0/ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
apt-get install ambari-server -y
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
(see the persistence note after this list)
ambari-server setup -s
ambari-server start
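One caveat on the transparent-hugepage step above: writes under /sys do not survive a reboot. A sketch of one way to persist it (assumes /etc/rc.local is enabled on this host; a systemd unit would work equally well):

# add these two lines before the final "exit 0" in /etc/rc.local
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag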
In the web UI I selected only the Spark2 service and then clicked Yes for all its dependencies.
This gives me an error at the Hive Metastore start stage.
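Two hedged checks before retrying the Metastore start, assuming Hive was again pointed at an existing MySQL as in the earlier attempt (the names and password below are the earlier examples, not known values):

# the resources directory is where ambari-server setup --jdbc-db normally copies the driver
ls -l /var/lib/ambari-server/resources/mysql-connector-java.jar
# confirm the metastore database and user actually exist and can log in
mysql -u hive -phive -e 'show databases;'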