ambari-agent connection refused after host reboot

New Contributor

I have a single-node, Ambari-managed cluster that was working correctly: the server, host, and components all started automatically when the system rebooted.


I've changed to a public IP and a different hostname, and after solving the FQDN problems, auto start no longer works on reboot. The Ambari server and agent start automatically, but the heartbeat is lost and the ambari-agent logs show "connection refused". If I manually restart ambari-agent, the connection succeeds and I can start services.


This is the Ambari server UI just after rebooting:

108906-1558694191895.png


The tail of the ambari-agent log shows the following:

ERROR 2019-05-24 12:28:47,235 script_alert.py:119 - [Alert][hive_metastore_process] Failed with result CRITICAL: ['Metastore on bigdata.es failed (Traceback (most recent call last):\n  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_metastore.py", line 200, in execute\n    timeout_kill_strategy=TerminateStrategy.KILL_PROCESS_TREE,\n  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__\n    self.env.run()\n  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run\n    self.run_action(resource, action)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action\n    provider_action()\n  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run\n    tries=self.resource.tries, try_sleep=self.resource.try_sleep)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner\n    result = function(command, **kwargs)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call\n    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper\n    result = _call(command, **kwargs_copy)\n  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call\n    raise ExecutionFailed(err_msg, code, out, err)\nExecutionFailed: Execution of \'export HIVE_CONF_DIR=\'/usr/hdp/current/hive-metastore/conf/conf.server\' ; hive --hiveconf hive.metastore.uris=thrift://bigdata.es:9083                 --hiveconf hive.metastore.client.connect.retry.delay=1                 --hiveconf hive.metastore.failure.retries=1                 --hiveconf hive.metastore.connect.retries=1                 --hiveconf hive.metastore.client.socket.timeout=14  
               --hiveconf hive.execution.engine=mr -e \'show databases;\'\' returned 1. Logging initialized using configuration in file:/etc/hive/2.6.3.0-71/0/conf.server/hive-log4j.properties\nException in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient\n\tat org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:547)\n\tat org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)\n\tat org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.util.RunJar.run(RunJar.java:233)\n\tat org.apache.hadoop.util.RunJar.main(RunJar.java:148)\nCaused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient\n\tat org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1566)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)\n\tat org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3510)\n\tat org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3542)\n\tat org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:528)\n\t... 
8 more\nCaused by: java.lang.reflect.InvocationTargetException\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\n\tat sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:423)\n\tat org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)\n\t... 14 more\nCaused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Conexi\xc3\xb3n rehusada (Connection refused)\n\tat org.apache.thrift.transport.TSocket.open(TSocket.java:226)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)\n\tat org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\n\tat sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:423)\n\tat org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)\n\tat org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3510)\n\tat 
org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3542)\n\tat org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:528)\n\tat org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)\n\tat org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.util.RunJar.run(RunJar.java:233)\n\tat org.apache.hadoop.util.RunJar.main(RunJar.java:148)\nCaused by: java.net.ConnectException: Conexi\xc3\xb3n rehusada (Connection refused)\n\tat java.net.PlainSocketImpl.socketConnect(Native Method)\n\tat java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)\n\tat java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)\n\tat java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)\n\tat java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)\n\tat java.net.Socket.connect(Socket.java:589)\n\tat org.apache.thrift.transport.TSocket.open(TSocket.java:221)\n\t... 22 more\n)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)\n\tat org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)\n\t... 19 more\n)']
INFO 2019-05-24 12:28:58,095 logger.py:71 - call[['test', '-w', '/dev']] {'sudo': True, 'quiet': False, 'timeout': 5}
INFO 2019-05-24 12:28:58,108 logger.py:71 - call returned (0, '')
INFO 2019-05-24 12:28:58,119 logger.py:71 - call[['test', '-w', '/']] {'sudo': True, 'quiet': False, 'timeout': 5}
INFO 2019-05-24 12:28:58,131 logger.py:71 - call returned (0, '')
INFO 2019-05-24 12:28:58,143 logger.py:71 - call[['test', '-w', '/Datos']] {'sudo': True, 'quiet': False, 'timeout': 5}
INFO 2019-05-24 12:28:58,154 logger.py:71 - call returned (0, '')
ERROR 2019-05-24 12:29:39,995 script_alert.py:119 - [Alert][hive_webhcat_server_status] Failed with result CRITICAL: ['Connection failed to http://bigdata.es:50111/templeton/v1/status?user.name=ambari-qa + \nTraceback (most recent call last):\n  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py", line 190, in execute\n    url_response = urllib2.urlopen(query_url, timeout=connection_timeout)\n  File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen\n    return opener.open(url, data, timeout)\n  File "/usr/lib/python2.7/urllib2.py", line 429, in open\n    response = self._open(req, data)\n  File "/usr/lib/python2.7/urllib2.py", line 447, in _open\n    \'_open\', req)\n  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain\n    result = func(*args)\n  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open\n    return self.do_open(httplib.HTTPConnection, req)\n  File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open\n    raise URLError(err)\nURLError: <urlopen error [Errno 111] Conexi\xc3\xb3n rehusada>\n']
ERROR 2019-05-24 12:29:40,000 script_alert.py:119 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://bigdata.es:8042/ws/v1/node/info (Traceback (most recent call last):\n  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n    url_response = urllib2.urlopen(query, timeout=connection_timeout)\n  File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen\n    return opener.open(url, data, timeout)\n  File "/usr/lib/python2.7/urllib2.py", line 429, in open\n    response = self._open(req, data)\n  File "/usr/lib/python2.7/urllib2.py", line 447, in _open\n    \'_open\', req)\n  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain\n    result = func(*args)\n  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open\n    return self.do_open(httplib.HTTPConnection, req)\n  File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open\n    raise URLError(err)\nURLError: <urlopen error [Errno 111] Conexi\xc3\xb3n rehusada>\n)']
INFO 2019-05-24 13:30:00,155 main.py:96 - loglevel=logging.INFO
INFO 2019-05-24 13:30:00,157 main.py:96 - loglevel=logging.INFO
INFO 2019-05-24 13:30:00,157 main.py:96 - loglevel=logging.INFO
INFO 2019-05-24 13:30:00,159 DataCleaner.py:39 - Data cleanup thread started
INFO 2019-05-24 13:30:00,164 DataCleaner.py:120 - Data cleanup started
INFO 2019-05-24 13:30:00,295 DataCleaner.py:122 - Data cleanup finished
INFO 2019-05-24 13:30:00,314 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2019-05-24 13:30:00,314 main.py:132 - Newloglevel=logging.DEBUG
INFO 2019-05-24 13:30:00,314 main.py:405 - Connecting to Ambari server at https://bigdata.es:8440 (10.61.2.10)
DEBUG 2019-05-24 13:30:00,314 NetUtil.py:110 - Trying to connect to https://bigdata.es:8440
INFO 2019-05-24 13:30:00,315 NetUtil.py:67 - Connecting to https://bigdata.es:8440/ca
WARNING 2019-05-24 13:30:00,317 NetUtil.py:98 - Failed to connect to https://bigdata.es:8440/ca due to [Errno 111] Conexión rehusada
WARNING 2019-05-24 13:30:00,317 NetUtil.py:121 - Server at https://bigdata.es:8440 is not reachable, sleeping for 10 seconds...


And the ambari-agent.ini configuration is the following:

[server]
hostname = bigdata.es
url_port = 8440
secured_url_port = 8441
connect_retry_delay = 10
max_reconnect_retry_delay = 30

[agent]
logdir = /var/log/ambari-agent
piddir = /var/run/ambari-agent
prefix = /var/lib/ambari-agent/data
loglevel = DEBUG
data_cleanup_interval = 86400
data_cleanup_max_age = 2592000
data_cleanup_max_size_mb = 100
ping_port = 8670
cache_dir = /var/lib/ambari-agent/cache
tolerate_download_failures = true
run_as_user = root
parallel_execution = 0
alert_grace_period = 5
status_command_timeout = 5
alert_kinit_timeout = 14400000
system_resource_overrides = /etc/resource_overrides

[security]
keysdir = /var/lib/ambari-agent/keys
server_crt = ca.crt
passphrase_env_var_name = AMBARI_PASSPHRASE
ssl_verify_cert = 0
credential_lib_dir = /var/lib/ambari-agent/cred/lib
credential_conf_dir = /var/lib/ambari-agent/cred/conf
credential_shell_cmd = org.apache.hadoop.security.alias.CredentialShell
force_https_protocol = PROTOCOL_TLSv1_2

[services]
pidlookuppath = /var/run/

[heartbeat]
state_interval_seconds = 60
dirs = /etc/hadoop,/etc/hadoop/conf,/etc/hbase,/etc/hcatalog,/etc/hive,/etc/oozie,
        /etc/sqoop,
        /var/run/hadoop,/var/run/zookeeper,/var/run/hbase,/var/run/templeton,/var/run/oozie,
        /var/log/hadoop,/var/log/zookeeper,/var/log/hbase,/var/run/templeton,/var/log/hive
log_lines_count = 300
idle_interval_min = 1
idle_interval_max = 10

[logging]
syslog_enabled = 0
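
For reference, the server endpoint the agent will try can be read out of this file programmatically. The following is a minimal Python sketch, assuming the standard-library configparser accepts the file as posted. The sample text is an abbreviated copy of the [server] section above, and `server_endpoints` is a hypothetical helper name, not part of Ambari.

```python
import configparser

# Abbreviated copy of the [server] section from the posted ambari-agent.ini.
SAMPLE_INI = """\
[server]
hostname = bigdata.es
url_port = 8440
secured_url_port = 8441
connect_retry_delay = 10
max_reconnect_retry_delay = 30
"""

def server_endpoints(ini_text):
    """Return (hostname, url_port, secured_url_port) from ambari-agent.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    server = parser["server"]
    return (
        server["hostname"],
        server.getint("url_port"),
        server.getint("secured_url_port"),
    )

# Example: server_endpoints(SAMPLE_INI)
```

Comparing the returned hostname with the output of hostname -f is a quick way to spot a stale name left over from before the change.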


If I run sudo ambari-agent restart, the agent connects and I can start services in the Ambari server UI.

108943-1558694501380.png
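
One way to check whether the server port is simply not listening yet when the agent starts is to probe it from the agent host right after boot. The following is a minimal Python sketch under that assumption. The host and port in the usage comment come from the agent log above, while `port_open` and `wait_for_port` are hypothetical helpers, not part of Ambari.

```python
import socket
import time

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False

def wait_for_port(host, port, attempts=30, delay=10.0):
    """Poll host:port; return the 1-based attempt that succeeded, or None."""
    for attempt in range(1, attempts + 1):
        if port_open(host, port):
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    return None

# Example (values taken from the agent log; adjust for your host):
# wait_for_port("bigdata.es", 8440)
```

If the probe only succeeds after several attempts, the server was still starting when the agent first tried to connect. If it never succeeds, the name or port the agent uses is worth checking first.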

14 REPLIES

Re: ambari-agent connection refused after host reboot

Mentor

@Adrián Gil

Changing the hostname in /etc/hosts is not enough. These are the steps you MUST perform to get your cluster working again.

In Ambari Web > Dashboard, stop all services.

Stop ambari-server and ambari-agent on all hosts.

# ambari-server stop
# ambari-agent stop

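Create a file named host_names_changes.json containing the hostname changes.

Contents of host_names_changes.json (the top-level key is the cluster name; each entry maps an old hostname to its new hostname):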

{
  "bigdata" : {
      "old_name_here" : "bigdata-es"
  }
}
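
The file can also be generated rather than typed by hand. The following is a minimal Python sketch, assuming the structure shown above (cluster name at the top level, old hostname mapped to new hostname). `build_host_name_changes` is a hypothetical helper name.

```python
import json

def build_host_name_changes(cluster_name, renames):
    """Return JSON text of the form {cluster_name: {old_host: new_host, ...}}."""
    for old, new in renames.items():
        if old == new:
            # A rename to the same name cannot be what was intended.
            raise ValueError(f"old and new host names are identical: {old}")
    return json.dumps({cluster_name: dict(renames)}, indent=2)

# Example, using the placeholder names from the post:
# text = build_host_name_changes("bigdata", {"old_name_here": "bigdata-es"})
# with open("host_names_changes.json", "w") as fh:
#     fh.write(text)
```

Rejecting identical old and new names up front catches the simplest mistake before ambari-server is involved.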

From the directory where you saved the file, run the following as root:

# ambari-server update-host-names host_names_changes.json


After successful execution, Ambari updates all the necessary records, but it does not update the old hostnames in ambari.properties.


Change these 3 values in /etc/ambari-server/conf/ambari.properties:

server.jdbc.rca.url=
server.jdbc.url=jdbc=
server.jdbc.hostname=
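
If you prefer to script this edit, the following minimal Python sketch rewrites the hostname only in the three properties listed above and leaves every other line alone. `replace_host_in_properties` is a hypothetical helper, and the hostnames in the usage comment are made-up examples. The actual property values depend on your installation.

```python
# The three keys named in the post above.
JDBC_KEYS = ("server.jdbc.rca.url", "server.jdbc.url", "server.jdbc.hostname")

def replace_host_in_properties(text, old_host, new_host):
    """Swap old_host for new_host, but only in the three JDBC properties."""
    updated = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in JDBC_KEYS:
            # Plain substring replacement: check the result if one host name
            # is a prefix of another.
            line = key + sep + value.replace(old_host, new_host)
        updated.append(line)
    return "\n".join(updated)

# Example with hypothetical values:
# replace_host_in_properties(
#     "server.jdbc.hostname=old.example.com", "old.example.com", "new.example.com")
```

Back up ambari.properties before writing the result to it.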

You will also need to update the database grants for Hive, Oozie, etc., according to your installation:

# Hive

grant all privileges on hive.* to 'hive'@'bigdata-es' identified by 'hive_password';
grant all privileges on hive.* to 'hive'@'bigdata-es' with grant option;

# Oozie (the user names from here on are assumed to match the database names; use the ones from your installation)

grant all privileges on oozie.* to 'oozie'@'bigdata-es' identified by 'oozie_password';
grant all privileges on oozie.* to 'oozie'@'bigdata-es' with grant option;

# Ranger

grant all privileges on ranger.* to 'ranger'@'bigdata-es' identified by 'ranger_password';
grant all privileges on ranger.* to 'ranger'@'bigdata-es' with grant option;

# Rangerkms

grant all privileges on rangerkms.* to 'rangerkms'@'bigdata-es' identified by 'rangerkms_password';
grant all privileges on rangerkms.* to 'rangerkms'@'bigdata-es' with grant option;


After doing the above, you can restart Ambari and the ambari-agent, provided you have also changed the hostname in ambari-agent.ini.

Start ambari-server and ambari-agent on all hosts.

# ambari-server start
# ambari-agent start

Before starting all the components from Ambari, make sure Hive, Oozie, Ranger, etc. have the correct hostname in the Ambari UI Configs!

Happy Hadooping


Re: ambari-agent connection refused after host reboot

Community Manager

The above question and the reply thread below were originally posted in the Community Help Track. On Fri May 24 09:14 PDT 2019, a member of the HCC moderation staff moved it to the Cloud & Operations Track. The Community Help Track is intended for questions about using the HCC site itself.

Re: ambari-agent connection refused after host reboot

New Contributor

Thanks for your detailed answer @Geoffrey Shelton Okot


I've tried that and changed all the manually changeable values you mentioned (including setting the same hostname it had before the IP change).


With this last option, I can get it running correctly with 'ambari-agent restart', but not after a system 'sudo reboot'.

When I tried a different hostname and made the changes you described, I got an error when running

 ambari-server update-host-names host_names_changes.json 

And ambari-server.log shows the following trace:

27 may 2019 09:59:38,771 ERROR [main] AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason: Local Exception Stack: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists. Error Code: 0 Call: UPDATE hosts SET host_name = ? WHERE (host_id = ?)         bind => [2 parameters bound]         at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:340)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.processExceptionForCommError(DatabaseAccessor.java:1620)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeDirectNoSelect(DatabaseAccessor.java:900)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeNoSelect(DatabaseAccessor.java:964)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:633)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatch(ParameterizedSQLBatchWritingMechanism.java:149)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatchedStatements(ParameterizedSQLBatchWritingMechanism.java:134)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.writesCompleted(DatabaseAccessor.java:1845)         at org.eclipse.persistence.internal.sessions.AbstractSession.writesCompleted(AbstractSession.java:4300)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.writesCompleted(UnitOfWorkImpl.java:5592)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.acquireWriteLocks(UnitOfWorkImpl.java:1646)         at 
org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitTransactionAfterWriteChanges(UnitOfWorkImpl.java:1614)         at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.commitRootUnitOfWork(RepeatableWriteUnitOfWork.java:285)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitAndResume(UnitOfWorkImpl.java:1169)         at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:134)         at org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.invoke(AmbariJpaLocalTxnInterceptor.java:153)         at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:72)         at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:52)         at org.apache.ambari.server.orm.dao.HostDAO$$EnhancerByGuice$$3e1957eb.merge(<generated>)         at org.apache.ambari.server.update.HostUpdateHelper.updateHostsInDB(HostUpdateHelper.java:407)         at org.apache.ambari.server.update.HostUpdateHelper.main(HostUpdateHelper.java:548) Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists.         
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)         at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:559)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeDirectNoSelect(DatabaseAccessor.java:892)         ... 18 more 27 may 2019 09:59:38,781 ERROR [main] AmbariJpaLocalTxnInterceptor:188 - [DETAILED ERROR] Internal exception (1) : org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists.         
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)         at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:559)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeDirectNoSelect(DatabaseAccessor.java:892)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeNoSelect(DatabaseAccessor.java:964)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:633)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatch(ParameterizedSQLBatchWritingMechanism.java:149)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatchedStatements(ParameterizedSQLBatchWritingMechanism.java:134)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.writesCompleted(DatabaseAccessor.java:1845)         at org.eclipse.persistence.internal.sessions.AbstractSession.writesCompleted(AbstractSession.java:4300)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.writesCompleted(UnitOfWorkImpl.java:5592)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.acquireWriteLocks(UnitOfWorkImpl.java:1646)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitTransactionAfterWriteChanges(UnitOfWorkImpl.java:1614)         at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.commitRootUnitOfWork(RepeatableWriteUnitOfWork.java:285)         
at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitAndResume(UnitOfWorkImpl.java:1169)         at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:134)         at org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.invoke(AmbariJpaLocalTxnInterceptor.java:153)         at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:72)         at com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:52)         at org.apache.ambari.server.orm.dao.HostDAO$$EnhancerByGuice$$3e1957eb.merge(<generated>)         at org.apache.ambari.server.update.HostUpdateHelper.updateHostsInDB(HostUpdateHelper.java:407)         at org.apache.ambari.server.update.HostUpdateHelper.main(HostUpdateHelper.java:548) 27 may 2019 09:59:38,781 ERROR [main] HostUpdateHelper:564 - Unexpected error, host names update failed javax.persistence.RollbackException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists. Error Code: 0 Call: UPDATE hosts SET host_name = ? WHERE (host_id = ?)         
bind => [2 parameters bound]         at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:159)         at org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.invoke(AmbariJpaLocalTxnInterceptor.java:153)         at org.apache.ambari.server.update.HostUpdateHelper.updateHostsInDB(HostUpdateHelper.java:407)         at org.apache.ambari.server.update.HostUpdateHelper.main(HostUpdateHelper.java:548) Caused by: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists. Error Code: 0 Call: UPDATE hosts SET host_name = ? WHERE (host_id = ?)         bind => [2 parameters bound]         at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:340)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.processExceptionForCommError(DatabaseAccessor.java:1620)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeDirectNoSelect(DatabaseAccessor.java:900)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeNoSelect(DatabaseAccessor.java:964)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:633)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatch(ParameterizedSQLBatchWritingMechanism.java:149)         at org.eclipse.persistence.internal.databaseaccess.ParameterizedSQLBatchWritingMechanism.executeBatchedStatements(ParameterizedSQLBatchWritingMechanism.java:134)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.writesCompleted(DatabaseAccessor.java:1845)         at 
org.eclipse.persistence.internal.sessions.AbstractSession.writesCompleted(AbstractSession.java:4300)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.writesCompleted(UnitOfWorkImpl.java:5592)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.acquireWriteLocks(UnitOfWorkImpl.java:1646)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitTransactionAfterWriteChanges(UnitOfWorkImpl.java:1614)         at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.commitRootUnitOfWork(RepeatableWriteUnitOfWork.java:285)         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.commitAndResume(UnitOfWorkImpl.java:1169)         at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:134)         ... 3 more Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"   Detail: Key (host_name)=(bigdata.es) already exists.         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)         at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:559)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)         at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)         at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeDirectNoSelect(DatabaseAccessor.java:892)

Re: ambari-agent connection refused after host reboot

Mentor

@Adrián Gil

What values are in your JSON file?

Can you share a redacted version of your /etc/hosts and the output of

$ hostname -f

The error below shows that you are trying to change to a value which is no different from the old value :-)

ERROR: duplicate key value violates unique constraint "uq_hosts_host_name"
Detail: Key (host_name)=(bigdata.es) already exists. Error Code: 0
Call: UPDATE hosts SET host_name = ? WHERE (host_id = ?) bind => [2 parameters bound]
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)

Please report back.
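
PostgreSQL reports that the target name already exists in the hosts table, so it can help to sanity-check the mapping before running update-host-names. The following is a minimal Python sketch, assuming you already have the list of hostnames Ambari currently knows about. How that list is obtained is left out, and `check_renames` and `existing_hosts` are hypothetical names.

```python
def check_renames(renames, existing_hosts):
    """Return a list of problems that would make a rename fail or be a no-op."""
    problems = []
    existing = set(existing_hosts)
    for old, new in renames.items():
        if old == new:
            problems.append(f"{old}: new name is identical to old name")
        elif new in existing:
            # This is the case behind a unique-constraint violation.
            problems.append(f"{new}: already registered as a host")
        if old not in existing:
            problems.append(f"{old}: not a known host")
    return problems

# Example:
# check_renames({"bigdata.es": "bigdatapruebas.es"}, ["bigdata.es"])
```

An empty list means none of these three problems was found; it does not guarantee that the update will succeed.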


Re: ambari-agent connection refused after host reboot

New Contributor

@Geoffrey Shelton Okot


Here are the outputs of hostname -f and /etc/hosts.

bigdata@bigdata:~$ hostname -f
bigdatapruebas.es
bigdata@bigdata:~$ sudo cat /etc/hosts
10.61.2.10              bigdatapruebas.es bigdata.adurizenergia.es bigdata.es bigdata

::1                     localhost ip6-localhost ip6-loopback
ff02::1                 ip6-allnodes
ff02::02                ip6-allrouters
bigdata@bigdata:~$

The JSON file is the following:

{
  "bigdata" : {
      "bigdata.es" : "bigdatapruebas.es"
  }
}


It did finally succeed, but as shown in the following picture, the heartbeat is still lost after rebooting the system with sudo reboot. I changed the values from my earlier attempt to make sure the names are completely different.

108915-1558972173890.png

After running sudo ambari-agent restart, the host is recognized again and we see the stopped status rather than the heartbeat-lost one.

108924-1558972006404.png

Re: ambari-agent connection refused after host reboot

Mentor

@Adrián Gil

You succeeded but are encountering heartbeat lost because your /etc/hosts entry is wrong :-) I really can't understand how you can connect :-) The entry below is wrong and shouldn't resolve, which somehow explains why you had difficulty running the update with host_names_changes.json.

bigdata@bigdata:~$ sudo cat /etc/hosts
10.61.2.10              bigdatapruebas.es bigdata.adurizenergia.es bigdata.es bigdata

The /etc/hosts entry should match exactly the output of

$ hostname -f
bigdatapruebas.es

So your /etc/hosts should look like this:

$ sudo cat /etc/hosts
10.61.2.10       bigdatapruebas.es 

But if you had an FQDN like bigdata.endesa.es, the entry could take the form IP FQDN ALIAS:

$ sudo cat /etc/hosts
IP                FQDN                  ALIAS
------------------------------------------------
10.61.2.10       bigdata.endesa.es      bigdata

With the above entry in /etc/hosts, you can access Ambari successfully in 2 ways:

http://bigdata.endesa.es:8080

and

http://bigdata:8080
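
In the hosts file format, the first name after the IP address is the canonical name and any further names are aliases. The following minimal Python sketch checks that the first name listed for an IP equals the expected FQDN. `parse_hosts` and `canonical_name_matches` are hypothetical helpers that work on the file's text, so they can be tried on a copy before touching the real /etc/hosts.

```python
def parse_hosts(text):
    """Map each IP to its list of names, skipping blank lines and # comments."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        if names:
            entries.setdefault(ip, []).extend(names)
    return entries

def canonical_name_matches(hosts_text, ip, fqdn):
    """True if the first name listed for ip equals fqdn."""
    names = parse_hosts(hosts_text).get(ip, [])
    return bool(names) and names[0] == fqdn

# Example, using the entry from the post:
# canonical_name_matches("10.61.2.10  bigdata.endesa.es  bigdata",
#                        "10.61.2.10", "bigdata.endesa.es")
```

The fqdn argument would be the output of hostname -f on the same machine.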


Please make the necessary changes and report back.





Re: ambari-agent connection refused after host reboot

New Contributor

@Geoffrey Shelton Okot


I've already made the /etc/hosts modifications, with this result:

bigdata@bigdata:~$ cat /etc/hosts
IP                FQDN                  ALIAS
------------------------------------------------
10.61.2.10       bigdatapruebas.es      bigdata.es
#10.61.2.10             bigdatapruebas.es #bigdata.es bigdata bigdata.pruebasenergia.es

::1                     localhost ip6-localhost ip6-loopback
ff02::1                 ip6-allnodes
ff02::02                ip6-allrouters

bigdata@bigdata:~$ hotname -f
No se ha encontrado la orden «hotname», quizás quiso decir:
 La orden «hostname» del paquete «hostname» (main)
hotname: no se encontró la orden
bigdata@bigdata:~$ hostname -f
bigdatapruebas.es
bigdata@bigdata:~$


But the same thing happens. After applying the changes it works, but after a reboot it does not work until I manually restart the agent :-(