01-05-2020
08:40 PM
I'm sorry for not answering earlier; I only just got access to the environment again. I tried solution one and the agent came up, but after a while it went down again and I couldn't tell why, since the logs didn't say anything. While it was up, I tried to start services on that host through Ambari, and it looks like the agent never initiates the request, as there are no logs about the start action for that service. I then decided to clean everything and start fresh: I deleted ambari-agent, cleaned all the folders as mentioned, and installed it again. The same issue remains when I try to start services. For example, this is the ambari-agent log from when I tried to start the SNameNode, which is hosted on the problematic node:

INFO 2020-01-06 07:35:39,472 __init__.py:82 - Event from server at /user/commands: {u'clusters': {u'2': {u'commands': [{u'commandParams': '...', u'clusterId': u'2', u'clusterName': u'dev', u'commandType': u'EXECUTION_COMMAND', u'roleCommand': u'START', u'serviceName': u'HDFS', u'role': u'SECONDARY_NAMENODE', u'requestId': 424, u'taskId': 7353, u'repositoryFile': '...', u'componentVersionMap': {u'HDFS': {u'SECONDARY_NAMENODE': u'3.1.0.0-78', u'JOURNALNODE': u'3.1.0.0-78', u'HDFS_CLIENT': u'3.1.0.0-78', u'DATANODE': u'3.1.0.0-78', u'NAMENODE': u'3.1.0.0-78', u'NFS_GATEWAY': u'3.1.0.0-78', u'ZKFC': u'3.1.0.0-78'}, u'ZOOKEEPER': {u'ZOOKEEPER_SERVER': u'3.1.0.0-78', u'ZOOKEEPER_CLIENT': u'3.1.0.0-78'}, u'SPARK2': {u'SPARK2_THRIFTSERVER': u'3.1.0.0-78', u'SPARK2_CLIENT': u'3.1.0.0-78', u'LIVY2_SERVER': u'3.1.0.0-78', u'SPARK2_JOBHISTORYSERVER': u'3.1.0.0-78'}, u'SQOOP': {u'SQOOP': u'3.1.0.0-78'}, u'HIVE': {u'HIVE_SERVER': u'3.1.0.0-78', u'HIVE_METASTORE': u'3.1.0.0-78', u'HIVE_SERVER_INTERACTIVE': u'3.1.0.0-78', u'HIVE_CLIENT': u'3.1.0.0-78'}, u'YARN': {u'YARN_REGISTRY_DNS': u'3.1.0.0-78', u'RESOURCEMANAGER': u'3.1.0.0-78', u'YARN_CLIENT': u'3.1.0.0-78', u'TIMELINE_READER': u'3.1.0.0-78', u'APP_TIMELINE_SERVER': u'3.1.0.0-78', u'NODEMANAGER': u'3.1.0.0-78'}, u'PIG': {u'PIG': u'3.1.0.0-78'}, u'RANGER': {u'RANGER_TAGSYNC': u'3.1.0.0-78', u'RANGER_ADMIN': u'3.1.0.0-78', u'RANGER_USERSYNC': u'3.1.0.0-78'}, u'TEZ': {u'TEZ_CLIENT': u'3.1.0.0-78'}, u'MAPREDUCE2': {u'MAPREDUCE2_CLIENT': u'3.1.0.0-78', u'HISTORYSERVER': u'3.1.0.0-78'}, u'ZEPPELIN': {u'ZEPPELIN_MASTER': u'3.1.0.0-78'}, u'HBASE': {u'HBASE_MASTER': u'3.1.0.0-78', u'PHOENIX_QUERY_SERVER': u'3.1.0.0-78', u'HBASE_CLIENT': u'3.1.0.0-78', u'HBASE_REGIONSERVER': u'3.1.0.0-78'}, u'KAFKA': {u'KAFKA_BROKER': u'3.1.0.0-78'}, u'KNOX': {u'KNOX_GATEWAY': u'3.1.0.0-78'}, u'RANGER_KMS': {u'RANGER_KMS_SERVER': u'3.1.0.0-78'}}, u'commandId': u'424-0'}]}}, u'requiredConfigTimestamp': 1578284883431}
INFO 2020-01-06 07:35:39,473 ActionQueue.py:79 - Adding EXECUTION_COMMAND for role SECONDARY_NAMENODE for service HDFS of cluster_id 2 to the queue
INFO 2020-01-06 07:35:39,473 security.py:135 - Event to server at /reports/responses (correlation_id=66): {'status': 'OK', 'messageId': '2'}
INFO 2020-01-06 07:35:39,475 __init__.py:82 - Event from server at /user/ (correlation_id=66): {u'status': u'OK'}
INFO 2020-01-06 07:35:40,595 security.py:135 - Event to server at /heartbeat (correlation_id=67): {'id': 44}
INFO 2020-01-06 07:35:40,597 __init__.py:82 - Event from server at /user/ (correlation_id=67): {u'status': u'OK', u'id': 45}
INFO 2020-01-06 07:35:42,317 ComponentStatusExecutor.py:107 - Skipping status command for INFRA_SOLR. Since command for it is running

Also, there are no logs under /var/log/hadoop/hdfs, which makes me think the ambari-agent on the problematic node never actually initiated the call. I'm going to mark your answer as accepted since it solved the issue I originally asked about; do you think I should create a new post for this one?
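One quick way to confirm that the command was queued but never executed is to grep the agent log for the taskId from the EXECUTION_COMMAND event (7353 in the log above). This is just a minimal sketch, not from the original thread; the AMBARI_AGENT_LOG override and the default log path are my assumptions, so adjust them for your install:

```shell
# Hypothetical check (not from the post): did task 7353 ever get past the queue?
# The default path below is the usual ambari-agent log location; override it
# with AMBARI_AGENT_LOG if your environment differs.
LOG="${AMBARI_AGENT_LOG:-/var/log/ambari-agent/ambari-agent.log}"

if grep -qE "taskId.*7353" "$LOG" 2>/dev/null; then
  echo "task 7353 appears in the agent log"
else
  echo "no trace of task 7353 - command was queued but apparently never executed"
fi
```

If only the ActionQueue "Adding EXECUTION_COMMAND" line matches and nothing follows it, that supports the theory that the agent accepted the command but never ran it.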