
HDP 3.1 installation error

Contributor

While installing an HDP 3.1 cluster, stage 9 (Install, Start and Test) fails with the error below:

 

Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-374.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-374.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-09-23 16:54:26,245 - Reporting component version failed

=================================================

DataNode Install
stderr: /var/lib/ambari-agent/data/errors-423.txt

Command aborted. Reason: 'Server considered task failed and automatically aborted it'
stdout: /var/lib/ambari-agent/data/output-423.txt

Command aborted. Reason: 'Server considered task failed and automatically aborted it'

Command failed after 1 tries

1 ACCEPTED SOLUTION

Master Mentor

@irfangk1 Looks like the SmartSense service installation is failing for you.

 

Failed to execute command: dpkg-query -l | grep 'ii\s*smartsense-*' || apt-get -o Dpkg::Options::=--force-confdef --allow-unauthenticated --assume-yes install smartsense-hst || dpkg -i /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb; Exit code: 1; stdout: ; stderr: E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
dpkg: error processing archive /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb (--install):
cannot access archive: No such file or directory

 


That is why the services fail to start later at Step 9: the SmartSense service binaries are not present, due to the package installation failure.

 

INFO 2019-09-24 16:06:03,584 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found

 


It is better to skip Step 9 and proceed (click the Next/Proceed/OK/Complete type of button in the UI), and then later verify the host where the SmartSense package was supposed to be installed but failed.

 

Check whether the repo is fine and whether you are able to install the SmartSense binary manually on that host.

It might be an Ambari repo access issue on the node where the SmartSense installation failed.
It would also help to check and share the exact OS version and the Ambari repo version.
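If you want to attempt a manual recovery on that node before re-running the step, a rough sequence for an Ubuntu/Debian host (which is what the dpkg/apt errors above suggest) would be:

# dpkg --configure -a
# apt-get -f install
# apt-get --assume-yes install smartsense-hst
# dpkg -l | grep smartsense
# ls -l /usr/sbin/hst

The first command clears the interrupted dpkg state your stderr is complaining about, the second fixes any half-configured dependencies, and the last two simply confirm that the smartsense-hst package and the /usr/sbin/hst binary are really present before you retry the component from Ambari. Treat this as a sketch; the exact commands depend on your OS and repo setup.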

 

Based on your Ambari version, fetch the correct repo file, for example:

# wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/XXXXXXXXXXXXXX/updates/2.7.3.0/ambari.list
# apt-get clean all
# apt-get update

You can get the correct Ambari repo URL from links like the following:
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-upgrade-major/content/upgrade_ambari...
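Once the correct ambari.list is in place, you can roughly verify that the smartsense-hst package is actually resolvable from that repo on the affected node, for example:

# apt-get update
# apt-cache policy smartsense-hst

If apt-cache policy reports "Candidate: (none)", the repo URL or network access from that node is still the problem; otherwise a plain apt-get install smartsense-hst and an Ambari retry have a good chance of succeeding.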


4 REPLIES

Master Mentor

@irfangk1 

Can you please share the files:

/var/lib/ambari-agent/data/output-423.txt
/var/lib/ambari-agent/data/errors-423.txt

 

And, if possible, the ambari-agent.log from the cluster node where the above files were generated. You can find the host where the operation failed by looking at the operational logs in the Ambari UI.
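If the files are large, the tail of each one plus the error lines from the agent log (typically under /var/log/ambari-agent/) is usually enough, for example:

# tail -n 200 /var/lib/ambari-agent/data/output-423.txt
# tail -n 200 /var/lib/ambari-agent/data/errors-423.txt
# grep -iE -A5 'error|fail' /var/log/ambari-agent/ambari-agent.log | tail -n 200

These are just standard tail/grep invocations to trim the output; adjust the paths if your agent logs to a different location.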

Contributor

Failed to execute command: dpkg-query -l | grep 'ii\s*smartsense-*' || apt-get -o Dpkg::Options::=--force-confdef --allow-unauthenticated --assume-yes install smartsense-hst || dpkg -i /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb; Exit code: 1; stdout: ; stderr: E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
dpkg: error processing archive /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb (--install):
cannot access archive: No such file or directory
Errors were encountered while processing:
/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb

 

============================================================================================
Std Out: None
Std Err: E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
dpkg: error processing archive /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb (--install):
cannot access archive: No such file or directory
Errors were encountered while processing:
/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/SMARTSENSE/package/files/deb/*.deb

2019-09-24 15:56:37,060 - Skipping stack-select on SMARTSENSE because it does not exist in the stack-select package structure.

 

Contributor

Please check the server and node logs below.


**** Server LOGS Below

2019-09-24 15:56:39,122 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 353
2019-09-24 15:56:39,122 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 338
2019-09-24 15:56:39,123 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 320
2019-09-24 15:56:39,123 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 353
2019-09-24 15:56:39,123 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 338
2019-09-24 15:56:39,322 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 321
2019-09-24 15:56:39,323 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 354
2019-09-24 15:56:39,323 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 339
2019-09-24 15:56:39,323 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 321
2019-09-24 15:56:39,323 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 354
2019-09-24 15:56:39,323 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 339
2019-09-24 15:56:39,523 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 322
2019-09-24 15:56:39,523 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 355
2019-09-24 15:56:39,523 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 340
2019-09-24 15:56:39,523 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 322
2019-09-24 15:56:39,523 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 355
2019-09-24 15:56:39,523 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 340
2019-09-24 15:56:39,723 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 323
2019-09-24 15:56:39,723 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 356
2019-09-24 15:56:39,723 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 341
2019-09-24 15:56:39,724 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 323
2019-09-24 15:56:39,724 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 356
2019-09-24 15:56:39,724 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 341
2019-09-24 15:56:39,923 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 324
2019-09-24 15:56:39,924 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 357
2019-09-24 15:56:39,924 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 342
2019-09-24 15:56:39,924 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 324
2019-09-24 15:56:39,924 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 357
2019-09-24 15:56:39,924 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 342
2019-09-24 15:56:40,124 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 325
2019-09-24 15:56:40,124 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 358
2019-09-24 15:56:40,124 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 343
2019-09-24 15:56:40,124 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 325
2019-09-24 15:56:40,125 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 358
2019-09-24 15:56:40,125 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 343
2019-09-24 15:56:40,324 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 326
2019-09-24 15:56:40,324 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 344
2019-09-24 15:56:40,326 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 326
2019-09-24 15:56:40,326 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 344
2019-09-24 15:56:40,524 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 327
2019-09-24 15:56:40,524 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 345
2019-09-24 15:56:40,525 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 327
2019-09-24 15:56:40,525 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 345
2019-09-24 15:56:40,725 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 328
2019-09-24 15:56:40,725 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 346
2019-09-24 15:56:40,725 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 328
2019-09-24 15:56:40,725 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 346
2019-09-24 15:56:40,925 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 329
2019-09-24 15:56:40,925 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 347
2019-09-24 15:56:40,925 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 347
2019-09-24 15:56:40,925 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 329
2019-09-24 15:56:41,072 WARN [ambari-client-thread-36] TaskResourceProvider:271 - Unable to parse task structured output: /var/lib/ambari-agent/data/structured-out-543.json
2019-09-24 15:56:41,073 WARN [ambari-client-thread-36] TaskResourceProvider:271 - Unable to parse task structured output: "{}"
2019-09-24 15:56:41,125 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 330
2019-09-24 15:56:41,126 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 330
2019-09-24 15:56:41,191 INFO [ambari-client-thread-32] MetricsCollectorHAManager:63 - Adding collector host : inairsr542007v4.ntil.com to cluster : Ness_Test
2019-09-24 15:56:41,193 INFO [ambari-client-thread-32] MetricsCollectorHAClusterState:81 - Refreshing collector host, current collector host : inairsr542007v4.ntil.com
2019-09-24 15:56:41,193 INFO [ambari-client-thread-32] MetricsCollectorHAClusterState:102 - After refresh, new collector host : inairsr542007v4.ntil.com
2019-09-24 15:56:41,325 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 331
2019-09-24 15:56:41,326 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 331
2019-09-24 15:56:41,504 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,505 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_pending_deletion_blocks which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,505 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_blocks_health which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,505 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_capacity_utilization which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,506 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_ha_health which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,506 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_rpc_latency which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,506 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for grafana_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,506 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for smartsense_bundle_failed_or_timedout which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,506 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for smartsense_server_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,507 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_resourcemanager_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,507 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for upgrade_finalized_state which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,507 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_timeline_reader_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,507 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for smartsense_gateway_status which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,507 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for YARN_REGISTRY_DNS_PROCESS which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,508 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for nodemanager_health_summary which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,508 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_ulimit which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,509 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for zookeeper_server_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,509 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_last_checkpoint which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,509 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_health_summary which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,509 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_disk_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,510 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for SPARK2_JOBHISTORYSERVER_PROCESS which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,510 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_monitor_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,510 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_directory_status which is a definition that does not exist in cluster id=2
2019-09-24 15:56:41,526 INFO [agent-message-monitor-0] MessageEmitter:218 - Schedule execution command emitting, retry: 0, messageId: 332
2019-09-24 15:56:41,526 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 332
2019-09-24 15:56:54,869 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for hive_server_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,870 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for sys db status which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,870 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for yarn_app_timeline_server_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,871 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,871 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for ambari_agent_disk_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,871 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for secondary_namenode_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,872 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for ambari_agent_ulimit which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,872 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for spark2_thriftserver_status which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,872 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,872 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for hive_metastore_process which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,873 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for livy2_server_status which is a definition that does not exist in cluster id=2
2019-09-24 15:56:54,873 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for mapreduce_history_server_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,769 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_collector_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,769 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_collector_autostart which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,769 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_disk_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,770 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_monitor_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,770 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_ulimit which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,770 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,770 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_nodemanager_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,770 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_nodemanager_health which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,771 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:05,771 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_collector_hbase_master_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:36,519 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_ulimit which is a definition that does not exist in cluster id=2
2019-09-24 15:57:36,520 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_disk_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,521 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for namenode_last_checkpoint which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,522 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for namenode_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,522 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_health_summary which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,522 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for yarn_timeline_reader_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,522 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for upgrade_finalized_state which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,523 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for yarn_resourcemanager_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,523 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for YARN_REGISTRY_DNS_PROCESS which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,523 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for namenode_ha_health which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,523 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for nodemanager_health_summary which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,523 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for smartsense_gateway_status which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,524 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for zookeeper_server_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,524 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for grafana_webui which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,524 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for ams_metrics_monitor_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,524 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for SPARK2_JOBHISTORYSERVER_PROCESS which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,524 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for smartsense_server_process which is a definition that does not exist in cluster id=2
2019-09-24 15:57:41,525 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for namenode_directory_status which is a definition that does not exist in cluster id=2
2019-09-24 15:57:54,880 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_unmounted_data_dir which is a definition that does not exist in cluster id=2
2019-09-24 15:57:54,881 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_storage which is a definition that does not exist in cluster id=2
2019-09-24 15:57:54,881 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for datanode_heap_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:58:05,775 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_unmounted_data_dir which is a definition that does not exist in cluster id=2
2019-09-24 15:58:05,775 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_storage which is a definition that does not exist in cluster id=2
2019-09-24 15:58:05,775 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for datanode_heap_usage which is a definition that does not exist in cluster id=2
2019-09-24 15:58:36,537 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for ambari_agent_version_select which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,539 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_blocks_health which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,539 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_capacity_utilization which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,540 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_rpc_latency which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,540 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ats_hbase which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,540 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for smartsense_bundle_failed_or_timedout which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,540 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_hdfs_pending_deletion_blocks which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,540 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_cpu which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,541 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_client_rpc_processing_latency_hourly which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,541 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for smartsense_long_running_bundle which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,541 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_service_rpc_processing_latency_hourly which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,541 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_service_rpc_queue_latency_hourly which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,541 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for namenode_client_rpc_queue_latency_hourly which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,542 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_resourcemanager_rpc_latency which is a definition that does not exist in cluster id=2
2019-09-24 15:58:41,542 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for yarn_resourcemanager_cpu which is a definition that does not exist in cluster id=2
2019-09-24 15:58:54,882 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for ambari_agent_version_select which is a definition that does not exist in cluster id=2
2019-09-24 15:58:54,882 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for mapreduce_history_server_cpu which is a definition that does not exist in cluster id=2
2019-09-24 15:58:54,882 WARN [alert-event-bus-2] AlertReceivedListener:172 - Received an alert for mapreduce_history_server_rpc_latency which is a definition that does not exist in cluster id=2
2019-09-24 15:59:05,778 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ambari_agent_version_select which is a definition that does not exist in cluster id=2
2019-09-24 15:59:05,778 WARN [alert-event-bus-1] AlertReceivedListener:172 - Received an alert for ams_metrics_collector_hbase_master_cpu which is a definition that does not exist in cluster id=2
2019-09-24 16:06:05,889 INFO [pool-30-thread-1] AmbariMetricSinkImpl:291 - No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times.

 

** Node1 LOGS Below

Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:05:23,684 security.py:135 - Event to server at /heartbeat (correlation_id=307): {'id': 198}
INFO 2019-09-24 16:05:23,687 __init__.py:82 - Event from server at /user/ (correlation_id=307): {u'status': u'OK', u'id': 199}
INFO 2019-09-24 16:05:33,689 security.py:135 - Event to server at /heartbeat (correlation_id=308): {'id': 199}
INFO 2019-09-24 16:05:33,692 __init__.py:82 - Event from server at /user/ (correlation_id=308): {u'status': u'OK', u'id': 200}
INFO 2019-09-24 16:05:39,757 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:05:43,695 security.py:135 - Event to server at /heartbeat (correlation_id=309): {'id': 200}
INFO 2019-09-24 16:05:43,698 __init__.py:82 - Event from server at /user/ (correlation_id=309): {u'status': u'OK', u'id': 201}
INFO 2019-09-24 16:05:53,700 security.py:135 - Event to server at /heartbeat (correlation_id=310): {'id': 201}
INFO 2019-09-24 16:05:53,703 __init__.py:82 - Event from server at /user/ (correlation_id=310): {u'status': u'OK', u'id': 202}
INFO 2019-09-24 16:06:03,584 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:03,705 security.py:135 - Event to server at /heartbeat (correlation_id=311): {'id': 202}
INFO 2019-09-24 16:06:03,709 __init__.py:82 - Event from server at /user/ (correlation_id=311): {u'status': u'OK', u'id': 203}
INFO 2019-09-24 16:06:07,283 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /run/user/108, /run/user/0
INFO 2019-09-24 16:06:07,284 security.py:135 - Event to server at /reports/host_status (correlation_id=312): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1569321367271, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': False, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '91312304', 'used': '5572676', 'percent': '6%', 'device': '/dev/sda1', 'mountpoint': '/', 'type': 'ext4', 'size': '102094168'}]}
INFO 2019-09-24 16:06:07,288 __init__.py:82 - Event from server at /user/ (correlation_id=312): {u'status': u'OK'}
INFO 2019-09-24 16:06:13,712 security.py:135 - Event to server at /heartbeat (correlation_id=313): {'id': 203}
INFO 2019-09-24 16:06:13,714 __init__.py:82 - Event from server at /user/ (correlation_id=313): {u'status': u'OK', u'id': 204}
INFO 2019-09-24 16:06:23,717 security.py:135 - Event to server at /heartbeat (correlation_id=314): {'id': 204}
INFO 2019-09-24 16:06:23,720 __init__.py:82 - Event from server at /user/ (correlation_id=314): {u'status': u'OK', u'id': 205}
INFO 2019-09-24 16:06:27,627 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:33,722 security.py:135 - Event to server at /heartbeat (correlation_id=315): {'id': 205}
INFO 2019-09-24 16:06:33,725 __init__.py:82 - Event from server at /user/ (correlation_id=315): {u'status': u'OK', u'id': 206}
INFO 2019-09-24 16:06:43,727 security.py:135 - Event to server at /heartbeat (correlation_id=316): {'id': 206}
INFO 2019-09-24 16:06:43,730 __init__.py:82 - Event from server at /user/ (correlation_id=316): {u'status': u'OK', u'id': 207}
INFO 2019-09-24 16:06:51,526 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:53,732 security.py:135 - Event to server at /heartbeat (correlation_id=317): {'id': 207}
INFO 2019-09-24 16:06:53,735 __init__.py:82 - Event from server at /user/ (correlation_id=317): {u'status': u'OK', u'id': 208}
INFO 2019-09-24 16:07:03,738 security.py:135 - Event to server at /heartbeat (correlation_id=318): {'id': 208}
INFO 2019-09-24 16:07:03,741 __init__.py:82 - Event from server at /user/ (correlation_id=318): {u'status': u'OK', u'id': 209}
INFO 2019-09-24 16:07:07,509 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /run/user/108, /run/user/0
INFO 2019-09-24 16:07:07,510 security.py:135 - Event to server at /reports/host_status (correlation_id=319): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1569321427496, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': False, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '91312288', 'used': '5572692', 'percent': '6%', 'device': '/dev/sda1', 'mountpoint': '/', 'type': 'ext4', 'size': '102094168'}]}
INFO 2019-09-24 16:07:07,514 __init__.py:82 - Event from server at /user/ (correlation_id=319): {u'status': u'OK'}
INFO 2019-09-24 16:07:13,743 security.py:135 - Event to server at /heartbeat (correlation_id=320): {'id': 209}
INFO 2019-09-24 16:07:13,749 __init__.py:82 - Event from server at /user/ (correlation_id=320): {u'status': u'OK', u'id': 210}
INFO 2019-09-24 16:07:15,385 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:07:23,768 security.py:135 - Event to server at /heartbeat (correlation_id=321): {'id': 210}
INFO 2019-09-24 16:07:23,771 __init__.py:82 - Event from server at /user/ (correlation_id=321): {u'status': u'OK', u'id': 211}

 

** Node2 LOGS Below

lation_id=327): {u'status': u'OK', u'id': 194}
INFO 2019-09-24 16:04:43,604 security.py:135 - Event to server at /heartbeat (correlation_id=328): {'id': 194}
INFO 2019-09-24 16:04:43,607 __init__.py:82 - Event from server at /user/ (correlation_id=328): {u'status': u'OK', u'id': 195}
INFO 2019-09-24 16:04:52,690 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:04:53,610 security.py:135 - Event to server at /heartbeat (correlation_id=329): {'id': 195}
INFO 2019-09-24 16:04:53,617 __init__.py:82 - Event from server at /user/ (correlation_id=329): {u'status': u'OK', u'id': 196}
INFO 2019-09-24 16:05:03,619 security.py:135 - Event to server at /heartbeat (correlation_id=330): {'id': 196}
INFO 2019-09-24 16:05:03,622 __init__.py:82 - Event from server at /user/ (correlation_id=330): {u'status': u'OK', u'id': 197}
INFO 2019-09-24 16:05:13,624 security.py:135 - Event to server at /heartbeat (correlation_id=331): {'id': 197}
INFO 2019-09-24 16:05:13,627 __init__.py:82 - Event from server at /user/ (correlation_id=331): {u'status': u'OK', u'id': 198}
INFO 2019-09-24 16:05:15,616 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:05:21,399 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /run/user/108, /run/user/0
INFO 2019-09-24 16:05:21,400 security.py:135 - Event to server at /reports/host_status (correlation_id=332): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1569321321386, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': False, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '89429964', 'used': '7455016', 'percent': '8%', 'device': '/dev/sda1', 'mountpoint': '/', 'type': 'ext4', 'size': '102094168'}]}
INFO 2019-09-24 16:05:21,404 __init__.py:82 - Event from server at /user/ (correlation_id=332): {u'status': u'OK'}
INFO 2019-09-24 16:05:23,629 security.py:135 - Event to server at /heartbeat (correlation_id=333): {'id': 198}
INFO 2019-09-24 16:05:23,632 __init__.py:82 - Event from server at /user/ (correlation_id=333): {u'status': u'OK', u'id': 199}
INFO 2019-09-24 16:05:33,633 security.py:135 - Event to server at /heartbeat (correlation_id=334): {'id': 199}
INFO 2019-09-24 16:05:33,636 __init__.py:82 - Event from server at /user/ (correlation_id=334): {u'status': u'OK', u'id': 200}
INFO 2019-09-24 16:05:38,394 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:05:43,638 security.py:135 - Event to server at /heartbeat (correlation_id=335): {'id': 200}
INFO 2019-09-24 16:05:43,642 __init__.py:82 - Event from server at /user/ (correlation_id=335): {u'status': u'OK', u'id': 201}
INFO 2019-09-24 16:05:53,643 security.py:135 - Event to server at /heartbeat (correlation_id=336): {'id': 201}
INFO 2019-09-24 16:05:53,646 __init__.py:82 - Event from server at /user/ (correlation_id=336): {u'status': u'OK', u'id': 202}
INFO 2019-09-24 16:06:01,282 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:03,648 security.py:135 - Event to server at /heartbeat (correlation_id=337): {'id': 202}
INFO 2019-09-24 16:06:03,651 __init__.py:82 - Event from server at /user/ (correlation_id=337): {u'status': u'OK', u'id': 203}
INFO 2019-09-24 16:06:13,653 security.py:135 - Event to server at /heartbeat (correlation_id=338): {'id': 203}
INFO 2019-09-24 16:06:13,656 __init__.py:82 - Event from server at /user/ (correlation_id=338): {u'status': u'OK', u'id': 204}
INFO 2019-09-24 16:06:22,061 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /run/user/108, /run/user/0
INFO 2019-09-24 16:06:22,061 security.py:135 - Event to server at /reports/host_status (correlation_id=339): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1569321382048, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': False, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '89429688', 'used': '7455292', 'percent': '8%', 'device': '/dev/sda1', 'mountpoint': '/', 'type': 'ext4', 'size': '102094168'}]}
INFO 2019-09-24 16:06:22,065 __init__.py:82 - Event from server at /user/ (correlation_id=339): {u'status': u'OK'}
INFO 2019-09-24 16:06:23,658 security.py:135 - Event to server at /heartbeat (correlation_id=340): {'id': 204}
INFO 2019-09-24 16:06:23,666 __init__.py:82 - Event from server at /user/ (correlation_id=340): {u'status': u'OK', u'id': 205}
INFO 2019-09-24 16:06:24,098 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:33,676 security.py:135 - Event to server at /heartbeat (correlation_id=341): {'id': 205}
INFO 2019-09-24 16:06:33,679 __init__.py:82 - Event from server at /user/ (correlation_id=341): {u'status': u'OK', u'id': 206}
INFO 2019-09-24 16:06:43,681 security.py:135 - Event to server at /heartbeat (correlation_id=342): {'id': 206}
INFO 2019-09-24 16:06:43,683 __init__.py:82 - Event from server at /user/ (correlation_id=342): {u'status': u'OK', u'id': 207}
INFO 2019-09-24 16:06:46,896 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:06:53,685 security.py:135 - Event to server at /heartbeat (correlation_id=343): {'id': 207}
INFO 2019-09-24 16:06:53,689 __init__.py:82 - Event from server at /user/ (correlation_id=343): {u'status': u'OK', u'id': 208}
INFO 2019-09-24 16:07:03,690 security.py:135 - Event to server at /heartbeat (correlation_id=344): {'id': 208}
INFO 2019-09-24 16:07:03,693 __init__.py:82 - Event from server at /user/ (correlation_id=344): {u'status': u'OK', u'id': 209}
INFO 2019-09-24 16:07:09,801 ComponentStatusExecutor.py:172 - Status command for HST_AGENT failed:
Failed to execute command: /usr/sbin/hst agent-status; Exit code: 127; stdout: ; stderr: /bin/sh: 1: /usr/sbin/hst: not found
INFO 2019-09-24 16:07:13,694 security.py:135 - Event to server at /heartbeat (correlation_id=345): {'id': 209}
INFO 2019-09-24 16:07:13,697 __init__.py:82 - Event from server at /user/ (correlation_id=345): {u'status': u'OK', u'id': 210}
