About mike_bronson7

jsensharma · ‎01-08-2018

@Michael Bronson For "Metrics Monitor" status you can alter the API call as follows:

# curl -i -H "X-Requested-By: ambari" -u admin:admin -X GET "http://amb25101.example.com:8080/api/v1/clusters/plain_ambari/services/AMBARI_METRICS/components/METRICS_MONITOR?fields=host_components/HostRoles/host_name,host_components/HostRoles/state" | grep -A 1 host_name

For YARN components such as "NODEMANAGER" you can do it the same way (the same logic applies to RESOURCEMANAGER and APP_TIMELINE_SERVER):

# curl -i -H "X-Requested-By: ambari" -u admin:admin -X GET "http://amb25101.example.com:8080/api/v1/clusters/plain_ambari/services/YARN/components/NODEMANAGER?fields=host_components/HostRoles/host_name,host_components/HostRoles/state" | grep -A 1 host_name
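The same call pattern can be generated for several components at once. The following is a minimal sketch, not part of the original answer: the helper name `component_state_url` is hypothetical, and the host and cluster name are simply the ones used in the examples above. It only prints URLs, so nothing is sent to Ambari until they are passed to curl.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Ambari REST URL that reports host name and
# state for every host running a given component.
component_state_url() {
  local base="$1" cluster="$2" service="$3" component="$4"
  printf '%s/api/v1/clusters/%s/services/%s/components/%s?fields=%s\n' \
    "$base" "$cluster" "$service" "$component" \
    "host_components/HostRoles/host_name,host_components/HostRoles/state"
}

# Print one URL per YARN component named in the answer above.
for component in NODEMANAGER RESOURCEMANAGER APP_TIMELINE_SERVER; do
  component_state_url "http://amb25101.example.com:8080" plain_ambari YARN "$component"
done
```

Each printed URL can then be quoted and handed to the curl command from the answer, for example `curl -s -H "X-Requested-By: ambari" -u admin:admin "$url" | grep -A 1 host_name`.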

asirna · ‎01-04-2018

@Michael Bronson, you can use these curl calls to run all the service checks and check their status.

To run the service checks:

curl -ivk -H "X-Requested-By: ambari" -u {ambari-username}:{ambari-password} -X POST -d @payload.txt http://{ambari-server}:{ambari-port}/api/v1/clusters/{cluster-name}/request_schedules

Sample response:

{
  "resources": [
    {
      "href": "http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules/68",
      "RequestSchedule": {
        "id": 68 // This is the request-schedule-id to be used for the second call
      }
    }
  ]
}

Note: Download the attached payload.txt to some folder and run the above command from the same folder.

To get the status of the service checks:

curl -ivk -H "X-Requested-By: ambari" -u {ambari-username}:{ambari-password} -X GET http://{ambari-server}:{ambari-port}/api/v1/clusters/{cluster-name}/request_schedules/{request-schedule-id}

To get the status of each service, iterate through the batch_requests array in the response and look for 'request_status' inside each object. COMPLETED means the check passed, FAILED means it failed, and ABORTED means the service check was aborted.

Attachment: payload.txt

Note: The request-schedule-id for the second curl call is obtained from the response of the first call.

Thanks, Aditya
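Pulling the 'request_status' values out of the second response can be scripted. This is a rough sketch under stated assumptions: the function name `list_request_statuses` is hypothetical, and the sample JSON is a made-up, heavily trimmed stand-in that only mirrors the batch_requests / request_status fields described above, not a real Ambari response. It uses plain grep and sed, so it is not a general JSON parser.

```shell
#!/usr/bin/env bash
# Reads a request_schedules response on stdin and prints each request_status
# value on its own line, in the order it appears.
list_request_statuses() {
  grep -o '"request_status"[[:space:]]*:[[:space:]]*"[A-Z_]*"' |
    sed 's/^"request_status"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Hypothetical sample used only to exercise the function.
sample='{ "batch_requests": [
  { "order_id": 1, "request_status": "COMPLETED" },
  { "order_id": 2, "request_status": "FAILED" } ] }'

printf '%s\n' "$sample" | list_request_statuses
```

Against a live server, the output of the second curl call (without `-i` and `-v`, so that only the body is printed) would be piped into `list_request_statuses` instead of the sample.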

asirna · ‎01-04-2018

@Michael Bronson, Yes. You can use the second way to achieve your task. You can also use the script below to check whether the NameNode is in safe mode and leave it only if it is.

su - hdfs -c "hdfs dfsadmin -safemode get" | grep ON
if [ $? -eq 0 ]
then
    su - hdfs -c "hdfs dfsadmin -safemode leave"
fi

To run the above script, put the content in a file, say xyz.sh, then:

chmod +x xyz.sh
./xyz.sh

Thanks, Aditya
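The check can also be separated from the action, so that it can be tried out without touching the cluster. This is a sketch only: the function names and the `DRY_RUN` variable are hypothetical, and it assumes that `hdfs dfsadmin -safemode get` reports the state with the text "Safe mode is ON" or "Safe mode is OFF".

```shell
#!/usr/bin/env bash
# Succeeds when the text on stdin reports that safe mode is on.
safemode_is_on() {
  grep -q 'Safe mode is ON'
}

# $1 is the output of "hdfs dfsadmin -safemode get".
# With DRY_RUN=1 (the default here) the leave command is printed, not run.
leave_safemode_if_on() {
  local status="$1"
  if printf '%s\n' "$status" | safemode_is_on; then
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo 'would run: su - hdfs -c "hdfs dfsadmin -safemode leave"'
    else
      su - hdfs -c "hdfs dfsadmin -safemode leave"
    fi
  else
    echo "safe mode is not on; nothing to do"
  fi
}

# Exercise both branches with canned status strings.
leave_safemode_if_on "Safe mode is ON"
leave_safemode_if_on "Safe mode is OFF"
```

On a real NameNode host the status string would come from `su - hdfs -c "hdfs dfsadmin -safemode get"`, and `DRY_RUN=0` would let the leave command actually run.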

kramakrishnan · ‎01-03-2018

@Michael Bronson

curl -u {ambari_username}:{ambari_password} -H "X-Requested-By: ambari" -i -X GET "http://localhost:8080/api/v1/clusters/cl1/components?fields=ServiceComponentInfo/state"

achandra · ‎05-16-2018

This can happen if Spark1 and Spark2 are both running on the same node. Try killing the process, then delete the service and add it on a separate node. That should work.

padraig_odowd · ‎12-21-2018

Thanks for the quick reply. I meant calling a script to shut down the Ambari components after the server is issued a shutdown command, but before it actually shuts down. I found a solution to the issue anyway: I just needed to add the "ExecStop=" directive to the systemd service files, and all seems to work fine now. Thanks again for your quick reply.

xyao · ‎01-02-2018

@Michael Bronson It depends on whether the components are going to use the new disks. If not, they don't need a restart. Among the services that do need to use the new disks, some, such as the HDFS DataNode, support hot swap, which means you can add disks with the following steps without restarting the DataNode service.

1> Change dfs.datanode.data.dir in hdfs-site.xml to include the new disk locations (e.g., /data/disk2).

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/disk1,/data/disk2</value>
</property>

2> Run the hdfs CLI to reconfigure the DataNode service without a restart.

hdfs dfsadmin -reconfig datanode dn1.hdp.com:9820 start

Other services might need a restart to use the new disks if hot swap is not supported.
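The two hot-swap steps can be sketched as a dry run. The helper name `append_data_dir` is hypothetical, the paths and the host:port are the ones from the example above, and the script only prints the value and the commands rather than changing anything. The `status` form of `-reconfig` is included on the assumption that you want to poll the reconfiguration task after starting it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to a comma-separated
# dfs.datanode.data.dir value unless it is already listed.
append_data_dir() {
  local current="$1" new_dir="$2"
  case ",$current," in
    *",$new_dir,"*) printf '%s\n' "$current" ;;
    *)              printf '%s,%s\n' "$current" "$new_dir" ;;
  esac
}

# Step 1: work out the new property value to save in hdfs-site.xml.
new_value="$(append_data_dir "/data/disk1" "/data/disk2")"
echo "dfs.datanode.data.dir=$new_value"

# Step 2: commands to run by hand afterwards (printed, not executed):
# start the reconfiguration, then check on its progress.
echo "hdfs dfsadmin -reconfig datanode dn1.hdp.com:9820 start"
echo "hdfs dfsadmin -reconfig datanode dn1.hdp.com:9820 status"
```

Checking for an existing entry first keeps the helper safe to rerun, since a duplicated directory in dfs.datanode.data.dir is not something you want to introduce by accident.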

ssharma · ‎01-02-2018

@Michael Bronson You can use configs.py to achieve this. Run the command below on the Ambari server host:

/var/lib/ambari-server/resources/scripts/configs.py --action get --host localhost --port <ambari_server_port> --protocol <ambari_protocol> --cluster <cluster_name> --config-type yarn-site

(for example: /var/lib/ambari-server/resources/scripts/configs.py --action get --host localhost --port 8080 --protocol http --cluster cl1 --config-type yarn-site)

This returns the configuration as JSON key-value pairs. You can use the result to find the values of yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.

Example:

[root@ctr-e136-1513029738776-28711-01-000002 ~]# /var/lib/ambari-server/resources/scripts/configs.py --action get --host localhost --port 8080 --protocol http --cluster cl1 --config-type yarn-site
2018-01-02 12:14:46,879 INFO ### Performing "get" content:
2018-01-02 12:14:46,902 INFO ### on (Site:yarn-site, Tag:82207fb3-2c26-47fb-a092-d0b88e19fa66)
{ "properties": { "yarn.rm.system-metricspublisher.emit-container-events": "true", "yarn.timeline-service.http-authentication.kerberos.keytab": "/etc/security/keytabs/spnego.service.keytab", "yarn.timeline-service.http-authentication.signer.secret.provider.object": "", "yarn.resourcemanager.hostname": "ctr-e136-1513029738776-28711-01-000004.hwx.site", "yarn.node-labels.enabled": "false", "yarn.resourcemanager.scheduler.monitor.enable": "false", "yarn.nodemanager.aux-services.spark2_shuffle.class": "org.apache.spark.network.yarn.YarnShuffleService", "yarn.timeline-service.http-authentication.signature.secret.file": "", "yarn.timeline-service.bind-host": "0.0.0.0", "hadoop.registry.secure": "true", "yarn.resourcemanager.ha.enabled": "true", "hadoop.registry.dns.bind-port": "5353", "yarn.nodemanager.runtime.linux.docker.privileged-containers.acl": "", "yarn.timeline-service.webapp.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8188", "yarn.nodemanager.principal": "nm/_HOST@EXAMPLE.COM", 
"yarn.timeline-service.enabled": "false", "yarn.nodemanager.recovery.enabled": "true", "yarn.timeline-service.entity-group-fs-store.group-id-plugin-classpath": "{\"HDP\":\"/usr/hdp\"}/${hdp.version}/spark/hdpLib/*", "yarn.timeline-service.http-authentication.type": "kerberos", "yarn.nodemanager.container-metrics.unregister-delay-ms": "60000", "yarn.nodemanager.keytab": "/etc/security/keytabs/nm.service.keytab", "yarn.timeline-service.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:10200", "yarn.timeline-service.entity-group-fs-store.summary-store": "org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore", "yarn.timeline-service.entity-group-fs-store.app-cache-size": "10", "yarn.nodemanager.aux-services.spark2_shuffle.classpath": "{{stack_root}}/${hdp.version}/spark2/aux/*", "yarn.resourcemanager.webapp.spnego-principal": "HTTP/_HOST@EXAMPLE.COM", "yarn.resourcemanager.am.max-attempts": "20", "\nyarn.webapp.api-service.enable\n": "true", "yarn.nodemanager.log-aggregation.debug-enabled": "false", "yarn.timeline-service.http-authentication.proxyuser.*.users": "", "yarn.timeline-service.http-authentication.proxyuser.*.hosts": "", "yarn.scheduler.maximum-allocation-vcores": "1", "yarn.resourcemanager.system-metrics-publisher.enabled": "true", "yarn.nodemanager.vmem-pmem-ratio": "2.1", "yarn.resourcemanager.nodes.exclude-path": "/etc/hadoop/conf/yarn.exclude", "yarn.timeline-service.http-authentication.cookie.path": "", "yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size": "10", "yarn.log.server.url": "http://ctr-e136-1513029738776-28711-01-000004.hwx.site:19888/jobhistory/logs", "yarn.nodemanager.webapp.spnego-principal": "HTTP/_HOST@EXAMPLE.COM", "yarn.timeline-service.keytab": "/etc/security/keytabs/yarn.service.keytab", "\nyarn.nodemanager.runtime.linux.docker.allowed-container-networks\n": "host,none,bridge", "yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled": "false", "hadoop.registry.dns.domain-name": 
"hwx.site", "yarn.timeline-service.entity-group-fs-store.active-dir": "/ats/active/", "\nyarn.nodemanager.runtime.linux.docker.default-container-network\n": "host", "yarn.resourcemanager.principal": "rm/_HOST@EXAMPLE.COM", "yarn.nodemanager.local-dirs": "/grid/0/hadoop/yarn/local", "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage": "false", "yarn.nodemanager.remote-app-log-dir-suffix": "logs", "yarn.log.server.web-service.url": "http://ctr-e136-1513029738776-28711-01-000004.hwx.site:8188/ws/v1/applicationhistory", "\nyarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users\n": "false", "yarn.resourcemanager.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8050", "yarn.resourcemanager.zk-num-retries": "1000", "yarn.timeline-service.http-authentication.token.validity": "", "yarn.resourcemanager.ha.automatic-failover.zk-base-path": "/yarn-leader-election", "yarn.resourcemanager.proxy-user-privileges.enabled": "true", "yarn.application.classpath": "$HADOOP_CONF_DIR,{{hadoop_home}}/*,{{hadoop_home}}/lib/*,{{stack_root}}/current/hadoop-hdfs-client/*,{{stack_root}}/current/hadoop-hdfs-client/lib/*,{{stack_root}}/current/hadoop-yarn-client/*,{{stack_root}}/current/hadoop-yarn-client/lib/*", "yarn.timeline-service.ttl-ms": "2678400000", "yarn.timeline-service.http-authentication.proxyuser.ambari-server.hosts": "ctr-e136-1513029738776-28711-01-000002.hwx.site", "yarn.nodemanager.container-monitor.interval-ms": "3000", "yarn.node-labels.fs-store.retry-policy-spec": "2000, 500", "yarn.resourcemanager.zk-acl": "sasl:rm:rwcda", "yarn.timeline-service.leveldb-state-store.path": "/grid/0/hadoop/yarn/timeline", "hadoop.registry.jaas.context": "Client", "yarn.scheduler.capacity.ordering-policy.priority-utilization.underutilized-preemption.enabled": "false", "yarn.resourcemanager.webapp.https.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8088", "yarn.log-aggregation-enable": "true", "yarn.nodemanager.delete.debug-delay-sec": 
"3600", "yarn.resourcemanager.bind-host": "0.0.0.0", "yarn.timeline-service.store-class": "org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore", "yarn.resourcemanager.webapp.spnego-keytab-file": "/etc/security/keytabs/spnego.service.keytab", "yarn.timeline-service.client.retry-interval-ms": "1000", "yarn.system-metricspublisher.enabled": "true", "yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes": "org.apache.tez.dag.history.logging.ats.TimelineCachePluginImpl", "hadoop.registry.zk.quorum": "ctr-e136-1513029738776-28711-01-000007.hwx.site:2181,ctr-e136-1513029738776-28711-01-000003.hwx.site:2181,ctr-e136-1513029738776-28711-01-000006.hwx.site:2181,ctr-e136-1513029738776-28711-01-000005.hwx.site:2181", "yarn.nodemanager.aux-services": "mapreduce_shuffle", "\nyarn.nodemanager.runtime.linux.allowed-runtimes\n": "default,docker", "yarn.timeline-service.http-authentication.proxyuser.ambari-server.groups": "*", "yarn.nodemanager.aux-services.mapreduce_shuffle.class": "org.apache.hadoop.mapred.ShuffleHandler", "hadoop.registry.dns.enabled": "true", "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage": "90", "yarn.resourcemanager.zk-timeout-ms": "10000", "yarn.resourcemanager.fs.state-store.uri": " ", "yarn.nodemanager.linux-container-executor.group": "hadoop", "yarn.nodemanager.remote-app-log-dir": "/app-logs", "yarn.nodemanager.aux-services.spark_shuffle.classpath": "{{stack_root}}/${hdp.version}/spark/aux/*", "yarn.resourcemanager.keytab": "/etc/security/keytabs/rm.service.keytab", "yarn.timeline-service.ttl-enable": "true", "yarn.timeline-service.entity-group-fs-store.cleaner-interval-seconds": "3600", "yarn.resourcemanager.fs.state-store.retry-policy-spec": "2000, 500", "yarn.timeline-service.generic-application-history.store-class": "org.apache.hadoop.yarn.server.applicationhistoryservice.NullApplicationHistoryStore", "yarn.resourcemanager.webapp.address.rm1": 
"ctr-e136-1513029738776-28711-01-000004.hwx.site:8088", "hadoop.registry.dns.zone-mask": "255.255.255.0", "yarn.nodemanager.disk-health-checker.min-healthy-disks": "0.25", "yarn.resourcemanager.state-store.max-completed-applications": "${yarn.resourcemanager.max-completed-applications}", "yarn.resourcemanager.webapp.address.rm2": "ctr-e136-1513029738776-28711-01-000003.hwx.site:8088", "yarn.resourcemanager.work-preserving-recovery.enabled": "true", "yarn.resourcemanager.resource-tracker.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8025", "yarn.nodemanager.health-checker.script.timeout-ms": "60000", "yarn.resourcemanager.scheduler.class": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler", "yarn.nodemanager.resource.memory-mb": "12288", "yarn.timeline-service.http-authentication.kerberos.name.rules": "", "yarn.nodemanager.resource.cpu-vcores": "1", "yarn.timeline-service.http-authentication.signature.secret": "", "yarn.scheduler.maximum-allocation-mb": "12288", "yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round": "0.17", "yarn.nodemanager.resource.percentage-physical-cpu-limit": "80", "yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb": "1000", "yarn.resourcemanager.proxyuser.*.groups": "", "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds": "3600", "yarn.timeline-service.principal": "yarn/_HOST@EXAMPLE.COM", "yarn.timeline-service.state-store-class": "org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore", "yarn.node-labels.fs-store.root-dir": "/system/yarn/node-labels", "yarn.resourcemanager.hostname.rm1": "ctr-e136-1513029738776-28711-01-000004.hwx.site", "yarn.resourcemanager.hostname.rm2": "ctr-e136-1513029738776-28711-01-000003.hwx.site", "yarn.resourcemanager.proxyuser.*.hosts": "", "yarn.resourcemanager.webapp.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8088", "yarn.scheduler.minimum-allocation-vcores": "1", 
"yarn.nodemanager.health-checker.interval-ms": "135000", "yarn.nodemanager.admin-env": "MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX", "yarn.nodemanager.vmem-check-enabled": "false", "yarn.acl.enable": "true", "yarn.timeline-service.leveldb-timeline-store.read-cache-size": "104857600", "yarn.nodemanager.log.retain-seconds": "604800", "yarn.client.nodemanager-connect.max-wait-ms": "60000", "yarn.timeline-service.http-authentication.simple.anonymous.allowed": "true", "\nyarn.nodemanager.runtime.linux.docker.privileged-containers.allowed\n": "false", "yarn.scheduler.minimum-allocation-mb": "1024", "yarn.timeline-service.leveldb-timeline-store.start-time-read-cache-size": "10000", "yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor": "1", "yarn.resourcemanager.ha.rm-ids": "rm1,rm2", "yarn.timeline-service.http-authentication.signer.secret.provider": "", "yarn.resourcemanager.connect.max-wait.ms": "900000", "yarn.resourcemanager.proxyuser.*.users": "", "yarn.timeline-service.http-authentication.cookie.domain": "", "yarn.timeline-service.http-authentication.proxyuser.*.groups": "", "yarn.http.policy": "HTTP_ONLY", "yarn.nodemanager.runtime.linux.docker.capabilities": "\nCHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,\nSETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE", "yarn.timeline-service.version": "2.0", "yarn.resourcemanager.zk-address": "ctr-e136-1513029738776-28711-01-000007.hwx.site:2181,ctr-e136-1513029738776-28711-01-000006.hwx.site:2181,ctr-e136-1513029738776-28711-01-000005.hwx.site:2181", "yarn.nodemanager.recovery.dir": "{{yarn_log_dir_prefix}}/nodemanager/recovery-state", "yarn.nodemanager.container-executor.class": "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor", "yarn.resourcemanager.store.class": "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore", "yarn.timeline-service.entity-group-fs-store.retain-seconds": "604800", "yarn.nodemanager.webapp.spnego-keytab-file": 
"/etc/security/keytabs/spnego.service.keytab", "yarn.resourcemanager.recovery.enabled": "true", "yarn.timeline-service.leveldb-timeline-store.path": "/grid/0/hadoop/yarn/timeline", "hadoop.registry.system.accounts": "sasl:yarn,sasl:jhs,sasl:hdfs,sasl:rm,sasl:hive", "yarn.timeline-service.client.max-retries": "30", "yarn.resourcemanager.scheduler.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8030", "yarn.log-aggregation.retain-seconds": "2592000", "yarn.nodemanager.address": "0.0.0.0:25454", "hadoop.registry.rm.enabled": "false", "yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms": "300000", "yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms": "10000", "yarn.resourcemanager.zk-state-store.parent-path": "/rmstore", "yarn.nodemanager.log-aggregation.compression-type": "gz", "yarn.timeline-service.http-authentication.kerberos.principal": "HTTP/_HOST@EXAMPLE.COM", "yarn.nodemanager.log-aggregation.num-log-files-per-app": "30", "hadoop.registry.client.auth": "kerberos", "yarn.timeline-service.recovery.enabled": "true", "yarn.nodemanager.bind-host": "0.0.0.0", "yarn.resourcemanager.zk-retry-interval-ms": "1000", "manage.include.files": "false", "yarn.nodemanager.recovery.supervised": "true", "yarn.admin.acl": "yarn,dr.who", "yarn.resourcemanager.cluster-id": "yarn-cluster", "yarn.nodemanager.log-dirs": "/grid/0/hadoop/yarn/log", "yarn.timeline-service.entity-group-fs-store.scan-interval-seconds": "60", "yarn.timeline-service.leveldb-timeline-store.start-time-write-cache-size": "10000", "yarn.nodemanager.aux-services.spark_shuffle.class": "org.apache.spark.network.yarn.YarnShuffleService", "hadoop.registry.dns.zone-subnet": "172.17.0.0", "yarn.client.nodemanager-connect.retry-interval-ms": "10000", "yarn.resourcemanager.admin.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8141", "yarn.timeline-service.webapp.https.address": "ctr-e136-1513029738776-28711-01-000004.hwx.site:8190", 
"yarn.resourcemanager.connect.retry-interval.ms": "30000", "yarn.timeline-service.entity-group-fs-store.done-dir": "/ats/done/" } }

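Picking the two directory properties out of that JSON can be sketched with grep and sed. The function name `get_prop` is hypothetical, the sample string is a two-key excerpt of the output shown above, and the approach assumes each value is a simple quoted string with no escaped quotes inside it.

```shell
#!/usr/bin/env bash
# Prints the value of one property from configs.py-style JSON read on stdin.
get_prop() {
  local key="$1" pattern
  # Escape the dots so they match literally in the grep pattern.
  pattern="$(printf '%s' "$key" | sed 's/\./\\./g')"
  grep -o "\"$pattern\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" |
    sed 's/^"[^"]*"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Two-key excerpt of the yarn-site output shown above.
sample='{ "properties": { "yarn.nodemanager.local-dirs": "/grid/0/hadoop/yarn/local", "yarn.nodemanager.log-dirs": "/grid/0/hadoop/yarn/log" } }'

printf '%s\n' "$sample" | get_prop yarn.nodemanager.local-dirs
printf '%s\n' "$sample" | get_prop yarn.nodemanager.log-dirs
```

With a real cluster the configs.py command would be piped into `get_prop` in place of the sample, assuming the JSON reaches standard output; the INFO log lines are skipped because they do not match the pattern.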
JordanMoore · ‎01-02-2018

Ambari itself doesn't know those disks are mounted until you edit the host configurations for HDFS/YARN and update the data directory configurations. The Ambari Alert check will run periodically to see if those configured disks are mounted, then the agent will update the dashboard.

daniel_arguelle · ‎11-10-2018

As you can see, the error message says to check the file /var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist. This file stores the last mount point for each HDFS data directory. In your case, it seems that you are trying to mount the HDFS directories on different paths, so the DataNode refuses to start in order to prevent data loss. Fix the file to point to the new mount points and start the DataNode.

Last Visited	‎08-27-2024 09:17 AM

Member Since	‎08-08-2017 09:40 AM
Posts	1,652
Kudos received	29

Cloudera Community

Re: how to find number of CPU core on datanode ma...

Re: postgresql + ambari server failed to open port...

Re: how to stop the thrift servers by REST API

Re: namenode is in safe mode

Re: Directory /grid/sdg/hadoop/hdfs/data became un...

Re: how to get status of components on specific da...

Re: Run service checks on each of ambari services

Re: how to verify if name node is in safe mode

Re: how to capture all ambari services/components ...

Re: Spark2 Thrift Server not start on ambari clust...

Re: how to stop all components on data-node machin...

Re: is it necessary to stop the components on each...

Re: how to print the values of yarn.nodemanager.lo...

Re: what is the status from ambari GUI that approv...

Re: Directory /grid/sdg/hadoop/hdfs/data became un...