Member since 01-14-2016
49 Posts
4 Kudos Received
5 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
| 2960 | 10-21-2019 09:30 AM
| 1378 | 10-11-2018 09:23 PM
| 1900 | 09-13-2018 06:39 PM
| 2418 | 04-27-2018 02:51 PM
| 711 | 06-01-2017 12:35 AM
10-28-2019
02:39 PM
Hello Shelton. I've followed all the given steps and the service is still not coming up. The outputs are attached below.

From /var/lib/ambari-agent/data/errors-31848.txt:

resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarnats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin:/var/lib/ambari-agent:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.1.0.0-78/hadoop-yarn/timelineservice/*; /usr/hdp/3.1.0.0-78/hbase/bin/hbase --config /usr/hdp/3.1.0.0-78/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds

From /var/lib/ambari-agent/data/output-31848.txt:

2019-10-22 16:41:17,992 WARN [main-EventThread] coordination.ZKSplitLogManagerCoordination$CreateRescanAsyncCallback: rc=NONODE for /atsv2-hbase-unsecure/splitWAL/RESCAN remaining retries=9223372036854744889
2019-10-22 16:41:17,992 WARN [main-EventThread] coordination.ZKSplitLogManagerCoordination$CreateRescanAsyncCallback: rc=NONODE for /atsv2-hbase-unsecure/splitWAL/RESCAN remaining retries=9223372036854735924
2019-10-22 16:41:17,992 WARN [main-EventThread] coordination.ZKSplitLogManagerCoordination$CreateRescanAsyncCallback: rc=NONODE for /atsv2-hbase-unsecure/splitWAL/RESCAN remaining retries=9223372036854772106
2019-10-22 16:41:17,992 WARN [main-EventThread] coordination.ZKSplitLogManagerCoordination$CreateRescanAsyncCallback: rc=NONODE for /atsv2-hbase-unsecure/splitWAL/RESCAN remaining retries=9223372036854768736
2019-10-22 16:41:17,992 WARN [main-EventThread] coordination.ZKSplitLogManagerCoordination$CreateRescanAsyncCallback: rc=NONODE for /atsv2-hbase-unsecure/splitWAL/RESCAN remaining retries=9223372036854749025
==> /usr/logs/hadoop-yarn/embedded-yarn-ats-hbase/gc.log-201910110639 <==
Java HotSpot(TM) 64-Bit Server VM (25.60-b23) for linux-amd64 JRE (1.8.0_60-b27), built on Aug 4 2015 12:19:40 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 131732324k(11848752k free), swap 8388604k(8279292k free)
CommandLine flags: -XX:ErrorFile=/usr/logs/hadoop-yarn/embedded-yarn-ats-hbase/hs_err_pid%p.log -XX:InitialHeapSize=2107717184 -XX:MaxHeapSize=3435134976 -XX:MaxNewSize=1145044992 -XX:MaxTenuringThreshold=6 -XX:OldPLABSize=16 -XX:OnOutOfMemoryError=kill -9 %p -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Heap
par new generation total 618048K, used 197776K [0x00000006f3400000, 0x000000071d2a0000, 0x0000000737800000)
eden space 549376K, 36% used [0x00000006f3400000, 0x00000006ff5243a8, 0x0000000714c80000)
from space 68672K, 0% used [0x0000000714c80000, 0x0000000714c80000, 0x0000000718f90000)
to space 68672K, 0% used [0x0000000718f90000, 0x0000000718f90000, 0x000000071d2a0000)
concurrent mark-sweep generation total 1373568K, used 0K [0x0000000737800000, 0x000000078b560000, 0x00000007c0000000)
Metaspace used 11629K, capacity 11810K, committed 11904K, reserved 1060864K
class space used 1251K, capacity 1316K, committed 1408K, reserved 1048576K
==> /usr/logs/hadoop-yarn/embedded-yarn-ats-hbase/gc.log-201910100851 <==
Java HotSpot(TM) 64-Bit Server VM (25.60-b23) for linux-amd64 JRE (1.8.0_60-b27), built on Aug 4 2015 12:19:40 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 131732324k(1591264k free), swap 8388604k(8280060k free)
CommandLine flags: -XX:CMSInitiatingOccupancyFraction=70 -XX:ErrorFile=/usr/logs/hadoop-yarn/embedded-yarn-ats-hbase/hs_err_pid%p.log -XX:InitialHeapSize=3435134976 -XX:MaxHeapSize=3435134976 -XX:MaxNewSize=1145044992 -XX:MaxTenuringThreshold=6 -XX:NewSize=1145044992 -XX:OldPLABSize=16 -XX:OldSize=2290089984 -XX:OnOutOfMemoryError=kill -9 %p -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:ReservedCodeCacheSize=268435456 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
2019-10-10T08:51:22.325-0700: 2.213: [GC (CMS Initial Mark) [1 CMS-initial-mark: 0K(2236416K)] 715687K(3242816K), 0.1832180 secs] [Times: user=0.48 sys=0.07, real=0.19 secs]
2019-10-10T08:51:22.508-0700: 2.396: [CMS-concurrent-mark-start]
2019-10-10T08:51:22.509-0700: 2.397: [CMS-concurrent-mark: 0.001/0.001 secs] [Times: user=0.01 sys=0.01, real=0.00 secs]
2019-10-10T08:51:22.509-0700: 2.397: [CMS-concurrent-preclean-start]
2019-10-10T08:51:22.513-0700: 2.400: [CMS-concurrent-preclean: 0.003/0.003 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2019-10-10T08:51:22.513-0700: 2.400: [CMS-concurrent-abortable-preclean-start]
2019-10-10T08:51:22.827-0700: 2.715: [GC (Allocation Failure) 2019-10-10T08:51:22.827-0700: 2.715: [ParNew: 894592K->37233K(1006400K), 0.0334809 secs] 894592K->37233K(3242816K), 0.0335760 secs] [Times: user=0.17 sys=0.03, real=0.03 secs]
Heap
par new generation total 1006400K, used 577717K [0x00000006f3400000, 0x0000000737800000, 0x0000000737800000)
eden space 894592K, 60% used [0x00000006f3400000, 0x00000007143d0fc8, 0x0000000729da0000)
from space 111808K, 33% used [0x0000000730ad0000, 0x0000000732f2c758, 0x0000000737800000)
to space 111808K, 0% used [0x0000000729da0000, 0x0000000729da0000, 0x0000000730ad0000)
concurrent mark-sweep generation total 2236416K, used 0K [0x0000000737800000, 0x00000007c0000000, 0x00000007c0000000)
Metaspace used 52260K, capacity 52701K, committed 53168K, reserved 1095680K
class space used 5905K, capacity 6041K, committed 6096K, reserved 1048576K
2019-10-10T08:51:24.359-0700: 4.247: [CMS-concurrent-abortable-preclean: 1.100/1.847 secs] [Times: user=4.75 sys=0.27, real=1.85 secs]
Command failed after 1 tries

From yarn-timelineserver-gc.log:

Total 180427 20331968
, 0.0109813 secs]
25.966: [GC (Allocation Failure) [PSYoungGen: 766976K->26024K(1047040K)] 786831K->45895K(2136576K), 0.0205774 secs] [Times: user=0.18 sys=0.02, real=0.02 secs]
27.877: [GC (Allocation Failure) [PSYoungGen: 987560K->37814K(1176576K)] 1007431K->57702K(2266112K), 0.0452135 secs] [Times: user=0.31 sys=0.03, real=0.05 secs]
29.872: [GC (Allocation Failure) [PSYoungGen: 1128886K->40013K(1176576K)] 1148774K->59908K(2266112K), 0.0376384 secs] [Times: user=0.25 sys=0.02, real=0.04 secs]
31.621: [GC (Allocation Failure) [PSYoungGen: 1131085K->41607K(1708032K)] 1150980K->61510K(2797568K), 0.0426743 secs] [Times: user=0.19 sys=0.02, real=0.04 secs]
34.381: [GC (Allocation Failure) [PSYoungGen: 1702023K->52721K(1713152K)] 1721926K->75113K(2802688K), 0.0671733 secs] [Times: user=0.32 sys=0.06, real=0.07 secs]
544.663: [GC (Allocation Failure) [PSYoungGen: 1713137K->24633K(2550784K)] 1735529K->55561K(3640320K), 0.0502315 secs] [Times: user=0.37 sys=0.08, real=0.05 secs]
1744.725: [GC (Allocation Failure) [PSYoungGen: 2550329K->6583K(2657792K)] 2581257K->37803K(3747328K), 0.0109360 secs] [Times: user=0.06 sys=0.05, real=0.01 secs]
3364.582: [GC (Allocation Failure) [PSYoungGen: 2603959K->7333K(2513408K)] 2635179K->38561K(3602944K), 0.0106033 secs] [Times: user=0.05 sys=0.05, real=0.01 secs]
4564.508: [GC (Allocation Failure) [PSYoungGen: 2513061K->7397K(2425856K)] 2544289K->38633K(3515392K), 0.0098975 secs] [Times: user=0.05 sys=0.05, real=0.01 secs]
5944.468: [GC (Allocation Failure) [PSYoungGen: 2425573K->7432K(2342400K)] 2456809K->38676K(3431936K), 0.0100541 secs] [Times: user=0.05 sys=0.04, real=0.01 secs]
6904.427: [GC (Allocation Failure) [PSYoungGen: 2342152K->7814K(2263040K)] 2373396K->39065K(3352576K), 0.0100246 secs] [Times: user=0.06 sys=0.05, real=0.01 secs]
7624.583: [GC (Allocation Failure) [PSYoungGen: 2262662K->7335K(2186240K)] 2293913K->38595K(3275776K), 0.0126832 secs] [Times: user=0.07 sys=0.03, real=0.01 secs]
8524.740: [GC (Allocation Failure) [PSYoungGen: 2185895K->7238K(2113536K)] 2217155K->38505K(3203072K), 0.0110849 secs] [Times: user=0.06 sys=0.05, real=0.01 secs]
9604.461: [GC (Allocation Failure) [PSYoungGen: 2113094K->7415K(2043904K)] 2144361K->38690K(3133440K), 0.0187939 secs] [Times: user=0.11 sys=0.07, real=0.02 secs]
10864.545: [GC (Allocation Failure) [PSYoungGen: 2043639K->7287K(1977344K)] 2074914K->38570K(3066880K), 0.0131232 secs] [Times: user=0.12 sys=0.04, real=0.01 secs]

From yarn-timelineserver-gc.log:

2019-10-28 14:35:06,753 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_331 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,753 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_332 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,754 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_332 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,754 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_333 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,754 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_333 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,755 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_334 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,755 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1572286585508_0006_334 (TEZ_DAG_ID): 6
2019-10-28 14:35:06,755 INFO timeline.LogInfo (LogInfo.java:parseForStore(116)) - Parsed 1338 entities from hdfs://hdpnndev/ats/active/application_1572286585508_0006/appattempt_1572286585508_0006_000001/summarylog-appattempt_1572286585508_0006_000001 in 314 msec

From hadoop-yarn-resourcemanager-server.log:

TARGET=ClientRMService RESULT=SUCCESS
2019-10-28 14:36:08,043 INFO allocator.AbstractContainerAllocator (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - assignedContainer application attempt=appattempt_1572286585508_0005_000001 container=null queue=batchq1 clusterResource=<memory:411648, vCores:128> type=RACK_LOCAL requestedPartition=
2019-10-28 14:36:08,043 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(490)) - container_e247_1572286585508_0005_01_000398 Container Transitioned from NEW to ALLOCATED
2019-10-28 14:36:08,043 INFO fica.FiCaSchedulerNode (FiCaSchedulerNode.java:allocateContainer(169)) - Assigned container container_e247_1572286585508_0005_01_000398 of capacity <memory:3072, vCores:1> on host server:45454, which has 6 containers, <memory:70656, vCores:6> used and <memory:32256, vCores:26> available after allocation
2019-10-28 14:36:08,043 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(200)) - USER=hive OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1572286585508_0005 CONTAINERID=container_e247_1572286585508_0005_01_000398 RESOURCE=<memory:3072, vCores:1>
2019-10-28 14:36:08,043 INFO capacity.ParentQueue (ParentQueue.java:apply(1336)) - assignedContainer queue=batch usedCapacity=0.14925392 absoluteUsedCapacity=0.11940298 used=<memory:49152, vCores:13> cluster=<memory:411648, vCores:128>
2019-10-28 14:36:08,043 INFO capacity.ParentQueue (ParentQueue.java:apply(1336)) - assignedContainer queue=root usedCapacity=0.40298507 absoluteUsedCapacity=0.40298507 used=<memory:165888, vCores:17> cluster=<memory:411648, vCores:128>
2019-10-28 14:36:08,043 INFO capacity.CapacityScheduler (CapacityScheduler.java:tryCommit(2900)) - Allocation proposal accepted
2019-10-28 14:36:08,103 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(490)) - container_e247_1572286585508_0005_01_000398 Container Transitioned from ALLOCATED to ACQUIRED
2019-10-28 14:36:08,300 INFO allocator.AbstractContainerAllocator (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(129)) - assignedContainer application attempt=appattempt_1572286585508_0006_000001 container=null queue=batchq1 clusterResource=<memory:411648, vCores:128> type=OFF_SWITCH requestedPartition=
2019-10-28 14:36:08,300 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(490)) - container_e247_1572286585508_0006_01_000647 Container Transitioned from NEW to ALLOCATED
2019-10-28 14:36:08,300 INFO fica.FiCaSchedulerNode (FiCaSchedulerNode.java:allocateContainer(169)) - Assigned container container_e247_1572286585508_0006_01_000647 of capacity <memory:3072, vCores:1> on host server:45454, which has 5 containers, <memory:18432, vCores:5> used and <memory:84480, vCores:27> available after allocation
2019-10-28 14:36:08,300 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(200)) - USER=hive OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1572286585508_0006 CONTAINERID=container_e247_1572286585508_0006_01_000647 RESOURCE=<memory:3072, vCores:1>
2019-10-28 14:36:08,300 INFO capacity.ParentQueue (ParentQueue.java:apply(1336)) - assignedContainer queue=batch usedCapacity=0.15858229 absoluteUsedCapacity=0.12686567 used=<memory:52224, vCores:14> cluster=<memory:411648, vCores:128>
2019-10-28 14:36:08,300 INFO capacity.ParentQueue (ParentQueue.java:apply(1336)) - assignedContainer queue=root usedCapacity=0.41044775 absoluteUsedCapacity=0.41044775 used=<memory:168960, vCores:18> cluster=<memory:411648, vCores:128>
2019-10-28 14:36:08,300 INFO capacity.CapacityScheduler (CapacityScheduler.java:tryCommit(2900)) - Allocation proposal accepted
2019-10-28 14:36:08,354 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(490)) - container_e247_1572286585508_0005_01_000398 Container Transitioned from ACQUIRED to RELEASED
2019-10-28 14:36:08,354 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(200)) - USER=hive IP=10.10.81.14 OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1572286585508_0005 CONTAINERID=container_e247_1572286585508_0005_01_000398 RESOURCE=<memory:3072, vCores:1>
2019-10-28 14:36:08,354 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updatePendingResources(367)) - checking for deactivate of application :application_1572286585508_0005
2019-10-28 14:36:08,485 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(490)) - container_e247_1572286585508_0006_01_000647 Container Transitioned from ALLOCATED to ACQUIRED
2019-10-28 14:36:08,736 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updatePendingResources(367)) - checking for deactivate of application :application_1572286585508_0006
2019-10-28 14:36:08,987 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updatePendingResources(367)) - checking for deactivate of application :application_1572286585508_0006

These are not the complete logs, just a glimpse; I hope they help spark an idea. It gives me the impression this is a heap memory issue, but the AppTimelineServer Java heap size is already 8 GB, so any thoughts are appreciated. Regards!
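For what it's worth, the heap the embedded ats-hbase daemon actually runs with can be read straight off the GC log's -XX:MaxHeapSize flag (in bytes), and it need not match the AppTimelineServer heap setting. A minimal conversion sketch, using the value from the pasted gc.log:

```shell
# Convert the -XX:MaxHeapSize value from the gc.log above (bytes) to GiB.
# 3435134976 is copied from the pasted GC log, not from live config.
max_heap_bytes=3435134976
awk -v b="$max_heap_bytes" 'BEGIN { printf "%.1f GiB\n", b / (1024 * 1024 * 1024) }'
```

So even with AppTimelineServer at 8 GB, the embedded HBase here appears capped around 3.2 GiB; the ats-hbase heap is tuned separately (typically via the yarn-hbase-env configs in Ambari, if I remember right).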
10-25-2019
07:45 AM
Ambari 2.7.3, HDP 3.1.0. Hello team. My service went down 15 days ago and I'm still not able to fix this issue. If someone has faced it before, please let me know.

From the latest logs:

Caused by: java.net.ConnectException: Call to atsserver/ip:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: atsserver/ip:17020 at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:165) at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390) at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95) at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410) at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406) at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103) at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118) at org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92) at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329) at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315) at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:307) at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1377) at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329) at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315) at 
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:929) at org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179) at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$500(NettyRpcConnection.java:71) at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:269) at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:263) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122) at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327) at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343) at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633) at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) ... 1 more Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: atserver/ip:17020 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at org.apache.hbase.thirdparty.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323) at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ... 7 more Caused by: java.net.ConnectException: Connection refused ... 11 more 2019-10-10 10:03:05,448 ERROR reader.TimelineReaderServer (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM 2019-10-10 10:03:05,457 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.w.WebAppContext@7357a011{/,null,UNAVAILABLE}{/timeline} 2019-10-10 10:03:05,460 INFO server.AbstractConnector (AbstractConnector.java:doStop(318)) - Stopped ServerConnector@1bae316d{HTTP/1.1,[http/1.1]}{0.0.0.0:8198} 2019-10-10 10:03:05,460 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@4bff7da0{/static,jar:file:/usr/hdp/3.1.0.0-78/hadoop-yarn/hadoop-yarn-common-3.1.1.3.1.0.0-78.jar!/webapps/static,UNAVAILABLE} 2019-10-10 10:03:05,460 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@3e62d773{/logs,file:///usr/logs/hadoop-yarn/yarn/,UNAVAILABLE} 2019-10-10 10:03:05,462 INFO storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:serviceStop(108)) - closing the hbase Connection 2019-10-10 10:03:05,462 INFO zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x783a467b to zookeeper:2181,zookeeper:2181,zookeeper:2181,zookeeper:2181,zookeeper:2181 2019-10-10 10:03:05,465 
INFO zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x0ba2f4ec to zookeeper quorum 2019-10-10 10:03:05,467 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x26daa4f60630022 closed 2019-10-10 10:03:05,467 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down 2019-10-10 10:03:05,468 INFO reader.TimelineReaderServer (LogAdapter.java:info(51)) - SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down TimelineReaderServer at atsserver/ip ************************************************************/

I've followed the steps in this workaround and I am getting a new error message: https://community.cloudera.com/t5/Support-Questions/ATS-hbase-does-not-seem-to-start/td-p/235155 (guillaume_roger). We used to have this cluster kerberized, but it hasn't been for a few months now; that's why this message is weird:

2019-10-25 08:58:31,650 - HdfsResource['/atsv2/hbase/data'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/3.1.0.0-78/hadoop/bin', 'keytab': '', 'dfs_type': 'HDFS', 'default_fs': 'hdfs://hdpnndev', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': '', 'user': 'hdfs', 'owner': 'yarnats', 'hadoop_conf_dir': '/usr/hdp/3.1.0.0-78/hadoop/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/mr-history/done', u'/warehouse/tablespace/managed/hive', u'/warehouse/tablespace/external/hive', u'/app-logs', u'/tmp']}
2019-10-25 08:58:31,652 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://masternode:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpZs_d1F 2>/tmp/tmpNf5VJV''] {'quiet': False}

Then it got stuck for several minutes on this part:

2019-10-25 08:58:31,791 - call returned (0, '')
2019-10-25 08:58:31,792 - get_user_call_output returned (0, u'{\n "beans" : [ {\n "name" : "Hadoop:service=NameNode,name=FSNamesystem",\n "modelerType" : "FSNamesystem",\n "tag.Context" : "dfs",\n "tag.HAState" : "active",\n "tag.TotalSyncTimes" : "103 16 ",\n "tag.Hostname" : "zookeeper",\n "MissingBlocks" : 0,\n "MissingReplOneBlocks" : 0,\n "ExpiredHeartbeats" : 0,\n "TransactionsSinceLastCheckpoint" : 162564,\n "TransactionsSinceLastLogRoll" : 97,\n "LastWrittenTransactionId" : 136802255,\n "LastCheckpointTime" : 1571986118000,\n "CapacityTotal" : 94452535787520,\n "CapacityTotalGB" : 87966.0,\n "CapacityUsed" : 17431687303220,\n "CapacityUsedGB" : 16235.0,\n "CapacityRemaining" : 72211133116924,\n "ProvidedCapacityTotal" : 0,\n "CapacityRemainingGB" : 67252.0,\n "CapacityUsedNonDFS" : 0,\n "TotalLoad" : 128,\n "SnapshottableDirectories" : 1,\n "Snapshots" : 1,\n "NumEncryptionZones" : 0,\n "LockQueueLength" : 0,\n "BlocksTotal" : 985264,\n "NumFilesUnderConstruction" : 32,\n "NumActiveClients" : 21,\n "FilesTotal" : 1560325,\n "PendingReplicationBlocks" : 0,\n "PendingReconstructionBlocks" : 0,\n "UnderReplicatedBlocks" : 20,\n "LowRedundancyBlocks" : 20,\n "CorruptBlocks" : 0,\n "ScheduledReplicationBlocks" : 0,\n "PendingDeletionBlocks" : 597,\n "LowRedundancyReplicatedBlocks" : 20,\n "CorruptReplicatedBlocks" : 0,\n "MissingReplicatedBlocks" : 0,\n "MissingReplicationOneBlocks" : 0,\n "HighestPriorityLowRedundancyReplicatedBlocks" : 0,\n "HighestPriorityLowRedundancyECBlocks" : 0,\n "BytesInFutureReplicatedBlocks" : 0,\n "PendingDeletionReplicatedBlocks" : 597,\n "TotalReplicatedBlocks" : 985264,\n "LowRedundancyECBlockGroups" : 0,\n "CorruptECBlockGroups" : 0,\n "MissingECBlockGroups" : 0,\n "BytesInFutureECBlockGroups" : 0,\n "PendingDeletionECBlocks" : 0,\n "TotalECBlockGroups" : 0,\n "ExcessBlocks" : 0,\n "NumTimedOutPendingReconstructions" : 0,\n "PostponedMisreplicatedBlocks" : 24,\n "PendingDataNodeMessageCount" : 0,\n "MillisSinceLastLoadedEdits" : 
0,\n "BlockCapacity" : 2097152,\n "NumLiveDataNodes" : 4,\n "NumDeadDataNodes" : 0,\n "NumDecomLiveDataNodes" : 0,\n "NumDecomDeadDataNodes" : 0,\n "VolumeFailuresTotal" : 0,\n "EstimatedCapacityLostTotal" : 0,\n "NumDecommissioningDataNodes" : 0,\n "StaleDataNodes" : 0,\n "NumStaleStorages" : 48,\n "TotalSyncCount" : 76,\n "NumInMaintenanceLiveDataNodes" : 0,\n "NumInMaintenanceDeadDataNodes" : 0,\n "NumEnteringMaintenanceDataNodes" : 0\n } ]\n}', u'')
2019-10-25 08:58:31,793 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://yarnatsserver:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpmLa9e3 2>/tmp/tmpcGNRPR''] {'quiet': False}

Finally, it finishes with this message:

2019-10-25 09:17:53,022 - call returned (143, "-bash: line 1: 28392 Terminated curl -s 'http://yarnatsserver:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' > /tmp/tmpmLa9e3 2> /tmp/tmpcGNRPR")
2019-10-25 09:17:53,022 - call['hdfs haadmin -ns hdpnndev -getServiceState nn2'] {'logoutput': True, 'user': 'hdfs'}
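In case it helps anyone landing here, the recovery pattern I've seen for a wedged embedded ats-hbase on HDP 3.x looks roughly like the commands below. This is a sketch based on the community workaround linked above, not an official procedure; it throws away the ATSv2 history, and the znode and HDFS paths are the unsecured-cluster names seen in these logs, so verify each one against your own cluster before running anything.

```shell
# Destructive sketch (ATSv2 history is lost); paths/znode names assumed
# from the logs above, not verified against any particular cluster.

# 1) Stop YARN (including Timeline Service 2.0) from Ambari first.

# 2) Drop the stale znode left behind by the embedded ats-hbase:
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181 rmr /atsv2-hbase-unsecure

# 3) Move the old ats-hbase data aside in HDFS (move, don't delete):
sudo -u hdfs hdfs dfs -mv /atsv2/hbase /atsv2/hbase.backup

# 4) Start YARN again; Ambari re-runs TimelineSchemaCreator at startup.
```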
10-21-2019
09:30 AM
Hello dnavarro. Thank you for asking. Actually, my question came up because the total HDFS size in the output of the hdfs dfs -du command wasn't matching the size I was seeing in the Ambari UI; there was a significant difference between them. In the end, it turned out that the difference was because, in HDP 2.6.5, the hdfs dfs -du command does not take snapshot sizes into account; hence the mismatch I had been chasing from the beginning, and why I needed that output. Just keep in mind that, for versions greater than 2.6.5, you need to take snapshot sizes separately if you want to check the entire HDFS size 😉
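A quick way to sanity-check this on any version is to compare the two totals directly; the gap should roughly equal the snapshot-held data. A toy sketch with hypothetical numbers (not taken from a real cluster):

```shell
# Hypothetical totals for illustration only: 'ui_gb' as shown by the
# Ambari UI, 'du_gb' as reported by hdfs dfs -du -s (snapshots excluded).
ui_gb=1433
du_gb=360
echo "snapshot-held data ~ $(( ui_gb - du_gb )) GB"
```

On Hadoop releases that include HDFS-8986 there is also a -x flag for hdfs dfs -du that excludes snapshot usage explicitly, which makes the comparison easier.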
09-13-2019
01:55 PM
I am trying to get this output. Some articles point out that this is due to a version issue, which I doubt, since I've seen this behavior on multiple environments with different versions; I think I just don't know how to enable it.
If someone can point me to the right path I would highly appreciate it.
Regards!
Labels: Apache HBase
08-07-2019
03:54 PM
1 Kudo
Is there any sort of formula, or how did you come up with this value for the user's processes? Is it a random value? What can I check within my cluster in order to get a proper value for my case?
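For anyone comparing against their own boxes: the effective per-user process limit and the current consumption can both be read directly, which at least grounds whatever value you pick in observed usage. A read-only sketch (the yarn account below is a placeholder for whichever service user you care about):

```shell
# Read-only check: the nproc limit in effect for this shell, and how many
# lightweight processes (threads) a given service account is using now.
# 'yarn' is a placeholder user; substitute your own service account.
limit=$(ulimit -u)
threads=$(ps -L -u yarn --no-headers 2>/dev/null | wc -l)
echo "nproc limit: $limit, yarn threads in use: $threads"
```

If observed usage sits anywhere near the limit under load, that is a better argument for raising it than any fixed number.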
03-29-2019
10:25 PM
Can anyone help us find an official site to get this jar from, so we can make it work with TIBCO? Regards!
Labels: Apache Hadoop
01-07-2019
09:02 PM
In addition: within my cluster the port was assigned to 9090. After getting the PID from the above command, I just ran netstat -tuplen | grep <pid> and got the port where the UI is running.
10-11-2018
09:23 PM
This is a known behavior in the Hortonworks distribution; check the discussion: https://issues.apache.org/jira/browse/HDFS-8986
Regards!
10-11-2018
03:34 PM
Hello team.
A direct read of the directory involved gives us:
hadoop fs -du -h -s </path/dir>
1.4 T </path/dir>
However, this is not accurate, because when we take the size of each file and directory inside that directory and add them up in Excel, we get
386574657917
which translated to GB is about 360 GB, the expected size of this directory. Has someone faced this before? Regards!
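As a sanity check on the arithmetic, the summed byte count does convert to roughly the expected size:

```shell
# Convert the summed byte count from the post above to GB (1024^3 bytes).
bytes=386574657917
awk -v b="$bytes" 'BEGIN { printf "%.0f GB\n", b / (1024 * 1024 * 1024) }'
```

So the per-file addition is self-consistent; the mismatch with du was later attributed to snapshot accounting (see the 10-11-2018 reply referencing HDFS-8986).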
Labels: Apache Hadoop
09-13-2018
09:17 PM
Also check https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html https://community.hortonworks.com/questions/5780/hive-on-tez-query-map-output-outofmemoryerror-java.html https://community.hortonworks.com/questions/12067/what-is-the-workaround-when-getting-hive-outofmemo.html
09-13-2018
08:58 PM
What I understand from the documentation (I am applying the same changes) and from previous experience is that some jobs have too many inputs and outputs for the system to handle (mappers, containers, reducers, etc.). This hurts your cluster's performance, since resources are not enough for such jobs to complete, or they simply start to show a considerable delay compared to what is expected. I've seen that that's the moment when the OOM error shows up. That said, I think the explanation from SmartSense is clear, since it tells us: "Enabling auto-scaling scales down requests, which may otherwise be too large." Hope this helps 😉
09-13-2018
06:41 PM
Hello team. I've just applied the recommended SmartSense setting to make my dfs.namenode.checkpoint.period more efficient, and I want to measure how much faster my NameNode now recovers from a restart than it did before. Where can I get an old reading of the NameNode's last restart and how much time it took? Regards!
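One low-tech way to measure this is to diff the timestamps in the NameNode log between process start (the STARTUP_MSG banner) and the moment it leaves safe mode. A self-contained sketch, where the two timestamps are made up to stand in for the real log lines (on a real cluster, grep them out of hadoop-hdfs-namenode-<host>.log):

```shell
# Hypothetical timestamps standing in for the NameNode log's STARTUP_MSG
# line and its "Safe mode is OFF" line; GNU date parses them to epoch
# seconds so the difference is the startup duration.
start='2018-09-13 06:10:05'
safe_off='2018-09-13 06:14:47'
elapsed=$(( $(date -d "$safe_off" +%s) - $(date -d "$start" +%s) ))
echo "startup took ${elapsed}s"
```

Running it against the logs from before and after the checkpoint-period change gives the before/after comparison you're after.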
Labels: Apache Hadoop
09-13-2018
06:39 PM
Below is the answer, in case someone needs it: "Each handler (worker) thread consume resources/memory on the NN.
We can not set this value too high because it will consume unnecessary resources and cause extra burden on the NN."
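On the sizing question that usually follows this answer: the heuristic commonly quoted for HDP is dfs.namenode.handler.count ≈ 20 × ln(number of DataNodes). Treat it as a starting point rather than an official rule; a sketch:

```shell
# Common HDP heuristic for dfs.namenode.handler.count: 20 * ln(DataNodes).
# 'datanodes' is an example value, not read from any cluster.
datanodes=4
awk -v n="$datanodes" 'BEGIN { printf "%d\n", 20 * log(n) }'
```

For a 4-DataNode cluster that suggests roughly 27 handlers; the ceiling matters because, as the answer above says, each handler thread holds NameNode memory whether or not it is busy.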
09-12-2018
03:43 PM
I am following the SmartSense recommendation to make my cluster operationally more efficient. However, while there is a lot of documentation on how to set the mentioned feature within Ambari and on why handlers are important for RPC, no one mentions why setting this feature "too high" affects a cluster. Can anyone share some thoughts here? Regards!
Labels: Apache Hadoop, Hortonworks SmartSense
09-03-2018
09:56 PM
According to the HWX team, this is a known bug: https://issues.apache.org/jira/browse/AMBARI-23309 and https://hortonworks.jira.com/browse/BUG-92535 (internal link)
09-03-2018
06:46 PM
Hello all. I have installed Ambari 2.6.0.9 with HDP 2.6.3.7 on my 3 environments (dev, QA, and prod), and none of the 3 shows the mentioned widget within the YARN service (check the attached image). Any ideas? I have not found any information yet. Regards!
Labels: Apache Ambari, Apache YARN
06-05-2018
05:46 PM
It worked for me. Thanks!
05-16-2018
11:37 PM
Hey Geoffrey, even though it worked, I kept monitoring it for a while and the metrics went away again, but this time with a different message:

2018-05-16 22:53:52,754 INFO TimelineMetricHostAggregatorHourly: End aggregation cycle @ Wed May 16 22:53:52 UTC 2018
2018-05-16 22:54:10,428 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:54:20,432 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:54:30,437 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:54:40,446 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:54:45,499 INFO TimelineClusterAggregatorSecond: Started Timeline aggregator thread @ Wed May 16 22:54:45 UTC 2018
2018-05-16 22:54:45,501 INFO TimelineClusterAggregatorSecond: Last Checkpoint read : Wed May 16 22:52:00 UTC 2018
2018-05-16 22:54:45,501 INFO TimelineClusterAggregatorSecond: Rounded off checkpoint : Wed May 16 22:52:00 UTC 2018
2018-05-16 22:54:45,501 INFO TimelineClusterAggregatorSecond: Last check point time: 1526511120000, lagBy: 165 seconds.
2018-05-16 22:54:45,501 INFO TimelineClusterAggregatorSecond: Start aggregation cycle @ Wed May 16 22:54:45 UTC 2018, startTime = Wed May 16 22:52:00 UTC 2018, endTime = Wed May 16 22:54:00 UTC 2018
2018-05-16 22:54:45,501 INFO TimelineClusterAggregatorSecond: Skipping aggregation for metric patterns : sdisk\_%,boottime
2018-05-16 22:54:50,453 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:00,460 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:10,462 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:20,463 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:30,473 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:40,476 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:55:50,487 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:56:00,490 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
2018-05-16 22:56:10,494 INFO org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting for 58080 actions to finish
and it is showing the above all the time. Any idea? I mean, the message is pretty clear, and it looks like my heap size is not enough for the amount of data the service is getting. This is what I have configured for my Metrics Collector heap size: metrics_collector_heapsize = 6144. If I have a cluster with 126 nodes, 106 of them with 899.50 GB configured capacity and 20 of them with 399.75 GB, what would be a fair amount of heap to assign to this service? Does a formula exist for this? Regards!
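As far as I know there is no exact formula; the Ambari Metrics tuning guidance sizes the collector heap by host count (and metric volume) in tiers. The sketch below encodes illustrative tier values; the function name and the specific numbers are my own approximation of that guidance, so verify them against the tuning guide for the AMS version you run:

```python
# Illustrative host-count tiers for metrics_collector_heapsize (MB).
# These numbers are an ASSUMPTION sketched from the Ambari Metrics
# tuning guidance, not authoritative values.
TIERS = [
    (50, 2048),    # up to 50 hosts
    (200, 4096),   # 51-200 hosts
    (500, 8192),   # 201-500 hosts
    (800, 12288),  # 501-800 hosts
]

def suggested_collector_heap_mb(num_hosts):
    for max_hosts, heap_mb in TIERS:
        if num_hosts <= max_hosts:
            return heap_mb
    return 16384  # very large clusters: consider AMS distributed mode

print(suggested_collector_heap_mb(126))  # -> 4096
```

By this rough yardstick, 6144 MB for a 126-node cluster is already generous, which suggests the bottleneck may be the embedded HBase heap or disk I/O rather than the collector heap itself.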
... View more
05-16-2018
10:02 PM
Thank you Geoffrey, it works; the logs are now behaving as expected, and so is the dashboard.
... View more
05-16-2018
09:08 PM
Problem: the Ambari Metrics dashboard within Ambari is not loading information (image attached: ambari-metrics-dashboard.png). Actions taken: Ambari Metrics service restart, Ambari Collector restart. I have also found the message below in the logs; therefore, I am trying to get the mentioned server off the blacklist, but I cannot find information about how.
2018-05-16 20:32:17,258 [WARNING] emitter.py:146 - Error sending metrics to server.
2018-05-16 20:32:17,258 [WARNING] emitter.py:111 - Retrying after 5 ...
2018-05-16 20:35:42,259 [WARNING] emitter.py:146 - Error sending metrics to server. ''
2018-05-16 20:35:42,260 [WARNING] emitter.py:111 - Retrying after 5 ...
2018-05-16 20:35:47,260 [WARNING] emitter.py:120 - Metric collector host <server_name> was blacklisted.
2018-05-16 20:35:47,260 [INFO] emitter.py:96 - No valid collectors found...
2018-05-16 20:36:47,269 [INFO] emitter.py:96 - No valid collectors found...
2018-05-16 20:37:47,273 [INFO] emitter.py:96 - No valid collectors found...
2018-05-16 20:38:47,276 [INFO] emitter.py:96 - No valid collectors found...
2018-05-16 20:39:47,280 [INFO] emitter.py:96 - No valid collectors found...
2018-05-16 20:40:47,283 [INFO] emitter.py:154 - Calculated collector shard based on hostname : <server_name>
2018-05-16 20:44:07,284 [WARNING] emitter.py:146 - Error sending metrics to server. ''
2018-05-16 20:44:07,285 [WARNING] emitter.py:111 - Retrying after 5 ...
2018-05-16 20:44:12,285 [WARNING] emitter.py:146 - Error sending metrics to server.
2018-05-16 20:44:12,285 [WARNING] emitter.py:111 - Retrying after 5 ...
2018-05-16 20:47:37,286 [WARNING] emitter.py:146 - Error sending metrics to server. ''
2018-05-16 20:47:37,287 [WARNING] emitter.py:111 - Retrying after 5 ...
... View more
Labels:
- Apache Ambari
04-27-2018
05:14 PM
On which node would this file be located? Or where in Ambari can it be found?
... View more
04-27-2018
02:51 PM
This is no longer an issue: somebody ran Hive in safe mode, so a beeline.properties file was created. Check the link below; it gave me the root cause. To fix my specific issue, I needed to either delete the created beeline.properties file, or remove or comment out the property below within the beeline.properties file located at /home/<user_account>/.beeline/beeline.properties:
beeline.hiveconfvariables={}
https://issues.apache.org/jira/browse/HIVE-16116
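A minimal sketch of the comment-out variant of this workaround; the function name is my own, and a temp file stands in for the real ~/.beeline/beeline.properties purely for illustration:

```python
import os
import tempfile

def comment_out_property(path, prop="beeline.hiveconfvariables"):
    """Prefix the offending property with '#' so Beeline ignores it,
    keeping the rest of the file intact."""
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        for line in lines:
            f.write("#" + line if line.startswith(prop) else line)

# Demo against a stand-in file (on a real edge node, target
# /home/<user_account>/.beeline/beeline.properties instead)
tmp = tempfile.NamedTemporaryFile("w", suffix=".properties", delete=False)
tmp.write("beeline.hiveconfvariables={}\n")
tmp.close()
comment_out_property(tmp.name)
print(open(tmp.name).read())  # -> #beeline.hiveconfvariables={}
os.unlink(tmp.name)
```

Commenting the line out rather than deleting the whole file preserves any other per-user Beeline preferences stored there.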
... View more
04-24-2018
08:27 PM
I have 2 edge nodes where I can do this; one is working properly and the other is not. What should I check in order to clear the posted message when accessing Hive through Beeline? This is the one that is not working:
beeline -n srvc_ima_platform -u 'jdbc:hive2://server:2181,server:2181,server:2181/ea_fin;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;?tez.queue.name=srvc_platform' --showHeader=false --outputformat=tsv2 --hiveconf hive.fetch.task.conversion=none
Exception in thread "main" java.lang.NullPointerException
at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:677)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:777)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:491)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:474)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
And this is the one working properly; both command lines are exactly the same:
beeline -n srvc_ima_platform -u 'jdbc:hive2://server:2181,server:2181,server:2181/ea_fin;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;?tez.queue.name=srvc_platform' --showHeader=false --outputformat=tsv2 --hiveconf hive.fetch.task.conversion=none
Connecting to jdbc:hive2://server:2181,server:2181,server:2181/ea_fin;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;?tez.queue.name=srvc_platform
Connected to: Apache Hive (version 1.2.1000.2.6.0.24-2)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.6.0.3-8 by Apache Hive
0: jdbc:hive2://server:2>
Regards!!
... View more
Labels:
- Apache Hive
04-19-2018
08:55 PM
We left the settings as stated, and the cluster keeps working fine.
... View more
03-24-2018
12:08 AM
HADOOP_NFS3_OPTS="-Xmx{{nfsgateway_heapsize}}m -Dhadoop.security.logger=ERROR,DRFAS ${HADOOP_NFS3_OPTS}"
... View more
Labels:
- Apache Hadoop
03-23-2018
08:48 PM
1 Kudo
Hello team. I've found that we have some items already set within hdfs-site.xml under different names than the ones indicated in the HWX documents. Therefore:
1. I would like to know the difference between the two names for each item.
2. Would it be better to add the recommended names to hdfs-site.xml alongside the existing settings, i.e. duplicate the values under the different names?
3. Or should the current settings be removed and the new ones (the ones indicated in the document) set instead?
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_hdfs-nfs-gateway-user-guide/content/user-guide-hdfs-nfs-instructions.html
For question 3, I would say this depends on my organization's requirements, but I would like to read your ideas.
Current = nfs.kerberos.principal, Recommended = dfs.nfsgateway.kerberos.principal
Current = nfs.keytab.file, Recommended = dfs.nfsgateway.keytab.file
Regards!!
... View more
02-13-2018
11:28 PM
Actually, clearing cookies and restarting my browser was enough for me, since all previous steps were properly set.
... View more
02-11-2018
06:30 PM
I've seen it depends on the services that are failing. E.g., the steps below worked for me when HDFS or the NodeManager gives the mentioned message, but when I have all services stopped on a single node due to this message, they do not work (I am still figuring that case out).
ambari-agent stop
mv /var/lib/ambari-agent/data/structured-out-status.json /var/lib/ambari-agent/data/structured-out-status.json-old
ambari-agent start
Regards!
... View more
01-26-2018
11:35 PM
Any luck on this? I am also getting a 401 error, and my credentials are fine as well.
... View more
12-11-2017
06:28 PM
We are observing that the mentioned process is consuming high CPU resources on a particular server. Going through the cluster looking for the same process, I've found it on 2 other servers, running under a different user, and they are not even close to the consumption level of the one I am in doubt about.
What I want to know is whether this is normal behavior and whether I shouldn't worry about it.
server_1_[root@server_1 ~]# top -p 16193
top - 18:04:58 up 479 days, 21:34, 2 users, load average: 5.79, 3.53, 2.51
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 20.4 us, 4.0 sy, 0.0 ni, 75.1 id, 0.0 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem : 26384640+total, 11807649+free, 26398576 used, 11937132+buff/cache
KiB Swap: 4194556 total, 4194556 free, 0 used. 23283172+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16193 alipiah 20 0 15.861g 2.012g 30100 S 0.7 0.8 14:42.70 java
**********************************************************************************
server_2_[root@server_2 ~]# top -p 9864
top - 18:06:52 up 479 days, 21:36, 2 users, load average: 0.39, 0.47, 0.49
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 0.3 sy, 0.0 ni, 98.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 26384640+total, 66057520 free, 26081160 used, 17170771+buff/cache
KiB Swap: 4194556 total, 4194556 free, 0 used. 23298083+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9864 alipiah 20 0 15.857g 438752 30048 S 0.0 0.2 3:12.83 java
**********************************************************************************
server_3_[root@server_3 ~]# top -p 344095
top - 18:08:43 up 479 days, 21:38, 2 users, load average: 38.80, 35.37, 32.46
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 81.3 us, 1.9 sy, 0.0 ni, 16.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 26384640+total, 19629884 free, 42436532 used, 20177996+buff/cache
KiB Swap: 4194556 total, 4194556 free, 0 used. 21663824+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
344095 root 20 0 29.954g 8.888g 23968 S 2129 3.5 82702:17 java
... View more
Labels:
- Apache Pig
- Hortonworks SmartSense