Member since: 12-02-2014
Posts: 8
Kudos Received: 1
Solutions: 1

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2490 | 04-16-2021 06:19 PM |
07-16-2024 03:25 PM
@GangWar @wert_1311 I have found HDFS files that are persistently under-replicated despite being over a year old. They are rare, but a single disk failure would lose them. To be clear, `hdfs dfs -ls filename` shows the replication target, not the actual replica count; the actual count can be found with `hdfs fsck filename -files -blocks`.

In theory this situation should be transient, but I have found cases where it persists. In the example below, the file is 3 blocks long and one block has only one live replica:

```
# hdfs fsck -blocks -files /tmp/part-m-03752

/tmp/part-m-03752: Under replicated BP-955733439-1.2.3.4-1395362440665:blk_1967769468_1100461809792. Target Replicas is 3 but found 1 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).
/tmp/part-m-03752: Replica placement policy is violated for BP-955733439-1.2.3.4-1395362440665:blk_1967769468_1100461809792. Block should be additionally replicated on 1 more rack(s).
0. BP-955733439-1.2.3.4-1395362440665:blk_1967769089_1100461809406 len=134217728 Live_repl=3
1. BP-955733439-1.2.3.4-1395362440665:blk_1967769276_1100461809593 len=134217728 Live_repl=3
2. BP-955733439-1.2.3.4-1395362440665:blk_1967769468_1100461809792 len=40324081 Live_repl=1

Status: HEALTHY
 Total size:                    308759537 B
 Total dirs:                    0
 Total files:                   1
 Total symlinks:                0
 Total blocks (validated):      3 (avg. block size 102919845 B)
 Minimally replicated blocks:   3 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       1 (33.333332 %)
 Mis-replicated blocks:         1 (33.333332 %)
 Default replication factor:    3
 Average block replication:     2.3333333
 Corrupt blocks:                0
 Missing replicas:              2 (22.222221 %)
 Number of data-nodes:          30
 Number of racks:               3

The filesystem under path '/tmp/part-m-03752' is HEALTHY
```

```
# hadoop fs -ls /tmp/part-m-03752
-rw-r--r--   3 wuser hadoop  308759537 2021-12-11 16:58 /tmp/part-m-03752
```

Presumably the file was incorrectly replicated when it was written because of some failure, and the defaults for the `dfs.client.block.write.replace-datanode-on-failure` properties were such that new DataNodes were not obtained at write time to replace the ones that failed. The puzzling thing is why the block does not get re-replicated after all this time.
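For reference, a common workaround for a stuck under-replicated block (a sketch, not a confirmed fix for the root cause above) is to nudge the NameNode by changing the file's replication target and then restoring it:

```bash
# Hypothetical workaround: raise the replication target, wait for the
# new replicas to be created, then restore the original factor of 3.
hdfs dfs -setrep -w 4 /tmp/part-m-03752   # -w blocks until replication completes
hdfs dfs -setrep 3 /tmp/part-m-03752      # restore the original target

# Re-check the actual live replica counts afterwards:
hdfs fsck /tmp/part-m-03752 -files -blocks
```

Note that `-w` can hang if the cluster cannot satisfy the higher target, so it is worth confirming enough racks and DataNodes are available first.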
04-16-2021 06:19 PM
Update: I moved SM to a host that has a typical load of 7-8 instead of 24. After a day on the new machine, no alerts about SM being slow have been generated and there are no gaps in the charts. Conclusion: the problem was load; SM works best on a machine with low load.
04-14-2021 06:05 PM
Update: The load went down to a reasonable level (24), so CPU starvation is not happening, but Service Monitor is still losing data from time to time, with 5-30 minute gaps. The disk it uses is striped RAID and is not used by YARN, so I don't think the issue can be disk performance.
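A quick way to back up the "not disk" conclusion on the SM host (a hypothetical diagnostic session, not from the original post) is to sample extended device statistics while a gap is occurring:

```bash
# Sample extended device stats every 5 seconds. Sustained high await or
# %util on the device backing /var/lib/cloudera-service-monitor would
# point back at disk; consistently low values support ruling it out.
iostat -x 5
```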
04-13-2021 02:30 PM
Some more info: I see WARNs like:

```
JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 28577ms
```

but gcutil shows:

```
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  63.51  80.24   7.41  97.94  94.86   5073  347.717      6    1.950  349.668
```

Old gen (O) is only 7.41% used, so the process is not out of heap. That means "JVM not scheduled" must be the condition.
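For anyone reproducing this check, a sketch of how the two observations above can be gathered side by side (assumes the Service Monitor JVM is the "firehose" process, which may differ per deployment):

```bash
# Assumption: SM's JVM is identifiable by "firehose" in its command line.
SM_PID=$(pgrep -f firehose | head -1)

# Sample GC utilization every 5 seconds; if old gen (O) stays low while
# long pauses are still reported, GC is not the cause.
jstat -gcutil "$SM_PID" 5000 &

# Watch the host run queue; a run queue (r) far above the core count
# means the JVM can sit descheduled for long stretches.
vmstat 5
```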
04-13-2021 12:49 PM
I am seeing frequent Cloudera Manager Service Monitor outages:

```
SERVICE_MONITOR_PAUSE_DURATION has become bad: Average time spent paused was 39.5 second(s) (65.76%) per minute over the previous 5 minute(s).
```

despite increasing the heap size to 7g and the off-heap size to 24g. The machine often sees a high load (a NodeManager is also on the same machine), such as 90 on a 24-core machine, so I suspect SM might be starved of CPU when doing aggregation. The process regularly has 700+ files open. I am motivated to fix this because it causes data loss in the time series: SM pulls data and at times misses data points for 15+ minutes. The WARN

```
AggregatingTimeSeriesStore: run duration exceeded desired period
```

is correlated with the above.

Is there a documented procedure to move Service Monitor to another machine while keeping existing data? Perhaps like:

0. Stop SM to quiesce changes to /var/lib/cloudera-service-monitor/ts/.
1. Using CM, redefine SM on another host.
2. Move the /var/lib/cloudera-service-monitor/ts/ contents before starting SM (a sketch of this step follows below).
3. Start SM.

SM uses LevelDB, but I don't know its internals or whether /var/lib/cloudera-service-monitor/ts/ can simply be moved. I don't want to lose the 1 month of history I have.
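A minimal sketch of step 2, assuming the LevelDB store is portable once SM is fully stopped (verify with Cloudera documentation or support before relying on this; hostnames here are hypothetical):

```bash
# Run on the old host, after stopping the Service Monitor role in CM.
NEW_HOST=newsm.example.com                      # hypothetical new SM host
TS_DIR=/var/lib/cloudera-service-monitor

# Copy the time-series store intact, preserving permissions and mtimes.
rsync -a --delete "$TS_DIR/" "$NEW_HOST:$TS_DIR/"

# On the new host, ownership must match the SM process user before start.
ssh "$NEW_HOST" "chown -R cloudera-scm:cloudera-scm $TS_DIR"
```

LevelDB keeps its state in ordinary files, so a byte-for-byte copy taken while the process is stopped is generally consistent, but that property is the assumption this sketch rests on.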
Labels:
- Cloudera Manager