
system auto reboot When MR runs

Explorer

Hi

I'm using HDP 2.3.4.7 with cgroups enabled on Red Hat 7.1.

The Hadoop version is 2.7.1.

When I run large MR jobs, some NodeManager machines reboot automatically.

If I use DefaultLCEResourcesHandler instead of CgroupsLCEResourcesHandler, the MR jobs run fine.

/var/crash/127.0.0.1-2016.04.23-21:52:08/vmcore-dmesg.txt contains the following:

CPU: 29 PID: 63957 Comm: java Not tainted 3.10.0-229.el7.x86_64 #1

...

...

[15770.097168] Call Trace:

[15770.097536] [<ffffffff810afe39>] ? pick_next_task_fair+0x129/0x1d0

[15770.097905] [<ffffffff81608b97>] __schedule+0x127/0x7c0

[15770.098271] [<ffffffff81609259>] schedule+0x29/0x70

[15770.098633] [<ffffffff810d2293>] futex_wait_queue_me+0xd3/0x130

[15770.098992] [<ffffffff810d2e09>] futex_wait+0x179/0x280

[15770.099353] [<ffffffff8101b983>] ? native_sched_clock+0x13/0x80

[15770.099698] [<ffffffff8101b9f9>] ? sched_clock+0x9/0x10

[15770.100057] [<ffffffff810addfe>] ? sched_slice.isra.51+0x5e/0xc0

[15770.100419] [<ffffffff810ad7b8>] ? __enqueue_entity+0x78/0x80

[15770.100783] [<ffffffff810d4e9e>] do_futex+0xfe/0x5b0

[15770.101143] [<ffffffff810a8f44>] ? wake_up_new_task+0x104/0x160

[15770.101496] [<ffffffff810d53d0>] SyS_futex+0x80/0x180

[15770.101852] [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b

Any suggestions would be appreciated. Thanks.
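
For anyone reproducing this, here is a rough sketch for confirming which resources handler a NodeManager is configured with. It reads the `yarn.nodemanager.linux-container-executor.resources-handler.class` property from yarn-site.xml. The file location `/etc/hadoop/conf/yarn-site.xml`, the `YARN_SITE` override variable, and the helper name `yarn_property_value` are assumptions made for illustration; adjust them for your cluster.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# Assumes <name> and <value> sit on the same or adjacent lines, as in a
# typical yarn-site.xml. Hypothetical helper, not part of Hadoop.
yarn_property_value() {
    grep -A2 "<name>$2</name>" "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
        | head -n 1
}

# Assumed config location; override by setting YARN_SITE.
conf="${YARN_SITE:-/etc/hadoop/conf/yarn-site.xml}"
prop="yarn.nodemanager.linux-container-executor.resources-handler.class"

if [ -r "$conf" ]; then
    echo "resources handler: $(yarn_property_value "$conf" "$prop")"
else
    echo "cannot read $conf"
fi
```

A value ending in `CgroupsLCEResourcesHandler` corresponds to the configuration that reboots here; `DefaultLCEResourcesHandler` is the one that runs fine.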


Re: system auto reboot When MR runs

Super Guru

@ww sun

Can you please check whether you have disabled THP (Transparent Huge Pages)?

This looks like a kernel bug.

Re: system auto reboot When MR runs

Explorer

@Kuldeep Kulkarni

Yes, I have manually disabled THP. Do I need to enable THP?

Thanks

cat /sys/kernel/mm/transparent_hugepage/defrag

always madvise [never]

cat /sys/kernel/mm/transparent_hugepage/enabled

always madvise [never]
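
In that output the bracketed word is the mode currently in effect, so both files above report `never`. A minimal sketch for pulling out just the active mode from each file (the helper name `thp_active_mode` is made up for illustration):

```shell
# Print the active mode (the bracketed word) from a THP sysfs line,
# e.g. "always madvise [never]" -> "never". Hypothetical helper.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active mode for both THP settings, if the files exist.
for f in enabled defrag; do
    path="/sys/kernel/mm/transparent_hugepage/$f"
    if [ -r "$path" ]; then
        echo "$f: $(thp_active_mode "$(cat "$path")")"
    else
        echo "$f: not available on this system"
    fi
done
```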

Re: system auto reboot When MR runs

@ww sun

For the sake of the community, have you identified the cause? Kuldeep advised that it was a kernel bug:

https://bugs.centos.org/print_bug_page.php?bug_id=7770

Can you accept a best response?
