Member since: 06-16-2016
Posts: 15
Kudos Received: 2
Solutions: 0
05-11-2020
01:02 AM
2020-05-11 15:43:16,239 ERROR [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container launch failed for container_e88_1589171925074_0626_01_000002 : java.net.SocketTimeoutException: Call From wo3pfhadl12w/10.11.105.123 to wo3pfhadl34w.woo.sing.com:45454 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.11.105.123:40884 remote=wo3pfhadl34w.woo.sing.com/10.11.105.149:45454]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:775)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1501)
    at org.apache.hadoop.ipc.Client.call(Client.java:1443)
    at org.apache.hadoop.ipc.Client.call(Client.java:1353)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy85.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy86.startContainers(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:160)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.11.105.123:40884 remote=wo3pfhadl34w.woo.sing.com/10.11.105.149:45454]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:554)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1802)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1167)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1063)
05-04-2017
05:42 PM
@Ahmed ELJAMI I see that your blueprint does not have such a property as jdbc_jar_name. Can you please share the ambari-server logs as well? What is your Ambari version? Can you also try changing, under "hive-env", the value of "hive_database" from "Existing MySQL Database" to "Existing MySQL / MariaDB Database"? Let me know if it works.
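To make that concrete, the hive-env fragment I have in mind would look roughly like this (a sketch based on the snippet quoted above; keep your other hive-env properties exactly as they are):

"hive-env": {
  "hive_database": "Existing MySQL / MariaDB Database"
}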
04-05-2017
04:54 PM
Hi Ahmed, I think there is an incorrect configuration attribute called jdbc_jar_name. You can try removing it and submitting the blueprint once more. If your blueprint isn't too large, maybe you can share it here.
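Assuming jdbc_jar_name sits in your hive-env block, the change is simply to drop that one key and keep everything else, for example (the jar file name below is only a placeholder, not something from your blueprint):

Before:
"hive-env": {
  "hive_database": "Existing MySQL Database",
  "jdbc_jar_name": "mysql-connector-java.jar"
}

After:
"hive-env": {
  "hive_database": "Existing MySQL Database"
}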
11-15-2016
10:17 AM
Hi, we activated CGroups using Ambari. When I run a Hive job, two nodes of my cluster go down and I see errors in the Linux kernel log (see below).
HDP version: 2.3.4
Kernel version: 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Any ideas, please?
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.475150] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.475409] IP: [<ffffffff813700d1>] rb_next+0x1/0x50
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.475574] PGD 0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.475647] Oops: 0000 [#1] SMP
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.475757] Modules linked in: udp_diag tcp_diag inet_diag x86_pkg_temp_thermal intel_powerclamp 8021q garp iomemory_vsl(POX) stp mrp llc coretemp kvm crct10dif_pclmul gpio_ich crc32_pclmul mei_me mei sb_edac joydev edac_core shpchp wmi dcdbas aesni_intel acpi_power_meter aes_x86_64 lrw gf128mul glue_helper ablk_helper lpc_ich cryptd ipmi_watchdog ipmi_poweroff mac_hid ipmi_devintf ipmi_si hid_generic usbhid hid igb ixgbe i2c_algo_bit dca ptp megaraid_sas pps_core mdio
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.477273] CPU: 12 PID: 0 Comm: swapper/12 Tainted: P OX 3.13.0-96-generic #143-Ubuntu
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.477559] Hardware name: Dell Inc. PowerEdge R720xd/0HJK12, BIOS 2.4.3 07/09/2014
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.477811] task: ffff881003e81800 ti: ffff880803c0c000 task.ti: ffff880803c0c000
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.478057] RIP: 0010:[<ffffffff813700d1>] [<ffffffff813700d1>] rb_next+0x1/0x50
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.478304] RSP: 0018:ffff880803c0de20 EFLAGS: 00010046
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.478479] RAX: 0000000000000000 RBX: ffff880036b3ec00 RCX: 0000000000000cd2
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.478712] RDX: 0000000002cc7842 RSI: ffff880036b3d000 RDI: 0000000000000010
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.478947] RBP: ffff880803c0de68 R08: ffff8807d4cabe00 R09: 0000000000000018
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.479181] R10: 0000000000000415 R11: 00000000000004f8 R12: 0000000000000000
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.479416] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000b71b00
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.479651] FS: 0000000000000000(0000) GS:ffff88080fac0000(0000) knlGS:0000000000000000
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.479916] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.480106] CR2: 0000000000000010 CR3: 0000000001c0e000 CR4: 00000000001407e0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.480338] Stack:
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.480404] ffff880803c0de68 ffffffff810a2cc2 000000000000d160 ffff88080fad3180
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.480661] ffff881003e81c30 ffff88080fad3180 000000000000000c 0000000000000000
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.480913] ffff880803c0dfd8 ffff880803c0dec8 ffffffff8172dd22 ffff881003e81800
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.481167] Call Trace:
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.481250] [<ffffffff810a2cc2>] ? pick_next_task_fair+0x102/0x1b0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.481453] [<ffffffff8172dd22>] __schedule+0x142/0x7f0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.481624] [<ffffffff8172e909>] schedule_preempt_disabled+0x29/0x70
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.481830] [<ffffffff810c1d88>] cpu_startup_entry+0x268/0x2b0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.482015] [<ffffffff8104278d>] start_secondary+0x21d/0x2d0
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.482192] Code: e5 48 85 c0 75 07 eb 19 66 90 48 89 d0 48 8b 50 10 48 85 d2 75 f4 48 8b 50 08 48 85 d2 75 eb 5d c3 31 c0 5d c3 0f 1f 44 00 00 55 <48> 8b 17 48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.483043] RIP [<ffffffff813700d1>] rb_next+0x1/0x50
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.483204] RSP <ffff880803c0de20>
Nov 14 09:44:10 node002.cassandra.hdp kernel: [318591.483318] CR2: 0000000000000010
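For context, enabling CGroups through Ambari boils down to a handful of yarn-site.xml properties on the NodeManagers; the sketch below shows what we believe gets set (the property names are the standard YARN ones, the values are illustrative rather than copied from our cluster):

yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
yarn.nodemanager.linux-container-executor.group = hadoop
yarn.nodemanager.linux-container-executor.resources-handler.class = org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
yarn.nodemanager.linux-container-executor.cgroups.hierarchy = /yarn
yarn.nodemanager.linux-container-executor.cgroups.mount = true
yarn.nodemanager.linux-container-executor.cgroups.mount-path = /cgroup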
10-03-2016
03:26 PM
OK 🙂 Why is it that when I change the default engine to MR in HDP, I no longer see the third application (Tez)? Is the Oozie container with Tez created only when the default engine is Tez, even if I do not use it? Thanks.
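For clarity, the "default engine" I am switching is the usual Hive property (shown here just to be explicit about what I changed; the per-session form is equivalent):

hive.execution.engine = mr      (in hive-site.xml, instead of tez)
set hive.execution.engine=mr;   (per session, from the Hive CLI / Beeline)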
06-16-2016
09:12 AM
@Sagar Shimpi Errors in ambari-server.log:
2016-06-08 04:28:33,369 [CRITICAL] [YARN] [yarn_nodemanager_webui] (NodeManager Web UI) Connection failed to http://node8.mapreduce:8042 (timed out)
2016-06-08 04:28:33,371 [CRITICAL] [YARN] [yarn_nodemanager_health] (NodeManager Health) Connection failed to http://node8.mapreduce:8042/ws/v1/node/info (Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 165, in execute
url_response = urllib2.urlopen(query, timeout=connection_timeout)
File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1187, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib/python2.7/httplib.py", line 1051, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 415, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
timeout: timed out
)
2016-06-08 04:29:23,922 [OK] [YARN] [yarn_nodemanager_webui] (NodeManager Web UI) HTTP 200 response in 0.002s
2016-06-08 04:29:23,923 [OK] [YARN] [yarn_nodemanager_health] (NodeManager Health) NodeManager Healthy
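The alert itself is just an HTTP GET against the NodeManager REST endpoint, so a quick way to reproduce the failure by hand from the Ambari agent host is a call like the one alert_nodemanager_health.py makes (Python 2 / urllib2 as in the traceback above; the 5-second timeout here is arbitrary):

# Reproduce the NodeManager health check that the Ambari alert performs.
import urllib2

URL = "http://node8.mapreduce:8042/ws/v1/node/info"
try:
    # Plain GET with a timeout, mirroring what the alert script does
    response = urllib2.urlopen(URL, timeout=5)
    print("HTTP %s" % response.getcode())
    print(response.read()[:200])  # first bytes of the node info JSON
except Exception as e:
    print("Request failed: %s" % e)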