Member since 11-23-2016
8 Posts
0 Kudos Received
3 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3863 | 01-29-2019 12:31 AM
 | 1464 | 06-27-2017 08:55 AM
 | 1786 | 06-27-2017 01:54 AM
01-29-2019 12:31 AM
We solved the issue. It looks like a ulimit-related problem. We did three things:
1- We raised the user limits under /etc/security/limits.d/
2- We created the file /etc/systemd/system/cloudera-scm-agent.service.d/override.conf to override the service-level limits
3- We raised the live value with echo "65536" > /sys/fs/cgroup/pids/system.slice/cloudera-scm-agent.service/pids.max (instead of rebooting).
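For anyone hitting the same error, here is a minimal Python sketch for checking how close the agent's cgroup is to its task limit. The cgroup path is the one from the post; the pids.current file sitting next to pids.max is standard for the cgroup v1 pids controller. The render_override helper is hypothetical: the post does not say what went into override.conf, and TasksMax is only my assumption about the systemd directive that maps to pids.max.

```python
from pathlib import Path

# Path taken from the post; a cgroup v1 "pids" controller layout is assumed.
AGENT_CGROUP = Path("/sys/fs/cgroup/pids/system.slice/cloudera-scm-agent.service")


def read_pids_value(path):
    """Return the integer stored in a pids.* file, or None if it says 'max' (no limit)."""
    text = Path(path).read_text().strip()
    return None if text == "max" else int(text)


def pids_headroom(cgroup_dir=AGENT_CGROUP):
    """Return (current, limit, remaining); limit and remaining are None when unlimited."""
    cgroup_dir = Path(cgroup_dir)
    current = read_pids_value(cgroup_dir / "pids.current")
    limit = read_pids_value(cgroup_dir / "pids.max")
    remaining = None if limit is None else limit - current
    return current, limit, remaining


def render_override(tasks_max=65536):
    """Render a hypothetical drop-in body. TasksMax is an assumption, not quoted from the post."""
    return "[Service]\nTasksMax={}\n".format(tasks_max)
```

Calling pids_headroom() on the affected host while the NameNode is starting should show whether the number of tasks is running into the limit. A drop-in like the one rendered by render_override() would normally only take effect after systemctl daemon-reload and a restart of the service.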
01-25-2019 08:52 AM
I've installed CDH-5.16.1-1.cdh5.16.1.p0.3 on SLES 12.3 in "single user mode". I have these services running:
- cloudera-scm-server
- cloudera-scm-agent
- Cloudera Management Service (Alert Publisher, Event Server, Host Monitor, Service Monitor)
- ZooKeeper

When I try to start HDFS, the Secondary NameNode and the DataNode seem OK, but the NameNode fails with this error:

2019-01-25 17:21:52,016 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.hadoop.ipc.Server.start(Server.java:2696)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.start(NameNodeRpcServer.java:448)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)

I immediately suspected a ulimit-related issue, so I modified the limits:

cloudera-scm soft nofile 32768
cloudera-scm hard nofile 1048576
cloudera-scm soft nproc 127812
cloudera-scm hard nproc unlimited
cloudera-scm soft memlock unlimited
cloudera-scm hard memlock unlimited

Nothing changed. Someone suggested changing the systemd-related limits, so we raised the service-level limits. The latest run used very high values, but the NameNode still fails:

> ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127812
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024680
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1024360
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

# prlimit -p 100642
RESOURCE DESCRIPTION SOFT HARD UNITS
AS address space limit unlimited unlimited bytes
CORE max core file size 0 unlimited bytes
CPU CPU time unlimited unlimited seconds
DATA max data size unlimited unlimited bytes
FSIZE max file size unlimited unlimited bytes
LOCKS max number of file locks held unlimited unlimited locks
MEMLOCK max locked-in-memory address space unlimited unlimited bytes
MSGQUEUE max bytes in POSIX mqueues 819200 819200 bytes
NICE max nice prio allowed to raise 0 0
NOFILE max number of open files 1024680 1024680 files
NPROC max number of processes 1024360 unlimited processes
RSS max resident set size unlimited unlimited bytes
RTPRIO max real-time priority 0 0
RTTIME timeout for real-time tasks unlimited unlimited microsecs
SIGPENDING max number of pending signals 127812 127812 signals
STACK max stack size 8388608 unlimited bytes

I also raised the NameNode log level to TRACE, but the logs are pretty clean:

2019-01-25 17:21:52,016 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 21 on 8020: starting
2019-01-25 17:21:52,016 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 22 on 8020: starting
2019-01-25 17:21:52,016 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at org.apache.hadoop.ipc.Server.start(Server.java:2696)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.start(NameNodeRpcServer.java:448)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
2019-01-25 17:21:52,017 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 20 on 8020: starting
2019-01-25 17:21:52,019 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

Any suggestion? Thanks

SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 3
# This file is deprecated and will be removed in a future service pack or release.
# Please check /etc/os-release for details about this release.
NAME="SLES"
VERSION="12-SP3"
VERSION_ID="12.3"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp3"

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 10
On-line CPU(s) list: 0-9
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 10
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 37
Model name: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Stepping: 1
CPU MHz: 2300.000
BogoMIPS: 4600.00
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-4
NUMA node1 CPU(s): 5-9
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes hypervisor lahf_lm arat retpoline kaiser tsc_adjust
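
The per-process limits printed by prlimit above can also be read from /proc/<pid>/limits, and the live thread count from the Threads: line in /proc/<pid>/status. The following is a minimal Python sketch for putting the two side by side; the function names are mine, and it only covers the rlimits of a single process, not any limit applied to the service as a whole.

```python
import re
from pathlib import Path


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are padded with runs of spaces; limit names contain single spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def thread_count(status_text):
    """Return the 'Threads:' value from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no Threads: line found")


def inspect_process(pid):
    """Return (threads, (soft, hard) for 'Max processes') for a running process."""
    proc = Path("/proc") / str(pid)
    limits = parse_limits((proc / "limits").read_text())
    threads = thread_count((proc / "status").read_text())
    return threads, limits.get("Max processes")
```

For example, inspect_process(100642) would report the thread count of the process inspected with prlimit above next to its NPROC limit, assuming that process is still running.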
Labels:
- Cloudera Manager
- HDFS
06-27-2017 08:55 AM
We engaged the vendor's professional support and learned the following:
1- There is an internal thread noting that the HBase REST API documentation needs enhancement.
2- It is possible to use this endpoint, but it is not safe unless you control the clients that call the service. Because it can cause an OOM, it is better (effectively mandatory) to use Kerberos authentication for the HBase REST API.
06-27-2017 01:54 AM
We engaged the vendor's professional support and learned the following:
1- There is an internal thread noting that the HBase REST API documentation needs enhancement.
2- It is possible to use this endpoint, but it is not safe unless you control the clients that call the service. Because it can cause an OOM, it is better (effectively mandatory) to use Kerberos authentication for the HBase REST API.
06-13-2017 09:17 AM
Hi,

In our cluster we would like to retrieve rows from HBase in the simplest available way using a REST API. To retrieve multiple rows with one REST call (GET) we can use the "Globbing Rows" REST API documented here -> https://hbase.apache.org/1.2/book.html#_rest in this way:

http://hbase_url:hbase_port/namespace:table/keywithwildcard*/[...]

We discovered an undocumented behaviour. We can call the globbing REST API with startrow,endrow parameters in this way:

http://hbase_url:hbase_port/namespace:table/startrow,endrow/[...]

In this case it works exactly like a scan (and it breaks the REST server if you try to retrieve too many rows). We also looked at this HBase source code mirror (https://github.com/apache/hbase/blob/master/hbase-rest/src/main/java/org/apache/hadoop/hbase/rest/Ro...) and noticed that the code is intended to work like this (function private int parseRowKeys(final String path, int i), lines 65-110).

Why is this behaviour not documented? Is it safe to use this REST call (with restrictions) instead of a scanner, or does the absence of documentation mean that the function may change without any communication (not even a deprecation notice)?
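
To make the URL shape concrete, here is a minimal Python sketch that builds the startrow,endrow form of the GET request described above. The function name, the port 8080 in the usage note, and the optional column list are illustrative assumptions; percent-encoding the keys assumes the REST server decodes them, which I have not verified for every version.

```python
from urllib.parse import quote


def row_range_url(base_url, table, start_row, end_row, columns=None):
    """Build a GET URL of the form <base>/<table>/<startrow>,<endrow>[/<columns>]."""
    # Percent-encode each key so reserved characters survive in the URL path.
    row_part = quote(start_row, safe="") + "," + quote(end_row, safe="")
    url = "{}/{}/{}".format(base_url.rstrip("/"), quote(table, safe=":"), row_part)
    if columns:
        # Columns are written as family:qualifier, separated by commas.
        url += "/" + ",".join(quote(c, safe=":") for c in columns)
    return url
```

For example, row_range_url("http://hbase_url:8080", "namespace:table", "row001", "row100") gives http://hbase_url:8080/namespace:table/row001,row100. Keeping the range between the two keys small is the only bound this form offers, which is why it is risky when the caller is not under your control.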
Labels:
- Apache HBase
05-31-2017 03:51 AM
Hi,

In our cluster (CM 5.8.4 parcel 5.8.4-1.cdh5.8.4.p0.5, HBase 1.2.0-cdh5.8.4) we would like to retrieve rows from HBase in the simplest available way using a REST API. To retrieve multiple rows with one REST call (GET) we can use the "Globbing Rows" REST API documented here -> https://hbase.apache.org/1.2/book.html#_rest in this way:

http://hbase_url:hbase_port/namespace:table/keywithwildcard*/[...]

(example from the documentation: http://example.com:8000/urls/https|ad.doubleclick.net|*)

We discovered an undocumented behaviour. We can call the globbing REST API with startrow,endrow parameters in this way:

http://hbase_url:hbase_port/namespace:table/startrow,endrow/[...]

In this case it works exactly like a scan (and it breaks the REST server if you try to retrieve too many rows). We also looked at this HBase source code mirror (https://github.com/apache/hbase/blob/master/hbase-rest/src/main/java/org/apache/hadoop/hbase/rest/RowSpec.java) and noticed that the code is intended to work like this (function private int parseRowKeys(final String path, int i), lines 65-110).

Why is this behaviour not documented? Is it safe to use this REST call (with restrictions) instead of a scanner, or does the absence of documentation mean that the function may change without any communication (not even a deprecation notice)?
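
For comparison, the scanner alternative mentioned in the question is the stateful scanner resource from the same REST chapter: a scanner is created by sending an XML body to /namespace:table/scanner, and each later GET on the returned scanner location yields at most one batch of cells. Below is a minimal Python sketch that only builds that XML body. The function name and the default batch size are hypothetical, and the base64-encoded startRow and endRow attributes reflect my understanding of the scanner model rather than anything stated in the post.

```python
import base64
from xml.sax.saxutils import quoteattr


def scanner_xml(start_row, end_row, batch=100):
    """Build a <Scanner/> request body that bounds each fetch to `batch` cells."""

    def b64(key):
        # Row keys are binary in HBase, so the XML form carries them base64-encoded.
        return base64.b64encode(key.encode("utf-8")).decode("ascii")

    return "<Scanner batch={} startRow={} endRow={}/>".format(
        quoteattr(str(batch)), quoteattr(b64(start_row)), quoteattr(b64(end_row))
    )
```

The point of the design is that the batch attribute caps how much the REST server has to hold per response, which the single startrow,endrow GET cannot do.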
Labels:
- Apache HBase