Support Questions

Increase open file limit of the user to scale for large data processing. ulimit and nofile

Solved

Increase open file limit of the user to scale for large data processing. ulimit and nofile

New Contributor

How can I increase the open file limit (nofile) for the service users to scale for large data processing: Hive, HBase, HDFS, Oozie, YARN, MapReduce, ZooKeeper, Spark, HCat?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Increase open file limit of the user to scale for large data processing. ulimit and nofile

New Contributor

Here is the solution...

1. Services - Hive, HBase, HDFS, Oozie, YARN, MapReduce, Ambari Metrics

For these services, the file limit can be changed directly from the Ambari UI (a verification sketch follows the examples below).

Ambari UI > <Service> > Configs > Advanced > Advanced <service>-env > <service user>_user_nofile_limit
Example: 1. Ambari UI > Hive > Configs > Advanced > Advanced hive-env > hive_user_nofile_limit  64000
         2. Ambari UI > Ambari Metrics > Configs > Advanced ams-hbase-env > max_open_files_limit  64000
         3. Ambari UI > YARN > Configs > Advanced yarn-env > yarn_user_nofile_limit  64000
         4. Ambari UI > MAPREDUCE2 > Configs > Advanced mapred-env > mapred_user_nofile_limit  64000
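
After changing a limit in Ambari and restarting the affected service, one quick way to confirm it took effect is to inspect the running daemon's limits in /proc. This check is not part of the original answer; the HiveServer2 process and the pgrep pattern are only an illustrative example.

# Find the PID of a service daemon (example: HiveServer2) and inspect its effective limits
pid=$(pgrep -f -u hive HiveServer2 | head -n1)
grep "Max open files" /proc/${pid}/limits
# Expected after the change (soft and hard limits):
# Max open files            64000                64000                files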


2. Services: ZooKeeper, Spark, WebHCat, Ranger. Users: zookeeper, spark, hcat, ranger

For the users spark, hcat, zookeeper, and ranger, add the lines below to /etc/security/limits.conf on their respective nodes.

/etc/security/limits.conf should contain the following entries (the '-' applies the value to both the soft and hard limit):

zookeeper  -    nofile    64000 
spark      -    nofile    64000
hcat       -    nofile    64000
ranger     -    nofile    64000

After saving the changes, log in as the spark, hcat, or zookeeper user and run the ulimit -a command.

Check the output; it should show open files (-n) 64000.

A sample ulimit -a output is shown below, followed by a quick way to check all four users at once.

[spark@node01]$ ulimit -a 
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 513179
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 64000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 64000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
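
To check all four users at once without logging in interactively, a small loop such as the one below can be run as root. This is a convenience sketch added here, not part of the original answer, and it relies on su applying pam_limits, as discussed next.

# Verify the effective open-file limit for each service user (run as root)
for u in zookeeper spark hcat ranger; do
    echo -n "$u: "
    su - "$u" -c 'ulimit -n'
done
# Each line should print 64000 once the limits.conf entries are in effect.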

If the ulimit -a values are still not updated, edit /etc/pam.d/su and add the line below.

vim /etc/pam.d/su
session         required        pam_limits.so
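
To confirm that the PAM change is in place, a quick grep of the file works; this check is an added convenience, not from the original answer.

# Confirm pam_limits is enabled for su sessions
grep pam_limits /etc/pam.d/su
# Expected output:
# session         required        pam_limits.so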

Then repeat the verification steps above; the new limit should take effect.

2 REPLIES



Re: Increase open file limit of the user to scale for large data processing. ulimit and nofile

Rising Star

Is there any sort of formula, or how did you come up with this value for the users' processes? Is it an arbitrary value? What can I check within my cluster in order to determine a proper value for my environment?
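
(Not part of this thread, but one common way to gauge a sensible value is to compare a user's current open file descriptors against its limit and then add headroom; the user hbase below is only an example.)

# Rough count of file descriptors currently open by a service user
lsof -u hbase 2>/dev/null | wc -l
# Effective limit for that user
su - hbase -c 'ulimit -n'
# If usage regularly approaches the limit, raise nofile with generous headroom.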