Huge log files: Exception while processing show databases caused by Too many open files
Labels: Apache Hive
Created ‎07-26-2016 08:38 AM
Hi,
I am using HDP-2.4.0.0-169 on Ubuntu 14.04, and HiveServer2 is now producing about 60 GB of log files every day. Before the error appeared, the log files were only about 2 MB.
The error that appears is:
2016-07-26 09:29:12,972 ERROR [HiveServer2-Handler-Pool: Thread-56]: exec.DDLTask (DDLTask.java:failed(525)) - org.apache.hadoop.hive.ql.metadata.HiveException: Exception while processing show databases
    at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2277)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:390)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1720)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1477)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1254)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1118)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:183)
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:400)
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy20.executeStatement(Unknown Source)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:261)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: /tmp/hive/716ecb33-5f8b-4787-baee-7e369e56d006/hive_2016-07-26_09-29-12_951_5521987407554595007-2/-local-10000 (Too many open files)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:222)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:305)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:293)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:326)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:393)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2271)
    ... 35 more
I have been troubleshooting but have not found the root cause yet. Can you please advise how to solve this issue?
Thank you.
Created ‎07-28-2016 02:20 AM
You need to increase the open-file limit (ulimit) at the OS level. Most likely you have some tables with many partitions, plus processes that access them, and together they exhaust the available file descriptors. You will need to change the ulimit on all nodes and restart your servers, which requires downtime. It is good practice to set this up front, based on an estimate of how many file descriptors the cluster workload will need.
Also section 1.2.8 here.
I cannot tell you a magic number: it depends on what you do and on what resources the servers can provide. I have seen ulimit set anywhere from tens of thousands to hundreds of thousands. The minimum requirement for installing Hortonworks Data Platform is 10,000. Try a few values.
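Before picking a number, it may help to see how many descriptors the HiveServer2 process actually holds and which limit that process is running with, since a changed limit normally applies only to processes started afterwards. The following is a minimal Python sketch, assuming a Linux host with /proc mounted. The function names are made up for illustration, and reading another user's process requires running as that user or as root.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def parse_max_open_files(limits_text):
    """Extract (soft, hard) from the text of /proc/<pid>/limits.

    Values are returned as strings because the kernel may print 'unlimited'.
    Returns None if the 'Max open files' row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: "Max open files  <soft>  <hard>  files"
            return fields[3], fields[4]
    return None


def count_open_fds(pid="self"):
    """Count the file descriptors currently open in a process (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid="self"):
    """Print open descriptor count and limits for the given pid."""
    with open(f"/proc/{pid}/limits") as f:
        limits = parse_max_open_files(f.read())
    print(f"pid={pid} open_fds={count_open_fds(pid)} limits(soft, hard)={limits}")


if __name__ == "__main__":
    # Hypothetical usage: replace "self" with the HiveServer2 pid.
    report()
```

Calling `report(<pid>)` with the HiveServer2 process ID a few times during the day shows whether the descriptor count climbs steadily toward the limit (which would point to a leak rather than to a limit that is simply too low) or only peaks under load.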
If this response helps, please vote/accept it as the best answer.
