

Hive occasionally saves to 000000_1000 - why?

Contributor

Hi folks,

I am saving a Hive query's output to a file - usually the output file is 000000_0. I left a coordinator running which performed the task several times. At some point it crashed because the output was written to 000000_1000 instead. Why is that? Of course, the next action that looks for the file is killed because the expected file doesn't exist.
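
A minimal sketch of the kind of action I mean (directory and table names below are placeholders, not the real ones):

-- Hive action launched by the Oozie coordinator (simplified, hypothetical names)
INSERT OVERWRITE DIRECTORY '/user/ken/output/daily_export'
SELECT id, value
FROM my_db.my_table;

The follow-up action then expects to read /user/ken/output/daily_export/000000_0 from that directory.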

I cannot explain the behaviour. Does it coincide with a MapReduce problem?

How can I prevent it from happening?

Thanks for your help!

Ken

11 REPLIES

Re: Hive occasionally saves to 000000_1000 - why?

Guru

These files are parts of the parallel processing output on Hadoop. E.g. if a file name starts with part_m it was produced by a mapper, and if it starts with part_r it was produced by a reducer. Each file part is given a _serialNumber suffix. You can process the results by referencing the path of the parent directory, which will union all file parts into one data set.
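
For illustration, instead of pointing a downstream step at one specific part file, you can put an external table over the output directory and let Hive read whatever parts are in it (directory, schema, and delimiter below are only an example, not taken from your job):

-- External table over the whole output directory (hypothetical path and columns)
CREATE EXTERNAL TABLE query_result (id STRING, value STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'   -- Hive's default field delimiter for text output
LOCATION '/user/ken/output/daily_export';

-- Reads 000000_0, 000000_1000, or whatever file parts exist in the directory
SELECT * FROM query_result;

That way the serial number of the part file no longer matters to the consumer.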

I am curious whether you happened to crash when it hit _1000, or whether there is a configuration that limits the number of file parts that can be created. @all Anyone want to weigh in on that?

Re: Hive occasionally saves to 000000_1000 - why?

Super Guru

How are you running Hive? ORC? Tez? What data? Table DDL?

Re: Hive occasionally saves to 000000_1000 - why?

Super Guru

http://stackoverflow.com/questions/8536066/hive-create-table-filename-000000-0

Are there any error messages?

Are you getting the results you want?

Re: Hive occasionally saves to 000000_1000 - why?

Contributor

Hi and thanks for your responses. The action runs smoothly and is NOT killed.

The query is also a simple Hive select which should not need a huge number of mappers and reducers.

What's your opinion on this? Any guidance on what I can look for in the log files?

Thanks!

Re: Hive occasionally saves to 000000_1000 - why?

Guru

Post your log files (YARN, Hive) for the job and some other details (data size, file format, cluster size) and we will all figure this out together.

Re: Hive occasionally saves to 000000_1000 - why?

Contributor

Thanks Greg. I will try to reproduce the error and post additional information here!

Re: Hive occasionally saves to 000000_1000 - why?

Contributor

Could it be associated with the parameter hive.exec.max.dynamic.partitions?

Re: Hive occasionally saves to 000000_1000 - why?

Guru

Could be ... the default value is 1000. Taking a look at the logs would be a step closer to proving this.
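
In the meantime you could check and, per session, raise the limits before the insert runs, e.g. (the values below are only an example):

SET hive.exec.max.dynamic.partitions;              -- show the current value (default 1000)
SET hive.exec.max.dynamic.partitions=5000;         -- raise the job-wide limit for this session
SET hive.exec.max.dynamic.partitions.pernode=2000; -- the per-node limit may need raising as well

That should at least help rule the setting in or out.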

Re: Hive occasionally saves to 000000_1000 - why?

Contributor

Hi Greg, I peeked into the logs of three jobs regarding this problem, namely the Oozie launcher, the Hive job itself, and the 'insert overwrite directory' job. I searched for errors as well as the value 1000 and came up with the following lines. Does this help to pinpoint the problem somehow? Or do you have any suggestion for what else I can search for?

Thanks and kind regards,

Ken

Oozie Launcher for Hive Job

Oozie Launcher ends

60715 [main] -5p org.apache.hadoop.mapred.Task  - Task:attempt_1481914266056_2012_m_000000_1000 is done. And is in the process of committing
60758 [main] -5p org.apache.hadoop.mapred.Task  - Task attempt_1481914266056_2012_m_000000_1000 is allowed to commit now
60764 [main] -5p org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter  - Saved output of task 'attempt_1481914266056_2012_m_000000_1000' to hdfs://xxxxxx
60809 [main] -5p org.apache.hadoop.mapred.Task  - Task 'attempt_1481914266056_2012_m_000000_1000' done.





Hive configuration Properties
-----------------------------
hive.txn.max.open.batch=1000
dfs.namenode.inotify.max.events.per.rpc=1000
mapreduce.reduce.merge.inmem.threshold=1000
mapreduce.client.progressmonitor.pollinterval=1000
mapreduce.jobtracker.retiredjobs.cache.size=1000
hive.compactor.abortedtxn.threshold=1000

Hive Job

- no notable occurrences of 1000 and no errors -

Insert Overwrite Directory Job

Logtype:syslog
2016-12-19 13:18:00,109 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1481914266056_2014_m_000002_1000 is done. And is in the process of committing
2016-12-19 13:18:00,198 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1481914266056_2014_m_000002_1000' done.


2016-12-19 13:17:50,619 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1481914266056_2014_m_000000_1000 TaskAttempt Transitioned from NEW to UNASSIGNED
2016-12-19 13:17:50,620 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1481914266056_2014_m_000001_1000 TaskAttempt Transitioned from NEW to UNASSIGNED
2016-12-19 13:17:50,620 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1481914266056_2014_m_000002_1000 TaskAttempt Transitioned from NEW to UNASSIGNED
2016-12-19 13:17:50,620 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1481914266056_2014_r_000000_1000 TaskAttempt Transitioned from NEW to UNASSIGNED

2016-12-19 13:18:11,184 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://XXX/.hive-staging_hive_2016-12-19_13-17-41_137_2262959442612945763-1/_tmp.-ext-10002/000000_1000
2016-12-19 13:18:11,184 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://XXX/.hive-staging_hive_2016-12-19_13-17-41_137_2262959442612945763-1/_task_tmp.-ext-10002/_tmp.000000_1000
2016-12-19 13:18:11,184 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://XXX/.hive-staging_hive_2016-12-19_13-17-41_137_2262959442612945763-1/_tmp.-ext-10002/000000_1000
2016-12-19 13:18:11,194 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
        at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
        at org.apache.hadoop.ipc.Client.call(Client.java:1449)
        at org.apache.hadoop.ipc.Client.call(Client.java:1396)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:816)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
        at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2158)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1423)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1419)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1443)
        at org.apache.hadoop.fs.FileSystem.deleteOnExit(FileSystem.java:1398)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:619)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:570)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:679)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2016-12-19 13:18:11,296 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 1
2016-12-19 13:18:11,297 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[1]: records written - 4
2016-12-19 13:18:11,446 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: RECORDS_OUT_1:4,
2016-12-19 13:18:11,447 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1481914266056_2014_r_000000_1000 is done. And is in the process of committing
2016-12-19 13:18:11,520 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1481914266056_2014_r_000000_1000' done.

End of LogType:syslog

LogType:syslog.shuffle
Log Upload Time:Mon Dec 19 13:18:18 +0100 2016
LogLength:6454
Log Contents:
2016-12-19 13:18:10,616 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=12180048896, maxSingleShuffleLimit=3045012224, mergeThreshold=8038832640, ioSortFactor=100, memToMemMergeOutputsThreshold=100
2016-12-19 13:18:10,618 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1481914266056_2014_r_000000_1000 Thread started: EventFetcher for fetching Map Completion Events
2016-12-19 13:18:10,624 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Ignoring obsolete output of FAILED map-task: 'attempt_1481914266056_2014_m_000001_1000'
2016-12-19 13:18:10,624 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Ignoring obsolete output of FAILED map-task: 'attempt_1481914266056_2014_m_000000_1000'
2016-12-19 13:18:10,625 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning XXX with 1 to fetcher#7
2016-12-19 13:18:10,626 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to CCC to fetcher#7
2016-12-19 13:18:10,626 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning XXX with 1 to fetcher#11
2016-12-19 13:18:10,626 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to XXX to fetcher#11
2016-12-19 13:18:10,626 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1481914266056_2014_r_000000_1000: Got 3 new map-outputs
2016-12-19 13:18:10,626 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning XXX with 1 to fetcher#13
2016-12-19 13:18:10,627 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 1 of 1 to XXX to fetcher#13
2016-12-19 13:18:10,714 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1481914266056_2014&reduce=0&map=attempt_1481914266056_2014_m_000001_1001 sent hash and received reply
2016-12-19 13:18:10,714 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1481914266056_2014&reduce=0&map=attempt_1481914266056_2014_m_000000_1001 sent hash and received reply
2016-12-19 13:18:10,714 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.Fetcher: for url=13562/mapOutput?job=job_1481914266056_2014&reduce=0&map=attempt_1481914266056_2014_m_000002_1000 sent hash and received reply
2016-12-19 13:18:10,718 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#7 about to shuffle output of map attempt_1481914266056_2014_m_000002_1000 decomp: 150 len: 154 to MEMORY
2016-12-19 13:18:10,718 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#13 about to shuffle output of map attempt_1481914266056_2014_m_000001_1001 decomp: 597 len: 601 to MEMORY
2016-12-19 13:18:10,718 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#11 about to shuffle output of map attempt_1481914266056_2014_m_000000_1001 decomp: 1185 len: 1189 to MEMORY
2016-12-19 13:18:10,720 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 150 bytes from map-output for attempt_1481914266056_2014_m_000002_1000
2016-12-19 13:18:10,720 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 597 bytes from map-output for attempt_1481914266056_2014_m_000001_1001
2016-12-19 13:18:10,720 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 1185 bytes from map-output for attempt_1481914266056_2014_m_000000_1001
2016-12-19 13:18:10,721 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 150, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->1932
2016-12-19 13:18:10,722 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1185, inMemoryMapOutputs.size() -> 2, commitMemory -> 150, usedMemory ->1932
2016-12-19 13:18:10,722 INFO [fetcher#7] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: XXX freed by fetcher#7 in 97ms
2016-12-19 13:18:10,722 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 597, inMemoryMapOutputs.size() -> 3, commitMemory -> 1335, usedMemory ->1932
2016-12-19 13:18:10,723 INFO [fetcher#13] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: XXX freed by fetcher#13 in 96ms
2016-12-19 13:18:10,723 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: EventFetcher is interrupted.. Returning
2016-12-19 13:18:10,723 INFO [fetcher#11] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: XXX freed by fetcher#11 in 98ms
2016-12-19 13:18:10,729 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: finalMerge called with 3 in-memory map-outputs and 0 on-disk map-outputs
2016-12-19 13:18:10,734 INFO [main] org.apache.hadoop.mapred.Merger: Merging 3 sorted segments
2016-12-19 13:18:10,734 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 594 bytes
2016-12-19 13:18:10,739 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merged 3 segments, 1932 bytes to disk to satisfy reduce memory limit
2016-12-19 13:18:10,739 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 1 files, 1932 bytes from disk
2016-12-19 13:18:10,740 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
2016-12-19 13:18:10,740 INFO [main] org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2016-12-19 13:18:10,742 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1337 bytes

End of LogType:syslog.shuffle