Member since: 01-09-2017
Posts: 55
Kudos Received: 14
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1760 | 09-05-2016 10:38 AM
 | 876 | 07-20-2016 08:22 AM
 | 1755 | 07-04-2016 08:13 AM
 | 530 | 06-03-2016 08:01 AM
 | 932 | 05-05-2016 12:37 PM
09-06-2016
07:53 AM
Could you please post the full stack trace of the exception? It looks like the indexer is not properly creating the label_idx column...
09-05-2016
10:38 AM
No, that code is not using cross-validation. An example of how to use cross-validation can be found here. It requires the DataFrame API, so you should refer to this for the Random Forest implementation.
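For illustration, a minimal sketch of cross-validation over a Random Forest with the DataFrame-based API; the column names "label" and "features" and the trainingData DataFrame are assumptions, not taken from the thread:
import org.apache.spark.ml.classification.RandomForestClassifier;
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.ml.param.ParamMap;
import org.apache.spark.ml.tuning.CrossValidator;
import org.apache.spark.ml.tuning.CrossValidatorModel;
import org.apache.spark.ml.tuning.ParamGridBuilder;

// Random Forest reading the assumed "label" and "features" columns
RandomForestClassifier rf = new RandomForestClassifier()
    .setLabelCol("label")
    .setFeaturesCol("features");
// Hyperparameter grid explored by the cross-validator
ParamMap[] grid = new ParamGridBuilder()
    .addGrid(rf.numTrees(), new int[] { 20, 50 })
    .build();
CrossValidator cv = new CrossValidator()
    .setEstimator(rf)
    .setEvaluator(new MulticlassClassificationEvaluator())
    .setEstimatorParamMaps(grid)
    .setNumFolds(3);
CrossValidatorModel model = cv.fit(trainingData); // trainingData: a DataFrame assumed to be in scope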
07-25-2016
02:57 PM
You could use https://spark.apache.org/docs/1.6.1/api/python/pyspark.html#pyspark.RDD.zipWithUniqueId.
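The Scala/Java API has the same method; a minimal Java sketch, assuming an existing JavaRDD:
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;

JavaRDD<String> rdd = ...; // assumed to already exist
// Pairs each element with a unique Long id; unlike zipWithIndex, it does not trigger a Spark job
JavaPairRDD<String, Long> withIds = rdd.zipWithUniqueId();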
07-20-2016
01:44 PM
The easiest way is to use the saveAsObjectFile method and read the data back through the objectFile method... You can find further details about both in the Spark documentation.
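For illustration, a minimal sketch, assuming a Serializable element type MyRecord and a hypothetical HDFS path:
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

JavaSparkContext sc = ...;   // assumed to exist
JavaRDD<MyRecord> rdd = ...; // assumed to exist; MyRecord must implement Serializable
// Writes the RDD as a SequenceFile of serialized Java objects
rdd.saveAsObjectFile("hdfs:///tmp/my-rdd");
// Reads it back; the element type is up to the caller
JavaRDD<MyRecord> restored = sc.objectFile("hdfs:///tmp/my-rdd");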
07-20-2016
08:22 AM
You can convert an org.apache.mahout.math.Vector into an org.apache.spark.mllib.linalg.Vector by using the iterateNonZero() or iterateAll() methods of org.apache.mahout.math.Vector. If your Vector is sparse, the first option is the best. In this case you can build two lists via iterateNonZero(), one containing all the non-zero indexes and the other the corresponding values; note that Vectors.sparse expects primitive int[] and double[] arrays, so the lists have to be copied into primitive arrays at the end:
ArrayList<Double> values = new ArrayList<Double>();
ArrayList<Integer> indexes = new ArrayList<Integer>();
org.apache.mahout.math.Vector v = ...;
Iterator<Element> it = v.iterateNonZero(); // Element is org.apache.mahout.math.Vector.Element
while (it.hasNext()) {
    Element e = it.next();
    values.add(e.get());
    indexes.add(e.index());
}
// Vectors.sparse takes primitive arrays, so unbox the lists first
int[] idx = new int[indexes.size()];
double[] vals = new double[values.size()];
for (int i = 0; i < idx.length; i++) {
    idx[i] = indexes.get(i);
    vals[i] = values.get(i);
}
Vectors.sparse(v.size(), idx, vals);
You can do the same thing if you have a dense Vector, using the iterateAll() method and Vectors.dense.
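A minimal sketch of the dense case, under the same assumptions (the Mahout vector v from above):
double[] dense = new double[v.size()];
Iterator<Element> all = v.iterateAll(); // iterates every position, including zeros
while (all.hasNext()) {
    Element e = all.next();
    dense[e.index()] = e.get();
}
Vectors.dense(dense);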
07-07-2016
09:56 AM
You're saying that you're changing "Enable authorization". I'm saying that you have to change "Choose authorization". They are different things. "Enable authorization" is in the Advanced tab, "Choose authorization" is in the Settings tab.
07-05-2016
07:03 AM
Because you've not done what I told you. You have to go to the Hive config tab in Ambari, open the Settings tab (not Advanced), and change "Choose authorization" to "None" (instead of Ranger) in the Security area.
07-04-2016
08:13 AM
1 Kudo
The issue is related to the fact that Ranger is being used for authorization. You just need to go to the Hive config tab in Ambari, select None as Authorization in the Security section, and restart Hive.
06-30-2016
07:08 AM
Very interesting, but I think it would have been an even fairer comparison if you had used the SparkSQL CSV reader from Databricks to read the file for the DataFrame and SparkSQL tests; otherwise there is the overhead of converting the RDD to a DataFrame...
06-28-2016
07:11 AM
What is the problem with using a local file? Indeed, that is what you have to do... There is no reason to specify the path of the file on HDFS.
06-27-2016
07:22 AM
Maybe it's not the only issue, but you have to specify an alias for the subquery, e.g. SELECT * FROM (SELECT ...) AS t. Try that and let us know whether you hit other issues or the same error remains...
06-27-2016
07:18 AM
You can simply use spark-submit, which is in the bin folder of your spark-client installation. Here you can find the documentation for it: http://spark.apache.org/docs/latest/submitting-applications.html
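For example, a minimal invocation sketch (the class, master, and jar names are placeholders, not from the thread):
./bin/spark-submit --class com.example.MyApp --master yarn-client my-app.jar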
06-01-2016
10:33 AM
1 Kudo
If you just run rm, you're actually moving your data to the Trash. To remove the data from HDFS and actually free the space, you have to add the -skipTrash flag to the rm command.
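For example, a sketch with a hypothetical path (-r is only needed for directories):
hdfs dfs -rm -r -skipTrash /path/to/data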
In order to delete the data from the trash, you can run:
hdfs dfs -expunge
05-11-2016
08:49 PM
1 Kudo
With Hive, what you can do is read those fields as strings and validate them through a regexp. Otherwise, if you are sure that you don't have NULL values, you can simply define your schema: then, if an int field turns out NULL, it means the value was not properly formatted.
05-05-2016
12:37 PM
1 Kudo
In an HDP distribution it should be located in /usr/hdp/current/hadoop-mapreduce-client/
04-29-2016
01:52 PM
1 Kudo
That INFO only states that there was no Tez session available, so a new one had to be created. This is not an issue: the only downside is that the query takes a bit longer to start because the resources have to be allocated. The real problem is more likely the Hive View, which is not properly showing you the results. If you run the same query via beeline on the command line, you will see all the results.
04-22-2016
02:10 PM
I think that the best option for compiling Scala Spark code is to use sbt, which is a tool for managing dependencies. You can do the same with Maven anyway, as you prefer.
04-20-2016
09:22 AM
1 Kudo
It turned out that the problem was caused by a join with a subquery, which made the data unevenly distributed among the partitions. I don't actually know why this happens, but we solved it by materializing the subquery. Thank you for the support.
04-15-2016
12:21 PM
How can I tell you that? I only see the stages, but I don't know what they refer to or what they are doing... Anyway, I believe it is writing the result into ORC files...
04-15-2016
09:28 AM
But I am having trouble writing data, and in that thread Zlib is said to be slower at writing than SNAPPY, so this should get even worse...
04-15-2016
07:36 AM
Hi everybody, we are having trouble writing to an ORC partitioned table with SNAPPY compression via the SparkSQL Thrift Server. We are using 40 executors with 16 GB of memory each, and we have set the default parallelism to 1000. Our query writing into the table is a very complex INSERT OVERWRITE. We are using HDP 2.3.2 with Hive 1.2.1 and Spark 1.6.0. The query is rather fast until the last two stages (let's say about 10 minutes): then there is a stage running for about 15-20 minutes, and in the last one some very strange things happen. All the tasks (999/1000) finish in a few seconds, except one which lasted 13.8 hours. Yet in the end the job reports success. Moreover, the table folder contains only staging folders and no partition folders. The log for this container ends like this:
16/04/14 18:26:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/04/14 18:26:01 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
16/04/14 18:50:44 INFO orc.WriterImpl: Padding ORC by 3109869 bytes (<= 0.05 * 67108864)
16/04/14 19:15:06 INFO orc.WriterImpl: Padding ORC by 3347129 bytes (<= 0.05 * 67108864)
16/04/14 19:40:12 INFO orc.WriterImpl: Padding ORC by 3209596 bytes (<= 0.05 * 67108864)
16/04/14 20:01:41 INFO orc.WriterImpl: Padding ORC by 3217484 bytes (<= 0.05 * 67108864)
16/04/14 20:04:50 INFO orc.WriterImpl: Padding ORC by 3233888 bytes (<= 0.05 * 67108864)
16/04/14 20:29:34 INFO orc.WriterImpl: Padding ORC by 2675806 bytes (<= 0.04 * 67108864)
16/04/14 20:53:55 INFO orc.WriterImpl: Padding ORC by 2860597 bytes (<= 0.04 * 67108864)
16/04/14 21:19:11 INFO orc.WriterImpl: Padding ORC by 2783728 bytes (<= 0.04 * 67108864)
16/04/14 21:36:41 INFO orc.WriterImpl: Padding ORC by 3045962 bytes (<= 0.05 * 67108864)
16/04/14 21:44:54 INFO orc.WriterImpl: Padding ORC by 2507885 bytes (<= 0.04 * 67108864)
16/04/14 22:01:53 INFO orc.WriterImpl: Padding ORC by 2991017 bytes (<= 0.04 * 67108864)
16/04/14 22:10:37 INFO orc.WriterImpl: Padding ORC by 3161339 bytes (<= 0.05 * 67108864)
16/04/14 22:35:04 INFO orc.WriterImpl: Padding ORC by 3051066 bytes (<= 0.05 * 67108864)
16/04/14 22:58:41 INFO orc.WriterImpl: Padding ORC by 3056267 bytes (<= 0.05 * 67108864)
16/04/14 23:10:14 INFO orc.WriterImpl: Padding ORC by 2996341 bytes (<= 0.04 * 67108864)
16/04/14 23:22:57 INFO orc.WriterImpl: Padding ORC by 3022895 bytes (<= 0.05 * 67108864)
16/04/14 23:40:37 INFO orc.WriterImpl: Padding ORC by 3138785 bytes (<= 0.05 * 67108864)
16/04/14 23:45:33 INFO orc.WriterImpl: Padding ORC by 2990566 bytes (<= 0.04 * 67108864)
16/04/14 23:47:00 INFO orc.WriterImpl: Padding ORC by 3237637 bytes (<= 0.05 * 67108864)
16/04/14 23:47:46 INFO orc.WriterImpl: Padding ORC by 3346376 bytes (<= 0.05 * 67108864)
16/04/15 00:11:08 INFO orc.WriterImpl: Padding ORC by 3276991 bytes (<= 0.05 * 67108864)
16/04/15 00:12:18 INFO orc.WriterImpl: Padding ORC by 3223207 bytes (<= 0.05 * 67108864)
16/04/15 00:23:20 INFO orc.WriterImpl: Padding ORC by 3246542 bytes (<= 0.05 * 67108864)
16/04/15 00:36:00 INFO orc.WriterImpl: Padding ORC by 2724009 bytes (<= 0.04 * 67108864)
16/04/15 00:42:53 INFO orc.WriterImpl: Padding ORC by 2986568 bytes (<= 0.04 * 67108864)
16/04/15 01:00:45 INFO orc.WriterImpl: Padding ORC by 2469504 bytes (<= 0.04 * 67108864)
16/04/15 01:25:44 INFO orc.WriterImpl: Padding ORC by 2918348 bytes (<= 0.04 * 67108864)
16/04/15 01:32:45 INFO orc.WriterImpl: Padding ORC by 3300978 bytes (<= 0.05 * 67108864)
16/04/15 01:51:43 INFO orc.WriterImpl: Padding ORC by 2950727 bytes (<= 0.04 * 67108864)
16/04/15 02:17:40 INFO orc.WriterImpl: Padding ORC by 3191617 bytes (<= 0.05 * 67108864)
16/04/15 02:19:51 INFO orc.WriterImpl: Padding ORC by 3103657 bytes (<= 0.05 * 67108864)
16/04/15 02:43:46 INFO orc.WriterImpl: Padding ORC by 3275833 bytes (<= 0.05 * 67108864)
16/04/15 03:07:28 INFO orc.WriterImpl: Padding ORC by 3242815 bytes (<= 0.05 * 67108864)
16/04/15 03:31:44 INFO orc.WriterImpl: Padding ORC by 3120226 bytes (<= 0.05 * 67108864)
16/04/15 03:48:32 INFO orc.WriterImpl: Padding ORC by 3242937 bytes (<= 0.05 * 67108864)
16/04/15 03:51:54 INFO orc.WriterImpl: Padding ORC by 3107471 bytes (<= 0.05 * 67108864)
16/04/15 03:56:31 INFO orc.WriterImpl: Padding ORC by 3224306 bytes (<= 0.05 * 67108864)
16/04/15 04:22:50 INFO orc.WriterImpl: Padding ORC by 2874968 bytes (<= 0.04 * 67108864)
16/04/15 04:54:30 INFO orc.WriterImpl: Padding ORC by 2562310 bytes (<= 0.04 * 67108864)
16/04/15 05:10:32 INFO orc.WriterImpl: Padding ORC by 3206096 bytes (<= 0.05 * 67108864)
16/04/15 05:24:57 INFO orc.WriterImpl: Padding ORC by 3207364 bytes (<= 0.05 * 67108864)
16/04/15 05:25:20 INFO orc.WriterImpl: Padding ORC by 3157416 bytes (<= 0.05 * 67108864)
16/04/15 05:28:16 INFO orc.WriterImpl: Padding ORC by 2676707 bytes (<= 0.04 * 67108864)
16/04/15 05:28:45 INFO orc.WriterImpl: Padding ORC by 3301803 bytes (<= 0.05 * 67108864)
16/04/15 05:49:43 INFO orc.WriterImpl: Padding ORC by 3272214 bytes (<= 0.05 * 67108864)
16/04/15 06:02:47 INFO orc.WriterImpl: Padding ORC by 2717547 bytes (<= 0.04 * 67108864)
16/04/15 06:20:37 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 33457ms (threshold=30000ms); ack: seqno: 8109 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 664247 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-ae94fa6d-851f-4f77-8ddf-2544bc239048,DISK], DatanodeInfoWithStorage[39.7.48.4:50010,DS-38fc44de-de26-47df-a534-7764d4fed137,DISK], DatanodeInfoWithStorage[39.7.48.13:50010,DS-6593bc62-057b-4fe2-af16-1959dbb2b7d9,DISK]]
16/04/15 06:38:08 INFO orc.WriterImpl: Padding ORC by 3021233 bytes (<= 0.05 * 67108864)
16/04/15 06:40:56 INFO orc.WriterImpl: Padding ORC by 3296795 bytes (<= 0.05 * 67108864)
16/04/15 06:43:36 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 30345ms (threshold=30000ms); ack: seqno: 2558 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 908459 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-abf6d435-2834-41b3-9518-3c45d8509321,DISK], DatanodeInfoWithStorage[39.7.48.6:50010,DS-36f92b00-18bc-46c4-bf63-c6fdd168ca31,DISK], DatanodeInfoWithStorage[39.7.48.16:50010,DS-8c422a16-55bb-4996-b83b-b6a4d0fd552d,DISK]]
16/04/15 07:02:19 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 34524ms (threshold=30000ms); ack: seqno: 2608 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 868904 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-abf6d435-2834-41b3-9518-3c45d8509321,DISK], DatanodeInfoWithStorage[39.7.48.6:50010,DS-36f92b00-18bc-46c4-bf63-c6fdd168ca31,DISK], DatanodeInfoWithStorage[39.7.48.16:50010,DS-8c422a16-55bb-4996-b83b-b6a4d0fd552d,DISK]]
16/04/15 07:11:47 INFO orc.WriterImpl: Padding ORC by 3189973 bytes (<= 0.05 * 67108864)
16/04/15 07:13:36 INFO orc.WriterImpl: Padding ORC by 2592298 bytes (<= 0.04 * 67108864)
16/04/15 07:20:53 INFO output.FileOutputCommitter: Saved output of task 'attempt_201604141706_0054_m_000369_0' to hdfs://*******/apps/hive/warehouse/********/***********/.hive-staging_hive_2016-04-14_17-06-14_986_5609188253880108931-1/-ext-10000/_temporary/0/task_201604141706_0054_m_000369
16/04/15 07:20:53 INFO mapred.SparkHadoopMapRedUtil: attempt_201604141706_0054_m_000369_0: Committed
16/04/15 07:20:53 INFO executor.Executor: Finished task 369.0 in stage 54.0 (TID 21177). 17800 bytes result sent to driver
Does anybody have any idea why it is not working properly and why it is so slow? Thanks, Marco
Labels:
- Apache Spark
04-04-2016
12:17 PM
I found the problem. It was a mistake on my part: I added the configuration from the tutorial (https://community.hortonworks.com/articles/4868/hive-oom-caused-by-javalangoutofmemoryerror-java-h.html) via the set command in beeline; instead, it has to be configured in the Tez section in Ambari...
04-04-2016
09:43 AM
I have already tried that, but nothing changed
04-04-2016
09:28 AM
1 Kudo
Hello everybody! We are trying to insert the result of a complex query into a partitioned table. The destination table is partitioned, stored in ORC format with SNAPPY compression. The insert is failing with an OutOfMemoryError in the shuffle phase, as documented in the following stack trace:
Container exited with a non-zero exit code 255
]], TaskAttempt 1 failed, info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_7} #840
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:385)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1550)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at org.apache.tez.runtime.library.common.shuffle.HttpConnection.getInputStream(HttpConnection.java:254)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:348)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:257)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_7} #840
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:385)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1550)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at org.apache.tez.runtime.library.common.shuffle.HttpConnection.getInputStream(HttpConnection.java:254)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:348)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:257)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:807)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1257)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1744)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2425)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:197)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:300)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:223)
at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:807)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1257)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1744)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2425)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:197)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:300)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:1008, Vertex vertex_1459433067819_0516_2_57 [Reducer 8] killed/failed due to:OWN_TASK_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 1, vertexId=vertex_1459433067819_0516_2_36Vertex re-running, vertexName=Map 31, vertexId=vertex_1459433067819_0516_2_30Vertex re-running, vertexName=Map 56, vertexId=vertex_1459433067819_0516_2_08Vertex re-running, vertexName=Map 25, vertexId=vertex_1459433067819_0516_2_23Vertex re-running, vertexName=Map 26, vertexId=vertex_1459433067819_0516_2_24Vertex re-running, vertexName=Map 15, vertexId=vertex_1459433067819_0516_2_13Vertex re-running, vertexName=Map 16, vertexId=vertex_1459433067819_0516_2_14Vertex re-running, vertexName=Map 12, vertexId=vertex_1459433067819_0516_2_07Vertex re-running, vertexName=Map 14, vertexId=vertex_1459433067819_0516_2_11Vertex re-running, vertexName=Map 10, vertexId=vertex_1459433067819_0516_2_02Vertex re-running, vertexName=Map 13, vertexId=vertex_1459433067819_0516_2_09Vertex re-running, vertexName=Map 9, vertexId=vertex_1459433067819_0516_2_28Vertex re-running, vertexName=Map 11, vertexId=vertex_1459433067819_0516_2_03Vertex re-running, vertexName=Reducer 30, vertexId=vertex_1459433067819_0516_2_29Vertex re-running, vertexName=Reducer 2, vertexId=vertex_1459433067819_0516_2_37Vertex re-running, vertexName=Reducer 3, vertexId=vertex_1459433067819_0516_2_38Vertex re-running, vertexName=Reducer 4, vertexId=vertex_1459433067819_0516_2_39Vertex re-running, vertexName=Reducer 5, vertexId=vertex_1459433067819_0516_2_40Vertex re-running, vertexName=Reducer 6, vertexId=vertex_1459433067819_0516_2_41Vertex re-running, vertexName=Reducer 7, vertexId=vertex_1459433067819_0516_2_42Vertex re-running, vertexName=Map 28, vertexId=vertex_1459433067819_0516_2_26Vertex re-running, vertexName=Map 17, vertexId=vertex_1459433067819_0516_2_15Vertex re-running, vertexName=Map 37, vertexId=vertex_1459433067819_0516_2_43Vertex re-running, vertexName=Reducer 44, vertexId=vertex_1459433067819_0516_2_50Vertex re-running, vertexName=Reducer 40, vertexId=vertex_1459433067819_0516_2_46Vertex re-running, vertexName=Map 20, vertexId=vertex_1459433067819_0516_2_21Vertex re-running, vertexName=Map 18, vertexId=vertex_1459433067819_0516_2_16Vertex re-running, vertexName=Map 35, vertexId=vertex_1459433067819_0516_2_34Vertex re-running, vertexName=Map 32, vertexId=vertex_1459433067819_0516_2_31Vertex re-running, vertexName=Map 33, vertexId=vertex_1459433067819_0516_2_32Vertex re-running, vertexName=Map 29, vertexId=vertex_1459433067819_0516_2_27Vertex re-running, vertexName=Reducer 34, vertexId=vertex_1459433067819_0516_2_33Vertex re-running, vertexName=Reducer 38, vertexId=vertex_1459433067819_0516_2_44Vertex re-running, vertexName=Map 45, vertexId=vertex_1459433067819_0516_2_51Vertex re-running, vertexName=Map 55, vertexId=vertex_1459433067819_0516_2_04Vertex re-running, vertexName=Map 53, vertexId=vertex_1459433067819_0516_2_00Vertex re-running, vertexName=Reducer 54, vertexId=vertex_1459433067819_0516_2_01Vertex re-running, vertexName=Reducer 36, vertexId=vertex_1459433067819_0516_2_35Vertex failed, vertexName=Reducer 8, vertexId=vertex_1459433067819_0516_2_57, diagnostics=[Task failed, taskId=task_1459433067819_0516_2_57_000038, diagnostics=[TaskAttempt 0 failed, info=[Container container_e17_1459433067819_0516_01_000058 finished with diagnostics set to [Container failed, exitCode=255. Exception from container-launch.
Container id: container_e17_1459433067819_0516_01_000058
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 255
]], TaskAttempt 1 failed, info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_7} #840
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:385)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1550)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at org.apache.tez.runtime.library.common.shuffle.HttpConnection.getInputStream(HttpConnection.java:254)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:348)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:257)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {Reducer_7} #840
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:385)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1550)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at org.apache.tez.runtime.library.common.shuffle.HttpConnection.getInputStream(HttpConnection.java:254)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:348)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:257)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:807)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1257)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1744)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2425)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:197)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:300)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:223)
at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:807)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1257)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1744)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2425)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:197)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:300)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:1008, Vertex vertex_1459433067819_0516_2_57 [Reducer 8] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2)
I followed the steps described in this tutorial (https://community.hortonworks.com/articles/4868/hive-oom-caused-by-javalangoutofmemoryerror-java-h.html) but nothing changed. Moreover, writing the same data to an unpartitioned table works fine. Does anybody have any idea how to fix this? Thanks, Marco
Labels:
- Apache Hive
- Apache Tez
03-29-2016
07:08 AM
1 Kudo
I can't give you the query since it's rather complex (about 1500 lines). Actually, we haven't run ANALYZE for the columns... We'll try as soon as possible and let you know. Thank you for your answer.
03-25-2016
04:30 PM
1 Kudo
Hello everybody, we are facing a strange Hive behavior (we are using HDP 2.3.2). Hive seems to ignore the hive.auto.convert.join.noconditionaltask.size parameter: indeed, it converts all the joins to map joins, even though our queries include several joins on very large tables (some TB). We have hive.auto.convert.join.noconditionaltask set to true and hive.auto.convert.join.noconditionaltask.size set to about 1.5 GB. We are using Tez as the execution engine and the tables are stored as ORC. Does anybody have any idea about the reason for this Hive behavior? Thanks, Marco
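For reference, these are the relevant settings as described above, in HiveQL set syntax (the exact byte value is illustrative, since only "about 1.5 GB" is stated; 1.5 GB is 1610612736 bytes):
set hive.auto.convert.join.noconditionaltask=true;
set hive.auto.convert.join.noconditionaltask.size=1610612736;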
Labels:
- Apache Hive
- Apache Tez