
SparkSQL very slow writing to ORC table (and it does not work)

Expert Contributor

Hi everybody,

we are in trouble trying to write to an ORC partitioned table with SNAPPY compression via the SparkSQL Thrift Server. We are using 40 executors with 16 GB of memory each, and we have set the default parallelism to 1000. Our query for writing into the table is a very complex INSERT OVERWRITE. We are using HDP 2.3.2 with Hive 1.2.1 and Spark 1.6.0.
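To give an idea of the shape of the job (the table and column names below are placeholders, not our real schema), the write is roughly a dynamic-partition INSERT OVERWRITE along these lines:

-- Target table: partitioned ORC with SNAPPY compression
CREATE TABLE target_tbl (col_a STRING, col_b BIGINT)
PARTITIONED BY (dt STRING)
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The real SELECT is far more complex (many joins and subqueries)
INSERT OVERWRITE TABLE target_tbl PARTITION (dt)
SELECT col_a, col_b, dt
FROM source_tbl;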

The result is that the query is rather fast until the last two stages (let's say about 10 minutes): then there is a stage running for about 15-20 minutes, and in the last one some very strange things happen. All the tasks (999/1000) finish in a few seconds, except one, which lasted 13.8 hours. Yet at the end the job reports success.

Moreover, the table folder contains only staging folders and no folders for the partitions.

The log for this container ends like this:

16/04/14 18:26:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
16/04/14 18:26:01 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
16/04/14 18:50:44 INFO orc.WriterImpl: Padding ORC by 3109869 bytes (<=  0.05 * 67108864)
16/04/14 19:15:06 INFO orc.WriterImpl: Padding ORC by 3347129 bytes (<=  0.05 * 67108864)
16/04/14 19:40:12 INFO orc.WriterImpl: Padding ORC by 3209596 bytes (<=  0.05 * 67108864)
16/04/14 20:01:41 INFO orc.WriterImpl: Padding ORC by 3217484 bytes (<=  0.05 * 67108864)
16/04/14 20:04:50 INFO orc.WriterImpl: Padding ORC by 3233888 bytes (<=  0.05 * 67108864)
16/04/14 20:29:34 INFO orc.WriterImpl: Padding ORC by 2675806 bytes (<=  0.04 * 67108864)
16/04/14 20:53:55 INFO orc.WriterImpl: Padding ORC by 2860597 bytes (<=  0.04 * 67108864)
16/04/14 21:19:11 INFO orc.WriterImpl: Padding ORC by 2783728 bytes (<=  0.04 * 67108864)
16/04/14 21:36:41 INFO orc.WriterImpl: Padding ORC by 3045962 bytes (<=  0.05 * 67108864)
16/04/14 21:44:54 INFO orc.WriterImpl: Padding ORC by 2507885 bytes (<=  0.04 * 67108864)
16/04/14 22:01:53 INFO orc.WriterImpl: Padding ORC by 2991017 bytes (<=  0.04 * 67108864)
16/04/14 22:10:37 INFO orc.WriterImpl: Padding ORC by 3161339 bytes (<=  0.05 * 67108864)
16/04/14 22:35:04 INFO orc.WriterImpl: Padding ORC by 3051066 bytes (<=  0.05 * 67108864)
16/04/14 22:58:41 INFO orc.WriterImpl: Padding ORC by 3056267 bytes (<=  0.05 * 67108864)
16/04/14 23:10:14 INFO orc.WriterImpl: Padding ORC by 2996341 bytes (<=  0.04 * 67108864)
16/04/14 23:22:57 INFO orc.WriterImpl: Padding ORC by 3022895 bytes (<=  0.05 * 67108864)
16/04/14 23:40:37 INFO orc.WriterImpl: Padding ORC by 3138785 bytes (<=  0.05 * 67108864)
16/04/14 23:45:33 INFO orc.WriterImpl: Padding ORC by 2990566 bytes (<=  0.04 * 67108864)
16/04/14 23:47:00 INFO orc.WriterImpl: Padding ORC by 3237637 bytes (<=  0.05 * 67108864)
16/04/14 23:47:46 INFO orc.WriterImpl: Padding ORC by 3346376 bytes (<=  0.05 * 67108864)
16/04/15 00:11:08 INFO orc.WriterImpl: Padding ORC by 3276991 bytes (<=  0.05 * 67108864)
16/04/15 00:12:18 INFO orc.WriterImpl: Padding ORC by 3223207 bytes (<=  0.05 * 67108864)
16/04/15 00:23:20 INFO orc.WriterImpl: Padding ORC by 3246542 bytes (<=  0.05 * 67108864)
16/04/15 00:36:00 INFO orc.WriterImpl: Padding ORC by 2724009 bytes (<=  0.04 * 67108864)
16/04/15 00:42:53 INFO orc.WriterImpl: Padding ORC by 2986568 bytes (<=  0.04 * 67108864)
16/04/15 01:00:45 INFO orc.WriterImpl: Padding ORC by 2469504 bytes (<=  0.04 * 67108864)
16/04/15 01:25:44 INFO orc.WriterImpl: Padding ORC by 2918348 bytes (<=  0.04 * 67108864)
16/04/15 01:32:45 INFO orc.WriterImpl: Padding ORC by 3300978 bytes (<=  0.05 * 67108864)
16/04/15 01:51:43 INFO orc.WriterImpl: Padding ORC by 2950727 bytes (<=  0.04 * 67108864)
16/04/15 02:17:40 INFO orc.WriterImpl: Padding ORC by 3191617 bytes (<=  0.05 * 67108864)
16/04/15 02:19:51 INFO orc.WriterImpl: Padding ORC by 3103657 bytes (<=  0.05 * 67108864)
16/04/15 02:43:46 INFO orc.WriterImpl: Padding ORC by 3275833 bytes (<=  0.05 * 67108864)
16/04/15 03:07:28 INFO orc.WriterImpl: Padding ORC by 3242815 bytes (<=  0.05 * 67108864)
16/04/15 03:31:44 INFO orc.WriterImpl: Padding ORC by 3120226 bytes (<=  0.05 * 67108864)
16/04/15 03:48:32 INFO orc.WriterImpl: Padding ORC by 3242937 bytes (<=  0.05 * 67108864)
16/04/15 03:51:54 INFO orc.WriterImpl: Padding ORC by 3107471 bytes (<=  0.05 * 67108864)
16/04/15 03:56:31 INFO orc.WriterImpl: Padding ORC by 3224306 bytes (<=  0.05 * 67108864)
16/04/15 04:22:50 INFO orc.WriterImpl: Padding ORC by 2874968 bytes (<=  0.04 * 67108864)
16/04/15 04:54:30 INFO orc.WriterImpl: Padding ORC by 2562310 bytes (<=  0.04 * 67108864)
16/04/15 05:10:32 INFO orc.WriterImpl: Padding ORC by 3206096 bytes (<=  0.05 * 67108864)
16/04/15 05:24:57 INFO orc.WriterImpl: Padding ORC by 3207364 bytes (<=  0.05 * 67108864)
16/04/15 05:25:20 INFO orc.WriterImpl: Padding ORC by 3157416 bytes (<=  0.05 * 67108864)
16/04/15 05:28:16 INFO orc.WriterImpl: Padding ORC by 2676707 bytes (<=  0.04 * 67108864)
16/04/15 05:28:45 INFO orc.WriterImpl: Padding ORC by 3301803 bytes (<=  0.05 * 67108864)
16/04/15 05:49:43 INFO orc.WriterImpl: Padding ORC by 3272214 bytes (<=  0.05 * 67108864)
16/04/15 06:02:47 INFO orc.WriterImpl: Padding ORC by 2717547 bytes (<=  0.04 * 67108864)
16/04/15 06:20:37 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 33457ms (threshold=30000ms); ack: seqno: 8109 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 664247 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-ae94fa6d-851f-4f77-8ddf-2544bc239048,DISK], DatanodeInfoWithStorage[39.7.48.4:50010,DS-38fc44de-de26-47df-a534-7764d4fed137,DISK], DatanodeInfoWithStorage[39.7.48.13:50010,DS-6593bc62-057b-4fe2-af16-1959dbb2b7d9,DISK]]
16/04/15 06:38:08 INFO orc.WriterImpl: Padding ORC by 3021233 bytes (<=  0.05 * 67108864)
16/04/15 06:40:56 INFO orc.WriterImpl: Padding ORC by 3296795 bytes (<=  0.05 * 67108864)
16/04/15 06:43:36 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 30345ms (threshold=30000ms); ack: seqno: 2558 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 908459 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-abf6d435-2834-41b3-9518-3c45d8509321,DISK], DatanodeInfoWithStorage[39.7.48.6:50010,DS-36f92b00-18bc-46c4-bf63-c6fdd168ca31,DISK], DatanodeInfoWithStorage[39.7.48.16:50010,DS-8c422a16-55bb-4996-b83b-b6a4d0fd552d,DISK]]
16/04/15 07:02:19 WARN hdfs.DFSClient: Slow ReadProcessor read fields took 34524ms (threshold=30000ms); ack: seqno: 2608 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 868904 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[39.7.48.5:50010,DS-abf6d435-2834-41b3-9518-3c45d8509321,DISK], DatanodeInfoWithStorage[39.7.48.6:50010,DS-36f92b00-18bc-46c4-bf63-c6fdd168ca31,DISK], DatanodeInfoWithStorage[39.7.48.16:50010,DS-8c422a16-55bb-4996-b83b-b6a4d0fd552d,DISK]]
16/04/15 07:11:47 INFO orc.WriterImpl: Padding ORC by 3189973 bytes (<=  0.05 * 67108864)
16/04/15 07:13:36 INFO orc.WriterImpl: Padding ORC by 2592298 bytes (<=  0.04 * 67108864)
16/04/15 07:20:53 INFO output.FileOutputCommitter: Saved output of task 'attempt_201604141706_0054_m_000369_0' to hdfs://*******/apps/hive/warehouse/********/***********/.hive-staging_hive_2016-04-14_17-06-14_986_5609188253880108931-1/-ext-10000/_temporary/0/task_201604141706_0054_m_000369
16/04/15 07:20:53 INFO mapred.SparkHadoopMapRedUtil: attempt_201604141706_0054_m_000369_0: Committed
16/04/15 07:20:53 INFO executor.Executor: Finished task 369.0 in stage 54.0 (TID 21177). 17800 bytes result sent to driver

Does anybody have any idea why it is not working and why it is so slow?

Thanks, Marco

1 ACCEPTED SOLUTION

Expert Contributor

It turned out that the problem was caused by a join with a subquery, which made the data unevenly distributed among the partitions. I don't actually know why this happens, but we solved it by materializing the subquery.
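In case it helps others, the fix looked roughly like this (table and column names are illustrative, not our real ones): instead of joining the subquery inline, we first wrote its result to its own table and joined against that.

-- Before (skewed): the subquery was joined inline, e.g.
--   ... JOIN (SELECT key, max(val) AS val FROM other_tbl GROUP BY key) s
--       ON b.key = s.key

-- After: materialize the subquery into a table first
DROP TABLE IF EXISTS subquery_mat;
CREATE TABLE subquery_mat STORED AS ORC AS
SELECT key, max(val) AS val
FROM other_tbl
GROUP BY key;

-- Then join against the materialized table in the INSERT OVERWRITE
INSERT OVERWRITE TABLE target_tbl PARTITION (dt)
SELECT b.col_a, s.val, b.dt
FROM big_tbl b
JOIN subquery_mat s ON b.key = s.key;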

Thank you for the support.


6 REPLIES

Master Mentor
@Marco Gaido

Marco, take a look at this thread: https://community.hortonworks.com/questions/4067/snappy-vs-zlib-pros-and-cons-for-each-compression.h...

I would test the same task with zlib to check whether the behavior differs.
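For instance (target_tbl is a placeholder name), the codec can be switched per table or per session:

-- Switch the table's ORC codec to ZLIB (affects files written after the change)
ALTER TABLE target_tbl SET TBLPROPERTIES ('orc.compress'='ZLIB');

-- Or change the session default for newly written ORC files
SET hive.exec.orc.default.compress=ZLIB;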

Expert Contributor

But my trouble is with writing data, and in that thread zlib is said to be slower at writing than SNAPPY, so this should make things worse...

Master Mentor
@Marco Gaido

I think you should try with zlib to test it out.

Super Guru

@Marco Gaido

I doubt compression would cause a 13-hour delay; more likely something is going on on the query-planning side. Can you please share what exactly it is doing in the last stage? Maybe you can check it on the Application Master UI page while your query is running.

Expert Contributor

How can I tell? I only see the stages, but I don't know what they refer to or what they are doing... Anyway, I believe it is writing the result into the ORC files...

Expert Contributor

It turned out that the problem was caused by a join with a subquery, which made the data unevenly distributed among the partitions. I don't actually know why this happens, but we solved it by materializing the subquery.

Thank you for the support.