Member since
09-17-2014
88
Posts
3
Kudos Received
2
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2667 | 07-15-2015 08:57 PM | |
9420 | 07-15-2015 06:32 PM |
07-26-2015
04:47 PM
Hi dear experts! i discovering Spark's persist capabilities and noted interesting behaivour of DISK_ONLY persistance. as far as i understand the main goal - to store reusable and intermediate RDDs, that were produced from permanent data (that lays on HDFS). import org.apache.spark.storage.StorageLevel
val input = sc.textFile("/user/hive/warehouse/big_table");
val result = input.coalesce(600).persist(StorageLevel.DISK_ONLY)
scala> result.count()
……
// and repeat command
……..
scala> result.count() so, i was surprised when saw that second iteration was significantly faster... could anybody describe why? thanks!
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Spark
-
HDFS
07-18-2015
10:35 AM
Hi everyone! i trying to understand Sort shuffle in spark and will very appreciate if someone could answer on simple question, let's imagine: 1) i have 600 partitions (HDFS blocks, for simplicity) 2) it place in 6 node cluster 3) i run spark with follow parameters: --executor-memory 13G --executor-cores 6 --num-executors 12 --driver-memory 1G --properties-file my-config.conf that's mean that on each server i will have 2 executor with 6 core each. 4) according my config reduce phase has only 3 reducers. so, ny question is how many files on each servers will be after Sort Shuffle: - 12 like a active map task - 2 like a number of executors on each server - 100 like a number of partitions that place on this server (for simplicity i just devide 600 on 6) and the second question is how names buffer for storing intermediate data before spill it on disk on the map stage? thanks!
... View more
Labels:
- Labels:
-
Apache Spark
-
HDFS
07-15-2015
09:02 PM
just in case, did you try to run: invalidate metadata; statement?
... View more
07-15-2015
08:57 PM
I found in source code: https://github.com/boorad/impala/blob/master/be/src/runtime/plan-fragment-executor.h // Average number of thread tokens for the duration of the plan fragment execution. // Fragments that do a lot of cpu work (non-coordinator fragment) will have at // least 1 token. Fragments that contain a hdfs scan node will have 1+ tokens // depending on system load. Other nodes (e.g. hash join node) can also reserve // additional tokens. // This is a measure of how much CPU resources this fragment used during the course // of the execution. RuntimeProfile::Counter* average_thread_tokens_;
... View more
07-15-2015
06:32 PM
Actually problem was in very agressive caching and overfilling spark.yarn.executor.memoryOverhead buffer and as cosequence OOM error. i just increase it and everything works now
... View more
07-15-2015
06:29 PM
Hi dear experts! does anybodu know what does mean AverageThreadTokens in Impala's query profile? thanks!
... View more
Labels:
- Labels:
-
Apache Impala
07-08-2015
08:44 PM
I found something in the YARN logs: 15/07/08 23:24:28 WARN spark.CacheManager: Persisting partition rdd_4_174 to disk instead.
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 170.0 in stage 0.0 (TID 235)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 171.0 in stage 0.0 (TID 236)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 173.0 in stage 0.0 (TID 238)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 174.0 in stage 0.0 (TID 239)
15/07/08 23:24:29 WARN storage.BlockManager: Putting block rdd_4_174 failed
15/07/08 23:24:29 INFO executor.Executor: Executor killed task 174.0 in stage 0.0 (TID 239)
15/07/08 23:24:29 INFO executor.Executor: Executor killed task 173.0 in stage 0.0 (TID 238)
15/07/08 23:24:30 INFO storage.MemoryStore: ensureFreeSpace(255696059) called with curMem=418412, maxMem=2222739947
15/07/08 23:24:30 INFO storage.MemoryStore: Block rdd_4_170 stored as bytes in memory (estimated size 243.9 MB, free 1875.5 MB)
15/07/08 23:24:30 INFO storage.BlockManagerMaster: Updated info of block rdd_4_170
15/07/08 23:24:30 INFO executor.Executor: Executor killed task 170.0 in stage 0.0 (TID 235)
15/07/08 23:24:30 INFO storage.MemoryStore: ensureFreeSpace(255621319) called with curMem=256114471, maxMem=2222739947
15/07/08 23:24:30 INFO storage.MemoryStore: Block rdd_4_171 stored as bytes in memory (estimated size 243.8 MB, free 1631.7 MB)
15/07/08 23:24:30 INFO storage.BlockManagerMaster: Updated info of block rdd_4_171
15/07/08 23:24:30 INFO executor.Executor: Executor killed task 171.0 in stage 0.0 (TID 236) But i still have no idea why executor started kill tasks...
... View more
07-08-2015
08:31 PM
Hi dear experts! i running on my spark cluster (yarn-client mode) follow simple test script: import org.apache.spark.storage.StorageLevel
val input = sc.textFile("/user/hive/warehouse/tpc_ds_3T/...");
val result = input.coalesce(600).persist(StorageLevel.MEMORY_AND_DISK_SER)
result.count() RDD much higher then memory, but i specify disk option. in some time i start observing warnings like this: 15/07/08 23:20:38 WARN TaskSetManager: Lost task 33.1 in stage 0.0 (TID 104, host4: ExecutorLostFailure (executor 15 lost) and finaly i got: 15/07/08 23:14:41 INFO BlockManagerMasterActor: Registering block manager SomeHost2:16768 with 2.8 GB RAM, BlockManagerId(58, SomeHost2, 16768)
15/07/08 23:14:43 WARN TaskSetManager: Lost task 41.2 in stage 0.0 (TID 208, scaj43bda03.us.oracle.com): java.io.IOException: Failed to connect to SomeHost2/192.168.42.92:37305
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: SomeHost2/192.168.42.92:37305
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
... 1 more honestly i don't know where i can start my debuging... will appreciate any advice! thanks!
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Spark
-
Apache YARN