Member since: 09-17-2014
Posts: 88
Kudos Received: 3
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2633 | 07-15-2015 08:57 PM |
 | 9311 | 07-15-2015 06:32 PM |
07-26-2015
04:47 PM
Hi, dear experts! I am exploring Spark's persist capabilities and noticed some interesting behaviour of DISK_ONLY persistence. As far as I understand, its main goal is to store reusable intermediate RDDs that were produced from permanent data (which lives on HDFS).

import org.apache.spark.storage.StorageLevel
val input = sc.textFile("/user/hive/warehouse/big_table")
val result = input.coalesce(600).persist(StorageLevel.DISK_ONLY)

scala> result.count()
……
// and repeat the command
……..
scala> result.count()

I was surprised to see that the second iteration was significantly faster. Could anybody explain why? Thanks!
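For reference, a rough spark-shell sketch (assuming the Spark 1.x SparkContext.getRDDStorageInfo developer API; the inspection calls are not from the original post) of how one could check whether the persisted partitions really landed on local disk between the two counts:

import org.apache.spark.storage.StorageLevel

val input = sc.textFile("/user/hive/warehouse/big_table")
val result = input.coalesce(600).persist(StorageLevel.DISK_ONLY)

result.count()                   // first pass: reads HDFS and writes the blocks to local disk

println(result.getStorageLevel)  // should report DISK_ONLY

// sc.getRDDStorageInfo (a developer API in Spark 1.x) lists, per cached RDD,
// how many partitions are cached and how many bytes sit in memory vs. on disk.
sc.getRDDStorageInfo.foreach { info =>
  println(s"${info.name}: ${info.numCachedPartitions} cached partitions, " +
    s"mem=${info.memSize} bytes, disk=${info.diskSize} bytes")
}

result.count()                   // second pass: should be served from the DISK_ONLY copy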
Labels:
- Apache Hive
- Apache Spark
- HDFS
07-18-2015
10:35 AM
Hi everyone! I am trying to understand the sort shuffle in Spark and would really appreciate it if someone could answer a simple question. Let's imagine:
1) I have 600 partitions (HDFS blocks, for simplicity).
2) They are spread across a 6-node cluster.
3) I run Spark with the following parameters: --executor-memory 13G --executor-cores 6 --num-executors 12 --driver-memory 1G --properties-file my-config.conf, which means that each server will run 2 executors with 6 cores each.
4) According to my config, the reduce phase has only 3 reducers.
So my question is: how many shuffle files will there be on each server after the sort shuffle?
- 12, like the number of active map tasks?
- 2, like the number of executors on each server?
- 100, like the number of partitions placed on this server (for simplicity I just divide 600 by 6)?
And the second question: what is the name of the buffer that stores intermediate data before it is spilled to disk during the map stage? Thanks!
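To make the setup concrete, here is a rough spark-shell sketch of the kind of job described above (the path and the toy key logic are placeholders, not from the original post): 600 input partitions, the sort-based shuffle manager, and 3 reduce-side partitions.

// Submitted e.g. with:
//   --executor-memory 13G --executor-cores 6 --num-executors 12 --driver-memory 1G
// With spark.shuffle.manager=sort (the default since Spark 1.2), each map task writes
// one sorted data file plus one index file, regardless of the number of reducers.
val input = sc.textFile("/user/hive/warehouse/big_table")   // ~600 HDFS blocks -> ~600 map tasks
val pairs = input.map(line => (line.length % 100, 1L))      // toy key, just for illustration

// The second argument of reduceByKey fixes the number of reduce-side partitions at 3,
// matching point 4) above.
val reduced = pairs.reduceByKey(_ + _, 3)
reduced.count()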
Labels:
- Apache Spark
- HDFS
07-15-2015
09:02 PM
Just in case, did you try running the INVALIDATE METADATA; statement?
07-15-2015
08:57 PM
I found this in the source code: https://github.com/boorad/impala/blob/master/be/src/runtime/plan-fragment-executor.h

// Average number of thread tokens for the duration of the plan fragment execution.
// Fragments that do a lot of cpu work (non-coordinator fragment) will have at
// least 1 token. Fragments that contain a hdfs scan node will have 1+ tokens
// depending on system load. Other nodes (e.g. hash join node) can also reserve
// additional tokens.
// This is a measure of how much CPU resources this fragment used during the course
// of the execution.
RuntimeProfile::Counter* average_thread_tokens_;
07-15-2015
06:32 PM
Actually, the problem was very aggressive caching that overfilled the spark.yarn.executor.memoryOverhead buffer and, as a consequence, caused an OOM error. I just increased it and everything works now.
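For reference, a rough sketch of raising the overhead before the SparkContext is created (the app name and the 2048 MB value are just placeholder examples, not the setting actually used); the same property can also be passed to spark-submit via --conf:

import org.apache.spark.{SparkConf, SparkContext}

// spark.yarn.executor.memoryOverhead is the off-heap headroom (in MB) that YARN adds
// on top of --executor-memory; heavy caching/serialization can exhaust the default.
val conf = new SparkConf()
  .setAppName("persist-test")                           // placeholder app name
  .set("spark.yarn.executor.memoryOverhead", "2048")    // placeholder value, in MB
val sc = new SparkContext(conf)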
07-15-2015
06:29 PM
Hi, dear experts! Does anybody know what AverageThreadTokens means in Impala's query profile? Thanks!
Labels:
- Apache Impala
07-08-2015
08:44 PM
I found something in the YARN logs:

15/07/08 23:24:28 WARN spark.CacheManager: Persisting partition rdd_4_174 to disk instead.
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 170.0 in stage 0.0 (TID 235)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 171.0 in stage 0.0 (TID 236)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 173.0 in stage 0.0 (TID 238)
15/07/08 23:24:29 INFO executor.Executor: Executor is trying to kill task 174.0 in stage 0.0 (TID 239)
15/07/08 23:24:29 WARN storage.BlockManager: Putting block rdd_4_174 failed
15/07/08 23:24:29 INFO executor.Executor: Executor killed task 174.0 in stage 0.0 (TID 239)
15/07/08 23:24:29 INFO executor.Executor: Executor killed task 173.0 in stage 0.0 (TID 238)
15/07/08 23:24:30 INFO storage.MemoryStore: ensureFreeSpace(255696059) called with curMem=418412, maxMem=2222739947
15/07/08 23:24:30 INFO storage.MemoryStore: Block rdd_4_170 stored as bytes in memory (estimated size 243.9 MB, free 1875.5 MB)
15/07/08 23:24:30 INFO storage.BlockManagerMaster: Updated info of block rdd_4_170
15/07/08 23:24:30 INFO executor.Executor: Executor killed task 170.0 in stage 0.0 (TID 235)
15/07/08 23:24:30 INFO storage.MemoryStore: ensureFreeSpace(255621319) called with curMem=256114471, maxMem=2222739947
15/07/08 23:24:30 INFO storage.MemoryStore: Block rdd_4_171 stored as bytes in memory (estimated size 243.8 MB, free 1631.7 MB)
15/07/08 23:24:30 INFO storage.BlockManagerMaster: Updated info of block rdd_4_171
15/07/08 23:24:30 INFO executor.Executor: Executor killed task 171.0 in stage 0.0 (TID 236)

But I still have no idea why the executor started killing tasks...
07-08-2015
08:31 PM
Hi, dear experts! I am running the following simple test script on my Spark cluster (yarn-client mode):

import org.apache.spark.storage.StorageLevel
val input = sc.textFile("/user/hive/warehouse/tpc_ds_3T/...");
val result = input.coalesce(600).persist(StorageLevel.MEMORY_AND_DISK_SER)
result.count()

The RDD is much larger than the available memory, but I specified the disk option. After some time I start observing warnings like this:

15/07/08 23:20:38 WARN TaskSetManager: Lost task 33.1 in stage 0.0 (TID 104, host4: ExecutorLostFailure (executor 15 lost)

And finally I got:

15/07/08 23:14:41 INFO BlockManagerMasterActor: Registering block manager SomeHost2:16768 with 2.8 GB RAM, BlockManagerId(58, SomeHost2, 16768)
15/07/08 23:14:43 WARN TaskSetManager: Lost task 41.2 in stage 0.0 (TID 208, scaj43bda03.us.oracle.com): java.io.IOException: Failed to connect to SomeHost2/192.168.42.92:37305
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: SomeHost2/192.168.42.92:37305
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:208)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:287)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
... 1 more

Honestly, I don't know where to start my debugging... I would appreciate any advice! Thanks!
Labels:
- Apache Hive
- Apache Spark
- Apache YARN