My Accepted Solutions
Title | Views | Posted |
550 | 10-16-2024 06:28 PM |
06:44 PM
@Shelton Thank you for your reply. This information is very helpful.
04:39 PM
1 Kudo
When running Apache Spark on YARN, spark_shuffle and spark2_shuffle are names used for Spark's shuffle service, which can be configured to start as an auxiliary service inside the YARN NodeManager. What is the difference between the two?
- Labels:
Apache Spark
Apache YARN
06:28 PM
1 Kudo
Hi everyone, Thank you all for your responses. I am using Spark 3, and I’ve discovered that the issue is due to the improper configuration of the spark_shuffle settings in the yarn-site.xml file. Thanks again!
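The post does not say which setting was wrong, so the following is only a minimal sketch of how such a configuration could be sanity-checked. It assumes the two properties that the Spark "Running on YARN" documentation associates with the external shuffle service: the service name listed in yarn.nodemanager.aux-services and a matching yarn.nodemanager.aux-services.spark_shuffle.class entry. The sample XML, the helper names read_properties and check_aux_service, and the idea of checking only these two properties are assumptions for illustration, not the poster's actual fix.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample for illustration; a real yarn-site.xml has many more properties.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>
</configuration>"""

AUX_KEY = "yarn.nodemanager.aux-services"


def read_properties(xml_text):
    """Return the <name>/<value> pairs of a Hadoop-style configuration as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_aux_service(xml_text, service="spark_shuffle"):
    """Return a list of human-readable problems found for one auxiliary service."""
    props = read_properties(xml_text)
    problems = []
    # The aux-services value is a comma-separated list of service names.
    services = [s.strip() for s in props.get(AUX_KEY, "").split(",") if s.strip()]
    if service not in services:
        problems.append(f"{service} is not listed in {AUX_KEY}")
    class_key = f"{AUX_KEY}.{service}.class"
    if not props.get(class_key):
        problems.append(f"{class_key} is not set")
    return problems


if __name__ == "__main__":
    print(check_aux_service(SAMPLE_YARN_SITE) or "no obvious problems")
```

Calling check_aux_service(text, "spark2_shuffle") runs the same check for the other service name. A clean result only means the two properties are present; it does not confirm that the shuffle service jar is on the NodeManager classpath or that the NodeManagers were restarted after the change.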
12:37 AM
Hi everyone,
I am trying to insert data into HiveServer2 through the Spark Thrift Server (I am using beeline), but the insert job gets stuck. I checked the Spark ApplicationMaster UI page; it is shown in the following figure.
The Spark Thrift Server log is as follows:
24/10/16 15:21:39 INFO SparkExecuteStatementOperation: Submitting query 'insert into test_database.test_table (a,b) values (2,33)' with a75190ac-d536-4ee1-a1ff-da42a195a40b
24/10/16 15:21:39 INFO SparkExecuteStatementOperation: Running query with a75190ac-d536-4ee1-a1ff-da42a195a40b
24/10/16 15:21:40 INFO FileUtils: Creating directory if it doesn't exist: hdfs://ha/warehouse/tablespace/managed/hive/test_database.db/test_table/.hive-staging_hive_2024-10-16_15-21-40_061_8849017887411502804-3
24/10/16 15:21:40 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
24/10/16 15:21:40 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
24/10/16 15:21:40 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
24/10/16 15:21:40 INFO SparkContext: Starting job: run at
24/10/16 15:21:40 INFO DAGScheduler: Got job 2 (run at with 1 output partitions
24/10/16 15:21:40 INFO DAGScheduler: Final stage: ResultStage 2 (run at
24/10/16 15:21:40 INFO DAGScheduler: Parents of final stage: List()
24/10/16 15:21:40 INFO DAGScheduler: Missing parents: List()
24/10/16 15:21:40 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[8] at run at, which has no missing parents
24/10/16 15:21:40 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 421.2 KiB, free 910.8 MiB)
24/10/16 15:21:40 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 147.0 KiB, free 910.6 MiB)
24/10/16 15:21:40 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on (size: 147.0 KiB, free: 911.9 MiB)
24/10/16 15:21:40 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1535
24/10/16 15:21:40 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[8] at run at (first 15 tasks are for partitions Vector(0))
24/10/16 15:21:40 INFO YarnScheduler: Adding task set 2.0 with 1 tasks resource profile 0
24/10/16 15:21:40 INFO FairSchedulableBuilder: Added task set TaskSet_2.0 tasks to pool default
24/10/16 15:21:50 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:22:05 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:22:20 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:22:35 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:22:50 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:23:05 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Please help me figure out what is happening. Thanks a lot.
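When a log like the one above repeats the same warning every few seconds, it can help to collapse the repeats before reading further. The following is a minimal sketch that groups WARN lines by component and message and reports when each was first seen and how often it occurred. The sample lines are copied from the log above; the function name summarize_warnings and the assumed line layout (timestamp, level, component, message) are illustrative assumptions, not part of any Spark tooling.

```python
import re
from collections import Counter

# Three lines copied from the log in the post above.
SAMPLE_LOG = """\
24/10/16 15:21:40 INFO YarnScheduler: Adding task set 2.0 with 1 tasks resource profile 0
24/10/16 15:21:50 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
24/10/16 15:22:05 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
"""

# Assumed layout: "yy/MM/dd HH:mm:ss LEVEL Component: message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\w+): (.*)$")


def summarize_warnings(log_text):
    """Return (first_seen, component, count, message) for each distinct WARN line."""
    counts = Counter()
    first_seen = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed layout
        timestamp, level, component, message = match.groups()
        if level != "WARN":
            continue
        key = (component, message)
        counts[key] += 1
        first_seen.setdefault(key, timestamp)  # keep only the earliest timestamp
    return [(first_seen[key], key[0], n, key[1]) for key, n in counts.items()]


if __name__ == "__main__":
    for first, component, n, message in summarize_warnings(SAMPLE_LOG):
        print(f"{component}: {n}x since {first}: {message}")
```

In this log the only repeated warning comes from YarnScheduler, which says that no executors have accepted the task. That matches the symptom of a job that is submitted but never starts running.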
- Labels:
Apache Hive
Apache Spark