Member since: 05-10-2017
Posts: 35
Kudos Received: 1
Solutions: 0
11-04-2018
08:52 PM
Hi, I need to know why we specify executor cores for a Spark application running on YARN. Let's say we have a cluster with the specs below:
Number of worker nodes: 4 (50 GB memory and 8 cores each), plus one master node with 50 GB memory and 8 cores.
Now consider:
a) If I specify num_executors = 8 for an application and don't set executor-cores, will each executor use all the cores on its node?
b) Since each node has 8 cores, if I specify executor_cores = 4, does that mean the cores used by one executor must not exceed 4, even though the node has 8 in total?
c) What are the criteria for choosing executor_cores for a Spark application?
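To make the question concrete, here is a minimal sketch (illustrative only, not my actual job) of where these settings would go in a PySpark SparkConf; the app name and the numbers are placeholders for the 4-node, 8-core cluster described above.
from pyspark import SparkConf, SparkContext

# Illustrative settings only: explicit executor count, cores, and memory.
conf = (SparkConf()
        .setAppName("executor_cores_example")   # placeholder app name
        .set("spark.executor.instances", "8")   # same effect as --num-executors 8
        .set("spark.executor.cores", "4")       # cap each executor at 4 cores
        .set("spark.executor.memory", "10g"))   # placeholder value
sc = SparkContext(conf=conf)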
Labels:
- Apache Spark
- Apache YARN
07-23-2018
06:04 PM
Hi, I need to locate a Spark application's YARN log on HDFS. When I go to hdfs://app-logs/user/logs-ifile, it shows per-container logs, but I need one consolidated log file (like the consolidated output we get on the command line when we run spark-submit manually). So where can I find the consolidated log of an application on HDFS?
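To illustrate the kind of consolidated output I mean: the yarn logs CLI prints the aggregated logs of every container for an application in one stream. The sketch below (a placeholder application ID, invoked from Python purely for illustration) writes that stream to a single file.
import subprocess

# 'yarn logs -applicationId <id>' prints the aggregated container logs for one
# application; redirecting stdout collects them into a single file.
app_id = "application_XXXXXXXXXXXXX_NNNN"  # placeholder application ID
with open("consolidated_app_log.txt", "w") as out:
    subprocess.run(["yarn", "logs", "-applicationId", app_id],
                   stdout=out, check=False)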
Labels:
- Apache Hadoop
- Apache Spark
- Apache YARN
07-19-2018
11:51 AM
Hi, I am reading two files from S3 and taking their union, but the code fails when I run it on YARN. On a single node it runs successfully; for the cluster, when I specify --master yarn in spark-submit, it fails. Below is the PySpark code:
from pyspark import SparkConf, SparkContext, SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *
from functools import reduce # For Python 3.x
from pyspark.sql import DataFrame
# set Spark properties
conf = (SparkConf()
        .setAppName("s3a_test")
        .set("spark.shuffle.compress", "true")
        .set("spark.io.compression.codec", "snappy")
        .set("fs.s3a.server-side-encryption-algorithm", "AES256"))
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# function to union multiple dataframes
def unionMultiDF(*dfs):
    return reduce(DataFrame.union, dfs)
pfely = "s3a://ics/parquet/salestodist/"
pfely1 = "s3a://ics/parquet/salestodist/"
FCSTEly = sqlContext.read.parquet(pfely)
FCSTEly1 = sqlContext.read.parquet(pfely1)
FCSTEly.registerTempTable('PFEly')
FCSTEly1.registerTempTable('PFEly1')
#Mapping to Forecast
tbl_FCSTEly = sqlContext.sql("select * from PFEly limit 10 ")
tbl_FCSTEly1 = sqlContext.sql("select * from PFEly1 limit 10")
tbl_FCST = unionMultiDF(tbl_FCSTEly,tbl_FCSTEly1)
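For what it's worth, here are a couple of hypothetical debugging lines (not in the failing run above) that could be appended after the union, since DataFrame.union matches columns by position and a quick schema check plus an explicit action can surface a mismatch immediately:
# Hypothetical debugging additions, reusing the variables from the script above.
tbl_FCSTEly.printSchema()    # union matches columns by position,
tbl_FCSTEly1.printSchema()   # so the two schemas should line up
tbl_FCST.show(10)            # force an action so any failure surfaces here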
YARN log of the failed container:
===============================================================================
18/07/19 11:20:15 INFO RMProxy: Connecting to ResourceManager at ip-10-246-5-252.example.com/10.246.5.252:8030
18/07/19 11:20:15 INFO YarnRMClient: Registering the ApplicationMaster
18/07/19 11:20:15 INFO YarnAllocator: Will request 2 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
18/07/19 11:20:15 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@10.246.5.186:32769)
18/07/19 11:20:15 INFO YarnAllocator: Submitted 2 unlocalized container requests.
18/07/19 11:20:15 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
18/07/19 11:20:15 INFO AMRMClientImpl: Received new token for : ip-10-246-5-216.example.com:45454
18/07/19 11:20:15 INFO YarnAllocator: Launching container container_e15_1531738738949_0081_02_000002 on host ip-10-246-5-216.example.com for executor with ID 1
18/07/19 11:20:15 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
18/07/19 11:20:15 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
18/07/19 11:20:15 INFO ContainerManagementProtocolProxy: Opening proxy : ip-10-246-5-216.example.com:45454
18/07/19 11:20:18 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.5.216:52932) with ID 1
18/07/19 11:20:18 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-246-5-216.example.com:34919 with 912.3 MB RAM, BlockManagerId(1, ip-10-246-5-216.example.com, 34919, None)
18/07/19 11:20:44 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
18/07/19 11:20:44 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
18/07/19 11:20:44 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/hadoopfs/fs1/yarn/nodemanager/usercache/spark/appcache/application_1531738738949_0081/container_e15_1531738738949_0081_02_000001/spark-warehouse').
18/07/19 11:20:44 INFO SharedState: Warehouse path is 'file:/hadoopfs/fs1/yarn/nodemanager/usercache/spark/appcache/application_1531738738949_0081/container_e15_1531738738949_0081_02_000001/spark-warehouse'.
18/07/19 11:20:45 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/07/19 11:20:46 INFO deprecation: fs.s3a.server-side-encryption-key is deprecated. Instead, use fs.s3a.server-side-encryption.key
18/07/19 11:20:47 INFO SparkContext: Starting job: parquet at NativeMethodAccessorImpl.java:0
18/07/19 11:20:47 INFO DAGScheduler: Got job 0 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions
18/07/19 11:20:47 INFO DAGScheduler: Final stage: ResultStage 0 (parquet at NativeMethodAccessorImpl.java:0)
18/07/19 11:20:47 INFO DAGScheduler: Parents of final stage: List()
18/07/19 11:20:47 INFO DAGScheduler: Missing parents: List()
18/07/19 11:20:47 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents
18/07/19 11:20:47 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 92.0 KB, free 912.2 MB)
18/07/19 11:20:47 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 33.5 KB, free 912.2 MB)
18/07/19 11:20:47 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.246.5.186:32771 (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:47 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
18/07/19 11:20:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
18/07/19 11:20:47 INFO YarnClusterScheduler: Adding task set 0.0 with 1 tasks
18/07/19 11:20:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-246-5-216.example.com, executor 1, partition 0, PROCESS_LOCAL, 5003 bytes)
18/07/19 11:20:47 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-246-5-216.example.com:34919 (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:48 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1706 ms on ip-10-246-5-216.example.com (executor 1) (1/1)
18/07/19 11:20:48 INFO DAGScheduler: ResultStage 0 (parquet at NativeMethodAccessorImpl.java:0) finished in 1.713 s
18/07/19 11:20:48 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/07/19 11:20:49 INFO DAGScheduler: Job 0 finished: parquet at NativeMethodAccessorImpl.java:0, took 1.914374 s
18/07/19 11:20:49 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-246-5-216.example.com:34919 in memory (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:49 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.246.5.186:32771 in memory (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:50 INFO SparkContext: Starting job: parquet at NativeMethodAccessorImpl.java:0
18/07/19 11:20:50 INFO DAGScheduler: Got job 1 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions
18/07/19 11:20:50 INFO DAGScheduler: Final stage: ResultStage 1 (parquet at NativeMethodAccessorImpl.java:0)
18/07/19 11:20:50 INFO DAGScheduler: Parents of final stage: List()
18/07/19 11:20:50 INFO DAGScheduler: Missing parents: List()
18/07/19 11:20:50 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents
18/07/19 11:20:50 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 92.0 KB, free 912.2 MB)
18/07/19 11:20:50 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 33.5 KB, free 912.2 MB)
18/07/19 11:20:50 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.246.5.186:32771 (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:50 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/07/19 11:20:50 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
18/07/19 11:20:50 INFO YarnClusterScheduler: Adding task set 1.0 with 1 tasks
18/07/19 11:20:50 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, ip-10-246-5-216.example.com, executor 1, partition 0, PROCESS_LOCAL, 5003 bytes)
18/07/19 11:20:51 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-246-5-216.example.com:34919 (size: 33.5 KB, free: 912.3 MB)
18/07/19 11:20:51 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 176 ms on ip-10-246-5-216.example.com (executor 1) (1/1)
18/07/19 11:20:51 INFO DAGScheduler: ResultStage 1 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.175 s
18/07/19 11:20:51 INFO DAGScheduler: Job 1 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.208283 s
18/07/19 11:20:51 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
18/07/19 11:20:51 ERROR ApplicationMaster: User application exited with status 1
18/07/19 11:20:51 INFO ApplicationMaster: Final app status: FAILED, exitCode: 1, (reason: User application exited with status 1)
18/07/19 11:20:51 INFO SparkContext: Invoking stop() from shutdown hook
18/07/19 11:20:51 INFO SparkUI: Stopped Spark web UI at http://10.112.5.186:40567
18/07/19 11:20:51 INFO YarnAllocator: Driver requested a total number of 0 executor(s).
18/07/19 11:20:51 INFO YarnClusterSchedulerBackend: Shutting down all executors
18/07/19 11:20:51 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
18/07/19 11:20:51 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
18/07/19 11:20:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/07/19 11:20:51 INFO MemoryStore: MemoryStore cleared
18/07/19 11:20:51 INFO BlockManager: BlockManager stopped
18/07/19 11:20:51 INFO BlockManagerMaster: BlockManagerMaster stopped
18/07/19 11:20:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/07/19 11:20:51 INFO SparkContext: Successfully stopped SparkContext
18/07/19 11:20:51 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User application exited with status 1)
18/07/19 11:20:51 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
18/07/19 11:20:51 INFO ApplicationMaster: Deleting staging directory hdfs://ip-10-112-5-252.example.com:8020/user/spark/.sparkStaging/application_1531738738949_0081
18/07/19 11:20:51 INFO ShutdownHookManager: Shutdown hook called
18/07/19 11:20:51 INFO ShutdownHookManager: Deleting directory /hadoopfs/fs1/yarn/nodemanager/usercache/spark/appcache/application_1531738738949_0081/spark-1acac860-733e-4f51-89fb-cb31bf1913ed/pyspark-5dc64ae0-97c1-4c5e-812e-e2931ef7933f
18/07/19 11:20:51 INFO ShutdownHookManager: Deleting directory /hadoopfs/fs1/yarn/nodemanager/usercache/spark/appcache/application_1531738738949_0081/spark-1acac860-733e-4f51-89fb-cb31bf1913ed
End of LogType:stderr
Spark-submit command:
sudo -u spark /usr/hdp/2.6.4.5-2/spark2/bin/spark-submit --master yarn --executor-memory 2g --executor-cores 1 --num-executors 2 --driver-memory 2g --driver-cores 1 --deploy-mode cluster hdfs:/code/salesforecast_u.py
Labels:
- Apache Hadoop
- Apache YARN
06-09-2017
09:04 PM
@Eyad Garelnabi, I am using PutHiveQL along with ReplaceText. The data flow runs successfully, but my insert-select did not execute, because the final table into which I am inserting is empty. In the ReplaceText properties I have left the Search Value at its default, and in the Replacement Value I have written my insert-select query. When I checked the log, I found the error below: Could not open client transport with JDBC Uri: jdbc:hive2://sandbox.hortonworks.com:2181/testdb: null
06-09-2017
07:31 PM
Hi, I have to first read a file from a local directory using GetFile and then load it into HDFS. Once the file is in HDFS, I have to copy it from a Hive-based table and insert it into another Hive table. For the insert/select, I learned from the NiFi documentation that I need to use PutHiveQL, but I could not find any option in PutHiveQL where I can write my insert-select query. Can anybody tell me where in PutHiveQL the Hive query is written? Thanks in advance.
Labels:
- Apache Hive
- Apache NiFi
05-19-2017
11:38 AM
@Peter Greiff Thanks for your answer; it is working. But I have one concern: in FetchFile I have to specify the file name along with the absolute path. Let's say in the future I want to process some other file with a different name but the same location; then I won't be able to reuse this data flow. Is there any way to make it generic, so that FetchFile can process any file on a given path and I don't have to change the filename explicitly in the FetchFile configuration?
05-19-2017
07:47 AM
Hi, I have a scenario where I have to move a file from local folder1 to local folder2, and then move that file from local folder2 to HDFS. I proposed the following flow: GetFile -> PutFile -> GetFile -> PutHDFS. The problem is that I am unable to connect PutFile to GetFile; it is not allowing me to drag the relationship. What I actually want is that once the file has been moved from local folder1 to local folder2, I then move it to HDFS.
Labels:
- Apache Hadoop
- Apache NiFi
05-19-2017
06:06 AM
@Wynner Yes. But how can I remove this warning, and why is it occurring?
05-18-2017
05:27 PM
@Wynner There is no error when I clear the state. When I click the "Clear State" link it removes the timestamp and related entries; below is the state after clearing. The error only occurs when I run the processor, and it is the following: WARN [Timer-Driven Process Thread-7] o.a.nifi.processors.standard.ListFile ListFile[id=1ae3923c-015c-1000-15e4-e29f5b75a0e0] Error collecting attributes for file /root/InputData/, message is Mount point not found
05-18-2017
05:11 PM
@Wynner Yes, it was stopped.