Member since: 08-13-2016
Posts: 9
Kudos Received: 2
Solutions: 0
09-07-2016
12:20 PM
I would also like to know how Spark decides the number of partitions for a DataFrame.
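For reference, my current understanding (a sketch only, not an authoritative answer): the initial partition count comes from the source (input splits for files, the underlying RDD's partitioning for a local collection), and after a shuffle it is governed by spark.sql.shuffle.partitions, which defaults to 200. A small way to inspect both:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("df-partition-inspection")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// A DataFrame built from a local collection keeps the partition count of the
// underlying RDD (sc.defaultParallelism unless specified otherwise).
val df = sc.parallelize(1 to 1000).toDF("id")
println(s"initial partitions: ${df.rdd.partitions.length}")

// After a shuffle (groupBy, join, ...), the partition count is driven by
// spark.sql.shuffle.partitions (default 200); here it is lowered to 50.
sqlContext.setConf("spark.sql.shuffle.partitions", "50")
val grouped = df.groupBy($"id" % 10).count()
println(s"post-shuffle partitions: ${grouped.rdd.partitions.length}")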
09-07-2016
12:17 PM
How can we specify the number of partitions while creating a Spark DataFrame? Using repartition we can change the number of partitions of an existing DataFrame, but there seems to be no option to specify it at creation time. When creating an RDD we can specify the number of partitions, and I would like to know the equivalent for a Spark DataFrame. Can anyone please assist me with this?
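For what it's worth, here is a sketch of the two workarounds I am aware of (names and sizes are placeholders): fix the partitioning on the underlying RDD before converting it to a DataFrame, or adjust it immediately after creation with repartition.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("df-partitions-at-creation")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// Option 1: set the partition count on the RDD, then convert to a DataFrame.
// The resulting DataFrame keeps the RDD's 8 partitions.
val df1 = sc.parallelize(1 to 1000, 8).toDF("id")
println(df1.rdd.partitions.length)   // 8

// Option 2: create the DataFrame first, then repartition it right away.
val df2 = sqlContext.range(0, 1000).repartition(8)
println(df2.rdd.partitions.length)   // 8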
08-28-2016
05:42 PM
I have set up a Spark project in IntelliJ IDEA. If I execute the main method through IntelliJ just to print some text, it prints the text fine, meaning the class is found (no issue there). In the same class I then added the two statements below to initialize the Spark context:

val conf = new SparkConf().setAppName(appName).setMaster(sparkMaster)
val sc = new SparkContext(conf)

After this change, executing the same way (through IntelliJ) fails with the error below.

Error log:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
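For context, a minimal build.sbt sketch of the setup I would expect to work (the versions are placeholders, and the diagnosis is only an assumption since my build file is not shown here): if spark-core were missing or marked % "provided", IntelliJ would not put it on the run classpath, which is one common cause of this NoClassDefFoundError.

// build.sbt -- minimal sketch; adjust the Spark and Scala versions to your cluster
name := "spark-intellij-demo"

scalaVersion := "2.11.8"

// Default "compile" scope so IntelliJ's Run configuration puts spark-core on
// the classpath. If the dependency is marked % "provided" for packaging,
// running the main method directly from the IDE can fail with
// NoClassDefFoundError: org/apache/spark/SparkConf.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.2"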
08-18-2016
03:26 AM
@Constantin So can I say that a Spark Standalone cluster is a good fit for smaller clusters (maybe fewer than 10 nodes), because resource-management performance degrades as the node count grows in Spark Standalone mode?
08-17-2016
12:44 PM
@Rahul I am asking about the use-case difference. I mean, when should we use Spark Standalone and when should we use Spark on YARN?
08-17-2016
03:28 AM
1 Kudo
Hi, can anyone please clarify my understanding of the use-case difference between a Spark Standalone cluster and Spark on YARN?

Spark Standalone cluster: if we do not have a huge volume of data to process and the number of nodes required is relatively small (fewer than about 10), then a Standalone cluster is a good fit.

Spark on YARN cluster: if we have a huge volume of data to process, need many more nodes, and therefore need a better cluster manager to manage them, then Spark on YARN is the better choice.

Also, can anyone please let me know the infrastructure specifications required for a Spark Standalone cluster? For example, for a 10-node Spark Standalone cluster, can we have just one reliable machine as the master node (running the cluster manager) and the remaining 9 machines as worker/slave nodes?
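For reference, my understanding of how the two modes differ in the application code itself (a sketch only; the host name is a placeholder, and on YARN the ResourceManager address normally comes from the Hadoop configuration rather than the code):

import org.apache.spark.SparkConf

// Standalone mode: connect directly to the Spark master process.
// "spark-master-host" is a placeholder for the standalone master's host name.
val standaloneConf = new SparkConf()
  .setAppName("standalone-example")
  .setMaster("spark://spark-master-host:7077")

// YARN mode: the master URL is simply "yarn"; the ResourceManager address is
// taken from HADOOP_CONF_DIR/YARN_CONF_DIR, not hard-coded here.
// (In Spark 1.x this was written as "yarn-client" or "yarn-cluster".)
val yarnConf = new SparkConf()
  .setAppName("yarn-example")
  .setMaster("yarn")

// Whichever conf is passed to new SparkContext(...) determines which
// cluster manager the application runs on.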
08-13-2016
06:28 PM
Here is my understanding of a case where HDFS is not required for Spark: if we are migrating structured data from a database like Oracle to a NoSQL database like Cassandra using a Spark/Spark SQL job, then we do not need any storage layer like HDFS. Please correct me if I am wrong. Thanks.
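As a rough sketch of the kind of job I mean (the JDBC URL, credentials, table and keyspace names are all placeholders, and it assumes the DataStax spark-cassandra-connector is on the classpath): the data flows from Oracle over JDBC straight into Cassandra without touching HDFS.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("oracle-to-cassandra-migration")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Read the source table from Oracle over JDBC -- no HDFS involved.
val sourceDf = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCL")   // placeholder
  .option("dbtable", "SCHEMA.CUSTOMERS")                        // placeholder
  .option("user", "spark_user")
  .option("password", "secret")
  .load()

// Write the rows to Cassandra via the spark-cassandra-connector data source.
sourceDf.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "migration", "table" -> "customers"))   // placeholders
  .save()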
08-13-2016
06:14 PM
1 Kudo
Hi, does Apache Spark in standalone mode need HDFS? If it is required, how does Spark use the HDFS block size during application execution?
I am trying to understand what role HDFS plays while a Spark application runs. The Spark documentation says that processing parallelism is controlled through RDD partitions and the executors/cores. Can anyone please help me understand this?
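To illustrate the parallelism part as I currently understand it (the HDFS path is a placeholder): when an RDD is created from a file on HDFS, the initial number of partitions usually follows the number of input splits (typically one per HDFS block), and the executors/cores then determine how many of those partitions run concurrently.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("partition-count-example")
val sc = new SparkContext(conf)

// Reading a file stored on HDFS: one partition per input split, which usually
// corresponds to one HDFS block (e.g. 128 MB each).
val lines = sc.textFile("hdfs:///data/events.log")   // placeholder path
println(s"partitions from HDFS blocks: ${lines.partitions.length}")

// A minimum number of partitions can be requested explicitly
// (Spark may still create more than this):
val moreParallel = sc.textFile("hdfs:///data/events.log", 64)
println(s"partitions with minPartitions = 64: ${moreParallel.partitions.length}")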