
Hadoop 3.1.0 and Spark 2.4.5 installation using Ambari

Hi,

I need to set up a 5-node cluster with Hadoop 3.1.0 and Spark 2.4.5. Someone recommended using Ambari for this. I looked into Ambari, but it seems it can only be used to install HDP, and the latest HDP does not support Spark 2.4.5.

Please suggest the best way to set up the required big data cluster.

1 ACCEPTED SOLUTION

Cloudera Employee

Hi,

Thanks for the reply. The approach you describe in the question quoted below is not supported with an HDP cluster:
"Can we install hadoop & spark 2.4.5 packages on multi node cluster without using hdp, ambari & cloudera"
So the option here is to use Spark 2.3.2, the version that comes with HDP 3.1.5. We can also involve the Cloudera technical support team if you hit any issues with Spark 2.3.2.
Thanks and Regards,
Vikas Dadhich


4 REPLIES

Cloudera Employee

Hi, 

Thanks for creating a new thread. It seems you need help setting up Spark 2.4.5 with Ambari. As you already mentioned, HDP does not support Spark 2.4.5.
I agree: the latest HDP release at this time is 3.1.5, which ships with Apache Spark 2.3.2. For more information, see link[1].
Also, installing upstream component versions is not supported, so our resources for helping with that are very limited. Please follow the support matrix at link[2] and use a supported configuration.
link[1]: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/release-notes/content/comp_versions.html
link[2]: https://supportmatrix.hortonworks.com/

Thanks and Regards,
Vikas Dadhich

Thanks for the reply. Can we install the Hadoop and Spark 2.4.5 packages on a multi-node cluster without using HDP, Ambari, or Cloudera? We already have Spark applications running on Spark 2.4.5, and we do not want to go back to an older version. We are even planning to upgrade them to Spark 3 soon because of its better Delta Lake compatibility.

If we install the Hadoop and Spark packages manually on each node of the cluster, could that cause maintenance issues at a later stage in production?

Cloudera Employee

Hi,

Thanks for the reply. The approach you describe in the question quoted below is not supported with an HDP cluster:
"Can we install hadoop & spark 2.4.5 packages on multi node cluster without using hdp, ambari & cloudera"
So the option here is to use Spark 2.3.2, the version that comes with HDP 3.1.5. We can also involve the Cloudera technical support team if you hit any issues with Spark 2.3.2.
Thanks and Regards,
Vikas Dadhich

I will check whether our Spark 2.4.5 application code is compatible with Spark 2.3.2. Are Ambari and HDP going to be discontinued in the near future as part of the ongoing Cloudera and Hortonworks merger? We need to plan our choice of software accordingly.
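
As a rough starting point for that compatibility check, the sketch below compares the Spark version an application was built against with the version the cluster provides. It is only an illustration: the helper names `parse_version` and `needs_downgrade_review` are hypothetical, and the version strings are the ones mentioned in this thread (2.4.5 for the applications, 2.3.2 for HDP 3.1.5). On a live cluster the running version could be read from `spark.version` on the session instead of being hard-coded.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '2.4.5' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_downgrade_review(app_version: str, cluster_version: str) -> bool:
    """Return True when the app targets a newer major.minor line than the cluster."""
    return parse_version(app_version)[:2] > parse_version(cluster_version)[:2]


# Versions taken from this thread; adjust to the actual environment.
APP_SPARK_VERSION = "2.4.5"      # version the existing applications run on
CLUSTER_SPARK_VERSION = "2.3.2"  # version bundled with HDP 3.1.5

if needs_downgrade_review(APP_SPARK_VERSION, CLUSTER_SPARK_VERSION):
    print(
        "Applications target a newer Spark line than the cluster provides;",
        "review any APIs introduced after", CLUSTER_SPARK_VERSION,
    )
```

A version comparison like this only flags that a review is needed. It does not prove compatibility, so the applications would still have to be built and tested against Spark 2.3.2.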
