How are Git and Jenkins related to Hadoop/Spark jobs?

Hi there,

I know Jenkins and Git in general, but I'm not sure what role they play in Hadoop projects.

Please share what you know about this. Thanks in advance.

Regards,

Jee

1 ACCEPTED SOLUTION

These tools (Git and Jenkins) are used the same way as in any other software development life cycle (SDLC); the only difference is that the software you develop runs on a Hadoop/Spark cluster. You still build your jars the same way and use Git as your source code repository. You then submit the job for execution on a distributed cluster. For development, however, you don't need a full cluster, because pseudo-clusters are available. For example, you can use hadoop-mini-clusters: https://github.com/sakserv/hadoop-mini-clusters

A good reference on how to use this mini cluster for testing: http://www.lopakalogic.com/articles/hadoop-articles/hadoop-testing-with-minicluster/
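
As a rough illustration of the workflow described above, the sketch below lists the commands a Jenkins job might run for a Hadoop project: check out the code from Git, build the jar, and submit it. The repository URL, jar path, main class, and input/output paths are hypothetical placeholders, and Maven is assumed as the build tool (the answer does not name one).

```python
from typing import List


def ci_commands(repo_url: str, workdir: str, jar_path: str,
                main_class: str, job_args: List[str]) -> List[List[str]]:
    """Return the commands a CI job could run, in order.

    All argument values are placeholders supplied by the caller; Maven is
    an assumption, not something the original answer specifies.
    """
    return [
        # 1. Fetch the source from the Git repository.
        ["git", "clone", repo_url, workdir],
        # 2. Build the jar and run unit tests (-B = non-interactive batch mode).
        ["mvn", "-B", "-f", f"{workdir}/pom.xml", "clean", "package"],
        # 3. Submit the built jar to the Hadoop cluster.
        ["hadoop", "jar", jar_path, main_class, *job_args],
    ]


if __name__ == "__main__":
    for cmd in ci_commands(
        "https://example.com/team/wordcount.git",  # hypothetical repo
        "wordcount",
        "wordcount/target/wordcount-1.0.jar",      # hypothetical artifact
        "com.example.WordCount",                   # hypothetical class
        ["/data/in", "/data/out"],
    ):
        print(" ".join(cmd))
```

In a real setup these steps would normally live in the Jenkins job configuration itself; the point is only that nothing here is specific to Hadoop until the final submit step.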

For Spark development, you could use Spark in standalone mode.
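
A minimal sketch of how a build step might assemble the `spark-submit` call for a development run. The jar name, class name, and the host `dev-host` are hypothetical; `local[2]` runs Spark in-process with two threads, and a `spark://host:7077` URL points at a standalone master (7077 being the usual default port).

```python
from typing import List, Sequence


def spark_submit_command(app_jar: str, main_class: str,
                         master: str = "local[2]",
                         app_args: Sequence[str] = ()) -> List[str]:
    """Build a spark-submit command line as an argument list.

    The caller decides where the job runs by choosing `master`; the
    application jar is the same one produced by the normal build.
    """
    return [
        "spark-submit",
        "--master", master,      # e.g. local[2] or spark://dev-host:7077
        "--class", main_class,   # entry point inside the jar
        app_jar,
        *app_args,               # arguments passed through to the job
    ]


if __name__ == "__main__":
    # Hypothetical jar, class, and standalone master used for illustration.
    cmd = spark_submit_command(
        "target/etl-job-1.0.jar",
        "com.example.EtlJob",
        master="spark://dev-host:7077",
        app_args=["/data/in", "/data/out"],
    )
    print(" ".join(cmd))
```

Keeping the master URL as a parameter means the same jar can be tested locally or on a standalone cluster and later submitted to the production cluster without rebuilding.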
