Support Questions
Find answers, ask questions, and share your expertise

Artificial intelligence on Hadoop

Explorer

Hello,

I want to understand whether and how it is possible to use deep learning on Hadoop. If I understand correctly, there is no tool like Spark or anything similar that I can use for AI.

The only way seems to be to write Java code using DeepLearning4J and execute it on YARN.

I know this is very basic, but I just want to know whether this is the right way and whether there are other ways to use deep learning/AI within Hortonworks (for example, in the sandbox).

If you have good sources on this topic, it would be great if you could share them.

cheers

6 REPLIES

Re: Artificial intelligence on Hadoop

Expert Contributor

Re: Artificial intelligence on Hadoop

Super Guru
@Jan Horton

A few things here:

1. Of course you can use Spark on Hadoop. Check the following link for running Sparkling Water on Hadoop on YARN.

http://www.h2o.ai/download/sparkling-water/

2. You can of course use DeepLearning4J, but there are better tools. Spark is the best one and is supported by Hortonworks.

3. If you don't want to use H2O, then just use Spark MLlib. Link below:

http://spark.apache.org/docs/latest/ml-guide.html

Re: Artificial intelligence on Hadoop

Explorer
@Adnan Alvee

@mqureshi

Thank you very much for the additional information. It helps me a lot, and with this new point of view it is easier to search for more information.

I still have to read more, but am I right that...

1. TensorFlow and H2O are standalone tools for machine learning

2. I can install them on HDF to access my big data pool

3. I can use DeepLearning4J, Spark MLlib, or other libraries to do machine learning without installing additional tools

So there are libraries I can use without any further installation, or I can use more powerful tools such as H2O or TensorFlow, which I can install on my HDF. Right?

Re: Artificial intelligence on Hadoop

Super Guru

@Jan Horton

Please see my replies inline below:

1. TensorFlow and H2O are standalone tools for machine learning

Yes, but they need data to operate on. That data can be stored in the local file system or in HDFS.

2. I can install them on HDF to access my big data pool

No. Where does HDF come into this? You can use HDF to write data to Hadoop, or even read it, but what's the use case? What are you trying to do?

3. I can use DeepLearning4J, Spark MLlib, or other libraries to do machine learning without installing additional tools

Well, if you are reading data from the local file system, then yes. But you still need a Spark cluster (even if it's a single node) to be able to use Spark MLlib.

Re: Artificial intelligence on Hadoop

Super Guru

@Jan Horton

One more thing I forgot to mention: the first thing to consider is how much data you have. If you can store the data on one machine (server), then just use R or Python. But if you need a distributed system because your data is so large, then look at storing it in Hadoop and using Spark to build your AI models.
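To illustrate the single-machine case: when the data fits in memory, a plain Python (or R) script is all you need, with no Hadoop or Spark involved. Here is a toy example using only the Python standard library (the numbers are made up for illustration):

```python
# Toy single-machine model: ordinary least squares fit on a small
# dataset, using only the Python standard library. At this scale,
# nothing distributed is needed.
from statistics import mean

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]  # roughly y = 2x

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
print(slope, intercept)
```

Only when a dataset no longer fits on one machine does moving the storage to Hadoop and the computation to Spark start to pay off.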

Re: Artificial intelligence on Hadoop

Explorer

@mqureshi

I see that I forgot to submit my last post 😕

I meant HDP not HDF.

Now I want to take a closer look at the different solutions: where their limits are and when to choose which one. There is no specific use case yet. I will keep learning about this topic, and your information is a very good starting point for going deeper.

Thank you very much!