
First, download and install RapidMiner on your machine.

Once it is installed, open RapidMiner and look at the list of operators.

There is a link at the bottom left labeled "Get More Operators".


Click the link, then search for "Radoop".

Select both packages and click install.

After RapidMiner restarts, the new operators we downloaded appear in the Extensions folder.


Now we need to configure the connection.

In the toolbar, select "Connections", then "Manage Radoop Connections".


Select "+ New Connection".

If you have your Hadoop configuration files available, you can use them to set the properties; otherwise, select "manual".

Select the Hadoop version you have (in my case, "Hortonworks 2.x") and supply the master URL. If you have multiple masters, select the check box and provide the details.

Click "OK".

Now click ">> Quick Test".

If the test succeeds, you are all set to read from Hive.
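If the Quick Test fails, it can help to rule out basic network problems before revisiting the connection settings. The following is a minimal sketch, not part of Radoop, that checks whether the master host accepts TCP connections on a couple of ports. The port numbers are assumptions (10000 is the usual HiveServer2 default and 8020 a common NameNode port); your cluster may use different values.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Assumed default ports; adjust them to match your cluster configuration.
DEFAULT_SERVICES = {"HiveServer2": 10000, "NameNode": 8020}


def check_services(host: str, services: dict = DEFAULT_SERVICES,
                   timeout: float = 3.0) -> dict:
    """Map each service name to True/False depending on reachability."""
    return {name: port_open(host, port, timeout)
            for name, port in services.items()}
```

For example, `check_services("master.example.com")` (a hypothetical host name) returns a dictionary such as `{"HiveServer2": True, "NameNode": False}`, which shows which service to look at first. A reachable port only proves that something is listening there; it does not validate the Radoop settings themselves.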

Drag a "Radoop Nest" operator onto the canvas.



Select the operator on the canvas and, on the right-hand side of the IDE, select the connection we created earlier.


Now double-click the Radoop Nest operator to enter the nested canvas.

Drag a "Retrieve from Hive" operator, located under Radoop --> Data Access, onto the canvas.

Click the operator and select the table you want to read.


Connect the out port of the operator to the out port on the edge of the canvas by dragging from one to the other.

Now click the Play button and wait for it to complete.

Click the out port and select "show sample data".
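To cross-check the sample against Hive directly, you can run a comparable query in any Hive client. The sketch below only builds the HiveQL text; it is an illustration of what a row sample amounts to, not a description of what Radoop executes internally. The function name, the default database, and the row limit are all assumptions.

```python
import re

# Accept only plain identifiers so the generated statement stays safe.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def sample_query(table: str, limit: int = 100, database: str = "default") -> str:
    """Build a HiveQL statement that returns the first `limit` rows of a table."""
    for name in (database, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Backticks quote identifiers in HiveQL.
    return f"SELECT * FROM `{database}`.`{table}` LIMIT {limit}"
```

Calling `sample_query("flights", 10)` with a hypothetical table named `flights` yields a statement you can paste into Beeline or another Hive client and compare with the rows RapidMiner shows.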


Hope this was helpful!

More to come on RapidMiner + Hortonworks ...