Run Oryx on a machine that is not part of the cluster
Labels: Apache Hadoop, HDFS
Created on ‎12-22-2014 06:00 PM - edited ‎09-16-2022 08:39 AM
Hi,
I am trying to run Oryx on a machine that is not part of the cluster...
My setting for the oryx.conf is as below (regarding the Hadoop/HDFS settings)... Is that the right setting?
Is there anything else I need to set in oryx.conf?
Created ‎12-23-2014 12:47 AM
That's fine. The machine needs to be able to communicate with the cluster, of course. Usually you would also make the Hadoop configuration visible on that machine and point to it with HADOOP_CONF_DIR; I think that will be required for MapReduce to work.
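A minimal sketch of the setup described above, assuming the client configuration lives at `/etc/hadoop/conf` and the cluster node is reachable as `clusternode` (both are illustrative assumptions, not details from this thread):

```shell
# Copy the Hadoop client configuration from a cluster node to this machine,
# e.g. (hostname and path are assumptions):
#   scp -r clusternode:/etc/hadoop/conf /etc/hadoop/conf

# Point Hadoop clients (and thus Oryx) at that configuration before launching:
export HADOOP_CONF_DIR=/etc/hadoop/conf
```

With HADOOP_CONF_DIR exported, the process picks up the cluster's NameNode and ResourceManager addresses from the copied `*-site.xml` files instead of defaulting to local mode.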
Created ‎08-23-2015 12:19 AM
Right, I forgot to mention that part: you need the cluster's binaries too (ZooKeeper, HDFS, YARN, Spark, etc.), since Oryx uses the cluster's distribution.
As you can see, it's definitely intended to be run on a cluster edge node, so I'd strongly suggest running it that way.
Created ‎08-23-2015 07:29 PM
You have to copy /opt/cloudera/CDH/jars and /etc/hadoop from a cluster node to the machine running Oryx 2.
I tried a few ways to run it outside the cluster, but all failed.
The node running Oryx 2 had to be run inside the cluster.
My conclusion is that CDH may require the same parcel version and a Cloudera agent on the node in order to use the cluster's resources.
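The copy step described above might look like the following sketch; the hostname `clusternode` and destination paths are assumptions for illustration, and the source paths are the ones named in this reply:

```shell
# Sketch only: pull the cluster's jars and Hadoop config onto the external
# machine (run as a user with read access on the cluster node).
rsync -a clusternode:/opt/cloudera/CDH/jars/ /opt/cloudera/CDH/jars/
rsync -a clusternode:/etc/hadoop/ /etc/hadoop/
```

Note that, per this reply, even with the files in place a matching parcel version and Cloudera agent may still be required, so this alone may not be sufficient.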
Created ‎08-23-2015 09:50 PM
There shouldn't be any other dependencies. If the error is like the one you showed before, it's just a firewall/port configuration problem.
Created ‎08-24-2015 12:00 AM
I used to deploy Hadoop, Spark, and so on by extracting source tarballs. Fortunately, an edge node seems to be a good way to access cluster resources.
