Created on 12-22-2014 06:00 PM - edited 09-16-2022 08:39 AM
Hi,
I am trying to run Oryx on a machine that is not part of the cluster...
My settings in oryx.conf (the Hadoop/HDFS part) are as below... Is that the right setting?
Is there something else I need to set in oryx.conf?
Created 12-23-2014 12:47 AM
That's fine. The machine needs to be able to communicate with the cluster of course. Usually you would make the Hadoop configuration visible as well and point to it with HADOOP_CONF_DIR. I think that will be required to get MapReduce to work.
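For example, something along these lines (a sketch only; the edge-node hostname is a placeholder and the paths are the usual CDH client-config defaults, so adjust them to your environment):

    # Copy the cluster's client configuration to the machine running Oryx
    scp -r edge-node:/etc/hadoop/conf /etc/hadoop/conf
    # Point Hadoop/MapReduce clients at it before starting Oryx
    export HADOOP_CONF_DIR=/etc/hadoop/conf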
Created 08-23-2015 12:19 AM
Right, I forgot to mention that part: you need the cluster's binaries too, like ZK, HDFS, YARN, Spark, etc. It is using the cluster's distribution.
As you can see, it's definitely intended to be run on a cluster edge node, so I'd strongly suggest running it that way.
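If you do try it off the cluster anyway, the usual approach is to make the cluster's CDH binaries visible on that machine as well. A rough sketch, assuming a parcel-based install under /opt/cloudera/parcels/CDH (your layout may differ):

    # Put the cluster's hadoop/hdfs/yarn/spark-submit/zookeeper-client on the PATH
    export CDH_ROOT=/opt/cloudera/parcels/CDH
    export PATH=$CDH_ROOT/bin:$PATH
    export HADOOP_CONF_DIR=/etc/hadoop/conf
    # Only if Spark has to be located explicitly
    export SPARK_HOME=$CDH_ROOT/lib/spark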
Created 08-23-2015 07:29 PM
You have to copy /opt/cloudera/CDH/jars and /etc/hadoop from a cluster node to the machine running Oryx 2.
I tried a few ways to run it outside the cluster, but they all failed.
The node running Oryx 2 had to be inside the cluster.
My conclusion is that CDH may require the same parcel version, and the Cloudera agent, on the node in order to use the cluster's resources.
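Roughly what I did, as a sketch (the hostname is a placeholder and the paths mirror the ones above):

    # Mirror the cluster's jars and Hadoop configuration onto the Oryx 2 machine
    rsync -a cluster-node:/opt/cloudera/CDH/jars/ /opt/cloudera/CDH/jars/
    rsync -a cluster-node:/etc/hadoop/ /etc/hadoop/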
Created 08-23-2015 09:50 PM
There shouldn't be any other dependencies. If the error is like the one you showed before, it's just a firewall/port configuration problem.
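A quick way to rule that out is to test the relevant ports from the Oryx machine. The hostnames below are placeholders, and the ports are common CDH defaults (yours may be configured differently):

    nc -zv namenode-host 8020          # HDFS NameNode RPC
    nc -zv resourcemanager-host 8032   # YARN ResourceManager
    nc -zv zookeeper-host 2181         # ZooKeeper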
Created 08-24-2015 12:00 AM