Right, I forgot to mention that part: you need the cluster's binaries too, like ZK, HDFS, YARN, Spark, etc. It uses the cluster's own distribution of these.
As you can see, it's definitely intended to be run on a cluster edge node, so I'd strongly suggest running it that way.
You have to copy /opt/cloudera/CDH/jars and /etc/hadoop from a cluster node to the machine running oryx2.
I tried a few ways to run it outside the cluster, but they all failed.
The node running oryx2 had to be inside the cluster.
My conclusion is that CDH may require the node to have the same parcel version and the Cloudera agent in order to use the cluster's resources.
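In case it helps anyone trying the copy approach, here is a rough sanity check that the two copied directories actually landed on the local machine and are not empty. It is only a sketch: the function name check_copied_files is made up for illustration, and it assumes the jars directory holds *.jar files and the Hadoop config directory holds *.xml files somewhere beneath it. It does not verify parcel versions or anything about the Cloudera agent.

```python
from pathlib import Path


def check_copied_files(jars_dir, conf_dir):
    """Return a list of problems found; an empty list means both dirs look populated.

    Hypothetical helper: it only checks that the directories exist and
    contain at least one .jar / .xml file, nothing about versions.
    """
    problems = []
    jars = Path(jars_dir)
    conf = Path(conf_dir)

    if not jars.is_dir():
        problems.append(f"missing directory: {jars}")
    elif not any(jars.glob("*.jar")):
        problems.append(f"no .jar files in: {jars}")

    if not conf.is_dir():
        problems.append(f"missing directory: {conf}")
    elif not any(conf.rglob("*.xml")):
        # Assumption: Hadoop client config is stored as *-site.xml files.
        problems.append(f"no .xml config files under: {conf}")

    return problems
```

Calling check_copied_files("/opt/cloudera/CDH/jars", "/etc/hadoop") on the machine running oryx2 and printing the returned list should show whether the copy step itself went wrong before you look at anything else.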
There shouldn't be any other dependencies. If the error is like the one you showed before, it's just a firewall/port configuration problem.
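To rule the firewall in or out, a quick TCP reachability test from the machine running oryx2 can help. This is a minimal sketch using only the Python standard library; the helper names port_open and check_endpoints are made up here, and whatever hosts and ports you pass in are your own assumptions, so take the real values from your cluster's configuration.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it is reachable.

    Hypothetical helper: 'endpoints' is whatever list of services you
    expect the node to reach (ZK, HDFS, YARN, and so on).
    """
    return {(host, port): port_open(host, port, timeout) for host, port in endpoints}
```

For example, check_endpoints([("zk-host", 2181), ("namenode-host", 8020)]) uses placeholder host names and the commonly used default ports for ZooKeeper and the HDFS NameNode; your cluster may use different ones. Any pair that maps to False is a candidate for a firewall or port rule. A True result only shows the port accepts connections, not that the service behind it is healthy.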