My first Oryx run failed with many Java errors

Explorer

Hi 

 

I am having an issue with my first run of Oryx's Computation and Serving jobs.

I would appreciate any suggestions to help pinpoint where I went wrong.

 

Please see below for the Java errors from my first run.

The first two Java errors from the Computation job:

<snip>

$ java -Dconfig.file=./nem-dms/oryx.conf -jar computation/target/oryx-computation-0.5.0-SNAPSHOT.jar

Mar 28, 2014 9:44:14 AM com.cloudera.oryx.computation.PeriodicRunner run
SEVERE: Unexpected error in execution
java.lang.ExceptionInInitializerError
at com.cloudera.oryx.common.servcomp.Store.<init>(Store.java:76)
at com.cloudera.oryx.common.servcomp.Store.<clinit>(Store.java:57)
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException: No host?
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at com.cloudera.oryx.common.servcomp.Namespaces.<init>(Namespaces.java:60)
at com.cloudera.oryx.common.servcomp.Namespaces.<clinit>(Namespaces.java:42)
... 13 more
Mar 28, 2014 9:45:14 AM com.cloudera.oryx.computation.PeriodicRunner run
SEVERE: Unexpected error in execution
java.lang.NoClassDefFoundError: Could not initialize class com.cloudera.oryx.common.servcomp.Store
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Mar 28, 2014 9:46:14 AM com.cloudera.oryx.computation.PeriodicRunner run

</snip>

 

The first two Java errors from the Serving job:

<snip>

$ sudo java -Dconfig.file=./nem-dms/oryx.conf -jar serving/target/oryx-serving-0.5.0-SNAPSHOT.jar

Mar 28, 2014 9:45:43 AM org.apache.catalina.core.StandardContext listenerStart
SEVERE: Exception sending context initialized event to listener instance of class com.cloudera.oryx.kmeans.serving.web.KMeansServingInitListener
java.lang.ExceptionInInitializerError
at com.cloudera.oryx.common.servcomp.Store.<init>(Store.java:76)
at com.cloudera.oryx.common.servcomp.Store.<clinit>(Store.java:57)
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.serving.generation.GenerationManager.getMostRecentGeneration(GenerationManager.java:227)
at com.cloudera.oryx.serving.generation.GenerationManager.maybeRollAppender(GenerationManager.java:209)
at com.cloudera.oryx.serving.generation.GenerationManager.<init>(GenerationManager.java:91)
at com.cloudera.oryx.kmeans.serving.generation.KMeansGenerationManager.<init>(KMeansGenerationManager.java:42)
at com.cloudera.oryx.kmeans.serving.web.KMeansServingInitListener.contextInitialized(KMeansServingInitListener.java:45)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4973)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5467)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException: No host?
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
at com.cloudera.oryx.common.servcomp.Namespaces.<init>(Namespaces.java:60)
at com.cloudera.oryx.common.servcomp.Namespaces.<clinit>(Namespaces.java:42)
... 18 more
Mar 28, 2014 9:45:43 AM org.apache.catalina.core.StandardContext startInternal
SEVERE: Error listenerStart
Mar 28, 2014 9:45:43 AM org.apache.catalina.core.StandardContext startInternal
SEVERE: Context [] startup failed due to previous errors
Mar 28, 2014 9:45:43 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [pool-1-thread-1] but has failed to stop it. This is very likely to create a memory leak.

</snip>

 

 

I have Hadoop 2.2.0-cdh5.0.0-beta-1 with the following tools:

<snip>

$ hadoop version
Hadoop 2.2.0-cdh5.0.0-beta-1
Subversion git://github.sf.cloudera.com/CDH/cdh.git -r ee825cb06b23d3ab97cdd87e13cbbb630bd75b98
Compiled by jenkins on 2013-10-28T00:09Z
Compiled with protoc 2.5.0
From source with checksum 5c3bad5ba6f8a4e3d6e68b4c706f3f
This command was run using /usr/lib/hadoop/hadoop-common-2.2.0-cdh5.0.0-beta-1.jar

 

$ mvn -version
Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 2014-02-15T04:37:52+10:00)
Maven home: /usr/local/apache-maven/apache-maven-3.2.1
Java version: 1.6.0_32, vendor: Sun Microsystems Inc.
Java home: /usr/java/jdk1.6.0_32/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.18-274.el5", arch: "amd64", family: "unix"

</snip>

 

Here are the steps I followed to build Oryx using Maven:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Install Maven
cd <DownLoad Directory>
wget http://apache.mirror.uber.com.au/maven/maven-3/3.2.1/binaries/apache-maven-3.2.1-bin.tar.gz
mkdir /usr/local/apache-maven
cd /usr/local/apache-maven
tar xvf <DownLoad Directory>/apache-maven-3.2.1-bin.tar.gz
export M2_HOME=/usr/local/apache-maven/apache-maven-3.2.1
export MAVEN_OPTS="-Xms256m -Xmx512m"
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
mvn -version

Install Cloudera's Oryx
mkdir <INSTALL DIR>
sudo git clone https://github.com/cloudera/oryx.git
cd oryx
edit $HOME/.m2/settings.xml
Add:
<settings>
<proxies>
<proxy>
<active>true</active>
<protocol>http</protocol>
<host>58.162.7.28</host>
<port>3128</port>
</proxy>
</proxies>
</settings>

 

mvn -Phadoop220+ -DskipTests install

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The last lines of output

........

[INFO] Installing /ora/db002/stg001/BDMSL1D/hadoop/nem-dms/devices/oryx/serving/target/oryx-serving-0.5.0-SNAPSHOT.jar to /home/oracle/.m2/repository/com/cloudera/oryx/oryx-serving/0.5.0-SNAPSHOT/oryx-serving-0.5.0-SNAPSHOT.jar
[INFO] Installing /ora/db002/stg001/BDMSL1D/hadoop/nem-dms/devices/oryx/serving/dependency-reduced-pom.xml to /home/oracle/.m2/repository/com/cloudera/oryx/oryx-serving/0.5.0-SNAPSHOT/oryx-serving-0.5.0-SNAPSHOT.pom
[INFO] Installing /ora/db002/stg001/BDMSL1D/hadoop/nem-dms/devices/oryx/serving/target/oryx-serving-0.5.0-SNAPSHOT-javadoc.jar to /home/oracle/.m2/repository/com/cloudera/oryx/oryx-serving/0.5.0-SNAPSHOT/oryx-serving-0.5.0-SNAPSHOT-javadoc.jar
[INFO] Installing /ora/db002/stg001/BDMSL1D/hadoop/nem-dms/devices/oryx/serving/target/oryx-serving-0.5.0-SNAPSHOT-sources.jar to /home/oracle/.m2/repository/com/cloudera/oryx/oryx-serving/0.5.0-SNAPSHOT/oryx-serving-0.5.0-SNAPSHOT-sources.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Oryx .............................................. SUCCESS [ 1.256 s]
[INFO] Oryx Common ....................................... SUCCESS [01:51 min]
[INFO] Oryx Common for Serving and Computation ........... SUCCESS [01:51 min]
[INFO] Oryx Alternating Least Squares Common ............. SUCCESS [01:02 min]
[INFO] Oryx Serving Layer Common ......................... SUCCESS [ 22.304 s]
[INFO] Oryx Alternating Least Squares Serving ............ SUCCESS [01:04 min]
[INFO] Oryx Computation Layer Common ..................... SUCCESS [01:46 min]
[INFO] Oryx Alternating Least Squares Computation ........ SUCCESS [02:05 min]
[INFO] Oryx K-Means Common ............................... SUCCESS [ 36.054 s]
[INFO] Oryx K-Means Serving .............................. SUCCESS [ 40.565 s]
[INFO] Oryx K-Means Computation .......................... SUCCESS [02:56 min]
[INFO] Oryx Random Decision Forests Common ............... SUCCESS [01:01 min]
[INFO] Oryx Random Decision Forests Serving .............. SUCCESS [ 39.629 s]
[INFO] Oryx Random Decision Forests Computation .......... SUCCESS [ 30.368 s]
[INFO] Oryx Computation Layer ............................ SUCCESS [ 56.877 s]
[INFO] Oryx Serving Layer ................................ SUCCESS [ 49.872 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 18:18 min
[INFO] Finished at: 2014-03-28T09:41:17+10:00
[INFO] Final Memory: 39M/248M
[INFO] ------------------------------------------------------------------------

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

 

Regards,

Truong

12 REPLIES

Re: My first Oryx run failed with many Java errors

Master Collaborator

This almost certainly is because there is something slightly wrong with the config file. Can you post that?

 

You don't have to build it, by the way. You could skip all that and download the binary from GitHub.

Re: My first Oryx run failed with many Java errors

Explorer
Hi Sean,

Which config file?

Re: My first Oryx run failed with many Java errors

Master Collaborator

./nem-dms/oryx.conf !

Re: My first Oryx run failed with many Java errors

Explorer
Here is the content of oryx.conf:
$ cat nem-dms/oryx.conf
model=${kmeans-model}
model.instance-dir=/data/db/bdms1p/oryx/example/00000/inbound
serving-layer.api.port=8091
computation-layer.api.port=8092
model.k=[1, 5, 10]
model.replications=2
model.local-computation=true
inbound.column-names=[LAN_Host_Status,FarEndInterarrivalJitter,OutgoingCallsAttempted,OutgoingCallsFailed,Overruns,PacketsLost,PacketsReceived,PacketsSent,TotalCallTime,Underruns,ActiveConnectionServiceID,WANDevice2TotalBytesReceived,WANDevice2TotalBytesSent,WANPPPConnectionUptime,WANDevice3TotalBytesReceived,WANDevice3TotalBytesSent]

Re: My first Oryx run failed with many Java errors

Explorer
My data is in HDFS:
$ sudo -u hdfs hadoop fs -ls -R /data/db/bdms1p/oryx/example/00000/inbound
-rw-r--r-- 2 hdfs supergroup 19253 2014-03-27 15:42 /data/db/bdms1p/oryx/example/00000/inbound/part-00000

There are 231 lines in my input data. Here are the first few lines of the input file:
$ sudo -u hdfs hadoop fs -cat /data/db/bdms1p/oryx/example/00000/inbound/part-00000
312441, 0, 706, 0, 0, 0, 780, 794, 16, 0, 0, 2087269213, 1761623061, 0, 0, 0
174175, 38, 327, 0, 0, 0, 10474, 10550, 211, 0, 0, 3506590468, 232641856, 0, 0, 0
389455, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2242969916, 297061609, 0, 0, 0
139972, 152, 1266, 0, 0, 0, 171581, 171656, 3433, 0, 0, 1136171940, 71463589, 0, 0, 0
322595, 19, 818, 0, 0, 0, 588, 599, 12, 0, 0, 675303271, 690264334, 0, 0, 0

Re: My first Oryx run failed with many Java errors

Master Collaborator

This looks good except for two things. First is instance-dir, which should be:

 

model.instance-dir=/data/db/bdms1p/oryx/example

 

Second is that you need to set

 

model.local-data=true

 

It is trying to read HDFS and failing. I can add a better error message for that, as this is a common cause.
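
To illustrate the first fix: in this thread the input data sits under `<instance-dir>/00000/inbound/`, so `model.instance-dir` should name the directory two levels above `inbound`. Below is a minimal Python sketch of that relationship, assuming the layout shown in the posts above. The helper `instance_dir_from_inbound` is hypothetical and is not part of Oryx.

```python
from pathlib import PurePosixPath


def instance_dir_from_inbound(inbound_path: str) -> str:
    """Return the instance directory for a path ending in <generation>/inbound.

    Assumes the layout seen in this thread: <instance-dir>/<generation>/inbound,
    where the generation directory name is all digits (e.g. 00000).
    """
    path = PurePosixPath(inbound_path)
    if path.name != "inbound" or not path.parent.name.isdigit():
        raise ValueError(f"not a <generation>/inbound path: {inbound_path}")
    # Drop "inbound" and then the generation directory.
    return str(path.parent.parent)


print(instance_dir_from_inbound("/data/db/bdms1p/oryx/example/00000/inbound"))
```

Applied to the value originally used for `model.instance-dir`, this yields `/data/db/bdms1p/oryx/example`, which matches the corrected setting.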

Re: My first Oryx run failed with many Java errors

Explorer

Hi Sean,

Thanks for the review and suggestions.
I have managed to run my first test using the local filesystem dataset, but I still could not figure out why it did not run under HDFS (see below for the HDFS error messages).
Yay!

Here are my findings and understanding so far for implementing a clustering model using the local filesystem.
The parameters:
# Set these parameters to true to run on the local filesystem. Setting local-computation = false instructs the Computation process to run from HDFS.
model.local-data=true
model.local-computation=true
# Location where the Oryx Computation process reads and creates files for the current computation.
model.instance-dir=<Full Path Name of the local filesystem>
# The training dataset must be put into <Full Path Name of the local filesystem>/00000/inbound/*.csv
# Use either inbound.numeric-columns or inbound.categorical-columns to specify some or all column types
inbound.column-names=[column names]
inbound.numeric-columns=[column names of the numeric type]
inbound.categorical-columns=[column names of the categorical type]
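
The directory convention in the comments above can be sketched in Python. This is only an illustration under the assumption stated there (training files go in `<instance-dir>/00000/inbound/`); the function `stage_training_file` is hypothetical, is not an Oryx tool, and works on the local filesystem only.

```python
import shutil
import tempfile
from pathlib import Path


def stage_training_file(instance_dir, csv_file, generation="00000"):
    """Copy csv_file into <instance_dir>/<generation>/inbound/ and return the new path."""
    inbound = Path(instance_dir) / generation / "inbound"
    # Create the generation and inbound directories if they do not exist yet.
    inbound.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(csv_file, inbound))


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real instance-dir.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "stats-trainingset.csv"
        source.write_text("312441, 0, 706\n")
        staged = stage_training_file(Path(tmp) / "example", source)
        print(staged.relative_to(tmp))
```

With a real setup, `instance_dir` would be the same path given to `model.instance-dir`.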


Questions:
1) It seems that the Serving console only allows me to validate test data one at a time. Do you have a mechanism where I can test my entire test dataset?

2) Do you have any visualisation tools (like ELKI) to see the clustering of the dataset population?

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
RUN from the HDFS: /data/db/bdms1p/oryx/example
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
$ sudo -u hdfs hadoop fs -ls -h -R /data/db/bdms1p/oryx/example
drwxrwxrwx - hdfs supergroup 0 2014-03-31 10:21 /data/db/bdms1p/oryx/example/00000
drwxrwxrwx - hdfs supergroup 0 2014-03-31 10:30 /data/db/bdms1p/oryx/example/00000/inbound
-rwxrwxrwx 2 hdfs supergroup 11.4 K 2014-03-31 10:30 /data/db/bdms1p/oryx/example/00000/inbound/stats-trainingset.csv
-rwxrwxrwx 2 hdfs supergroup 11.4 K 2014-03-31 10:22 /data/db/bdms1p/oryx/example/stats-trainingset.csv
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
$ cat nem-dms/oryx.conf
model=${kmeans-model}
model.instance-dir=/data/db/bdms1p/oryx/example
serving-layer.api.port=8091
computation-layer.api.port=8092
model.k=[1, 5, 10]
model.replications=2
model.local-computation=false
inbound.column-names=[LAN_Host_Status,FarEndInterarrivalJitter,OutgoingCallsAttempted,OutgoingCallsFailed,Overruns,PacketsLost,PacketsReceived,PacketsSent,TotalCallTime,Underruns,ActiveConnectionServiceID,WANDevice2TotalBytesReceived,WANDevice2TotalBytesSent,WANPPPConnectionUptime,WANDevice3TotalBytesReceived,WANDevice3TotalBytesSent]
inbound.numeric-columns=[LAN_Host_Status,FarEndInterarrivalJitter,OutgoingCallsAttempted,OutgoingCallsFailed,Overruns,PacketsLost,PacketsReceived,PacketsSent,TotalCallTime,Underruns,ActiveConnectionServiceID,WANDevice2TotalBytesReceived,WANDevice2TotalBytesSent,WANPPPConnectionUptime,WANDevice3TotalBytesReceived,WANDevice3TotalBytesSent]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

$ sudo java -Dconfig.file=./nem-dms/oryx.conf -jar computation/target/oryx-computation-0.5.0-SNAPSHOT.jar
Mar 31, 2014 12:06:29 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-nio-8092"]
Mar 31, 2014 12:06:29 PM org.apache.tomcat.util.net.NioSelectorPool getSharedSelector
INFO: Using a shared selector for servlet write/read
Mar 31, 2014 12:06:29 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Tomcat
Mar 31, 2014 12:06:29 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.52
Mar 31, 2014 12:06:29 PM com.cloudera.oryx.computation.web.ComputationInitListener contextInitialized
INFO: Computation Layer console available at http://bpdevdmsdbs01:8092
Mar 31, 2014 12:06:29 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-nio-8092"]
Mar 31, 2014 12:06:30 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Mar 31, 2014 12:06:30 PM com.cloudera.oryx.common.servcomp.Namespaces <init>
INFO: Namespace prefix: hdfs://nsda3dmsrpt02.internal.bigpond.com:8020
Mar 31, 2014 12:06:32 PM com.cloudera.oryx.computation.PeriodicRunner run
WARNING: Unexpected error in execution
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: "bpdevdmsdbs01/172.18.127.245"; destination host is: "nsda3dmsrpt02.internal.bigpond.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:763)
at org.apache.hadoop.ipc.Client.call(Client.java:1242)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:629)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1545)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:820)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1378)
at com.cloudera.oryx.common.servcomp.Store.list(Store.java:424)
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status
at com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:81)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.buildParsed(RpcPayloadHeaderProtos.java:1094)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.access$1300(RpcPayloadHeaderProtos.java:1028)
at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:986)
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:949)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:847)
Mar 31, 2014 12:06:54 PM com.cloudera.oryx.common.signal.SignalManagerPOSIXImpl$1 handle
INFO: Caught signal INT (2)
Mar 31, 2014 12:06:54 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-nio-8092"]
Mar 31, 2014 12:06:54 PM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service Tomcat
Mar 31, 2014 12:06:54 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [IPC Parameter Sending Thread #0] but has failed to stop it. This is very likely to create a memory leak.
Mar 31, 2014 12:06:54 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-8092"]
Mar 31, 2014 12:06:55 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-nio-8092"]

Re: My first Oryx run failed with many Java errors

Master Collaborator

Your overview is correct. The error at the end comes because you've got a version built for a different version of Hadoop than the one you're using. Are you using CDH5? You need the binaries labeled "hadoop220" at https://github.com/cloudera/oryx/releases/tag/oryx-0.4.1 or build with the "-Phadoop220+" profile.
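
As a rough illustration of the version check, the first line of `hadoop version` output (posted earlier in this thread) can be parsed to decide whether the "hadoop220" build applies. This Python sketch assumes that the "hadoop220" label means Hadoop 2.2.0 or later, which is an inference from the name and not something confirmed here; both function names are hypothetical.

```python
import re


def hadoop_version(version_output: str) -> tuple:
    """Extract (major, minor, patch) from the 'Hadoop x.y.z...' line."""
    match = re.search(r"^Hadoop (\d+)\.(\d+)\.(\d+)", version_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Hadoop x.y.z' line found")
    return tuple(int(part) for part in match.groups())


def needs_hadoop220_build(version_output: str) -> bool:
    # Assumption: the "hadoop220" label covers Hadoop 2.2.0 and later.
    return hadoop_version(version_output) >= (2, 2, 0)


print(needs_hadoop220_build("Hadoop 2.2.0-cdh5.0.0-beta-1"))
```

For the version string reported in the original post, the check returns `True`.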

 

What do you mean about validating a test set? There is only one model per instance to begin with.

No visualization tools, no. This is just the prediction engine.

Re: My first Oryx run failed with many Java errors

Explorer

Hi Sean,

 

I have done a fresh rebuild of Oryx v0.9 from source as per your instructions.

The local computation is working fine but I still have an issue with the HDFS computation.

Please see below for the error and my conf for non-local computation.

 

With local computation, Oryx did not complain about the missing "id-columns" property, but it does for non-local computation.
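
The rule stated in the error message in this post ("id-column(s) value must be specified if model.outliers.compute is enabled") can be mirrored in a small pre-flight check. This is a Python sketch and not Oryx's own validation. It makes two assumptions: that the property behind "id-column(s)" is `inbound.id-columns`, and that outlier computation is enabled unless switched off, since the posted conf never sets it and the error still fires. The function name is hypothetical.

```python
def check_kmeans_outliers(config: dict) -> None:
    """Raise ValueError if outlier computation is on but no ID columns are set."""
    # Assumption: outlier computation is on unless explicitly disabled.
    outliers_enabled = config.get("model.outliers.compute", True)
    # Assumption: "inbound.id-columns" is the key behind "id-column(s)".
    id_columns = config.get("inbound.id-columns", [])
    if outliers_enabled and not id_columns:
        raise ValueError(
            "id-column(s) value must be specified if model.outliers.compute is enabled"
        )


# A conf like the one in this post: neither setting is present, so the check fails.
try:
    check_kmeans_outliers({"model.local-computation": False})
except ValueError as err:
    print(err)
```

Under these assumptions, the check passes once an ID column is named or `model.outliers.compute` is set to false.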

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

$ cat nem-dms/oryx.conf
model=${kmeans-model}
model.instance-dir=/data/db/bdms1p/oryx/example
serving-layer.api.port=8091
computation-layer.api.port=8092
model.k=[1, 5, 10]
model.replications=2
model.local-computation=false
inbound.column-names=[LAN_Host_Status,FarEndInterarrivalJitter,OutgoingCallsAttempted,OutgoingCallsFailed,Overruns,PacketsLost,PacketsReceived,PacketsSent,TotalCallTime,Underruns,ActiveConnectionServiceID,WANDevice2TotalBytesReceived,WANDevice2TotalBytesSent,WANPPPConnectionUptime,WANDevice3TotalBytesReceived,WANDevice3TotalBytesSent]
inbound.numeric-columns=[LAN_Host_Status,FarEndInterarrivalJitter,OutgoingCallsAttempted,OutgoingCallsFailed,Overruns,PacketsLost,PacketsReceived,PacketsSent,TotalCallTime,Underruns,ActiveConnectionServiceID,WANDevice2TotalBytesReceived,WANDevice2TotalBytesSent,WANPPPConnectionUptime,WANDevice3TotalBytesReceived,WANDevice3TotalBytesSent]

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

$ sudo java -Dconfig.file=./nem-dms/oryx.conf -jar $MAHOUT_HOME/computation/target/oryx-computation-0.5.0-SNAPSHOT.jar
Apr 2, 2014 1:14:07 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-nio-8092"]
Apr 2, 2014 1:14:07 PM org.apache.tomcat.util.net.NioSelectorPool getSharedSelector
INFO: Using a shared selector for servlet write/read
Apr 2, 2014 1:14:07 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Tomcat
Apr 2, 2014 1:14:07 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.52
Apr 2, 2014 1:14:07 PM com.cloudera.oryx.computation.web.ComputationInitListener contextInitialized
INFO: Computation Layer console available at http://bpdevdmsdbs01:8092
Apr 2, 2014 1:14:07 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-nio-8092"]
Apr 2, 2014 1:14:08 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Apr 2, 2014 1:14:08 PM com.cloudera.oryx.common.servcomp.Namespaces <init>
INFO: Namespace prefix: hdfs://nsda3dmsrpt02.internal.bigpond.com:8020
Apr 2, 2014 1:14:09 PM com.cloudera.oryx.computation.PeriodicRunner run
INFO: Forcing run -- no complete generations yet
Apr 2, 2014 1:14:10 PM org.apache.hadoop.yarn.client.RMProxy createRMProxy
INFO: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.common.GenerationRunner call
INFO: Starting run for instance /data/db/bdms1p/oryx/example
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: No complete generations
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: Making new generation 1
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.common.GenerationRunner maybeWaitToRun
INFO: Generation 0 may run immediately
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: Running generation 0
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.kmeans.computation.KMeansDistributedGenerationRunner doPre
SEVERE: id-column(s) value must be specified if model.outliers.compute is enabled
Apr 2, 2014 1:14:10 PM com.cloudera.oryx.computation.PeriodicRunner run
SEVERE: Unexpected error in execution
java.lang.IllegalStateException: Invalid k-means configuration: id-column(s) value must be specified if model.outliers.compute is enabled
at com.cloudera.oryx.kmeans.computation.KMeansDistributedGenerationRunner.doPre(KMeansDistributedGenerationRunner.java:53)
at com.cloudera.oryx.computation.common.DistributedGenerationRunner.runSteps(DistributedGenerationRunner.java:88)
at com.cloudera.oryx.computation.common.GenerationRunner.runGeneration(GenerationRunner.java:234)
at com.cloudera.oryx.computation.common.GenerationRunner.call(GenerationRunner.java:110)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Apr 2, 2014 1:15:10 PM com.cloudera.oryx.computation.PeriodicRunner run
INFO: Forcing run -- no complete generations yet
Apr 2, 2014 1:15:10 PM org.apache.hadoop.yarn.client.RMProxy createRMProxy
INFO: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.common.GenerationRunner call
INFO: Starting run for instance /data/db/bdms1p/oryx/example
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: No complete generations
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: No need to make a new generation
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.common.GenerationRunner maybeWaitToRun
INFO: Generation 0 may run immediately
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.common.GenerationRunner runGeneration
INFO: Running generation 0
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.kmeans.computation.KMeansDistributedGenerationRunner doPre
SEVERE: id-column(s) value must be specified if model.outliers.compute is enabled
Apr 2, 2014 1:15:11 PM com.cloudera.oryx.computation.PeriodicRunner run
SEVERE: Unexpected error in execution
java.lang.IllegalStateException: Invalid k-means configuration: id-column(s) value must be specified if model.outliers.compute is enabled
at com.cloudera.oryx.kmeans.computation.KMeansDistributedGenerationRunner.doPre(KMeansDistributedGenerationRunner.java:53)
at com.cloudera.oryx.computation.common.DistributedGenerationRunner.runSteps(DistributedGenerationRunner.java:88)
at com.cloudera.oryx.computation.common.GenerationRunner.runGeneration(GenerationRunner.java:234)
at com.cloudera.oryx.computation.common.GenerationRunner.call(GenerationRunner.java:110)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Apr 2, 2014 1:15:19 PM com.cloudera.oryx.common.signal.SignalManagerPOSIXImpl$1 handle
INFO: Caught signal INT (2)
Apr 2, 2014 1:15:19 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-nio-8092"]
Apr 2, 2014 1:15:19 PM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service Tomcat
Apr 2, 2014 1:15:19 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [IPC Client (81955115) connection to nsda3dmsrpt02.internal.bigpond.com/172.18.126.99:8020 from root] but has failed to stop it. This is very likely to create a memory leak.
Apr 2, 2014 1:15:19 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [IPC Parameter Sending Thread #1] but has failed to stop it. This is very likely to create a memory leak.
Apr 2, 2014 1:15:19 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [IPC Client (81955115) connection to bpdevdmsdbs01/172.18.127.245:8032 from root] but has failed to stop it. This is very likely to create a memory leak.
Apr 2, 2014 1:15:20 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [LeaseRenewer:root@nsda3dmsrpt02.internal.bigpond.com:8020] but has failed to stop it. This is very likely to create a memory leak.
Apr 2, 2014 1:15:20 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-8092"]
Apr 2, 2014 1:15:21 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-nio-8092"]
oracle@bpdevdmsdbs01:BDMSSI1D1 ---- /ora/db002/stg001/BDMSL1D/hadoop/nem-dms/devices/oryx -----
$
