About srowen

alrocks · ‎09-14-2015

Thanks Sean; you're the best!

horatio · ‎08-24-2015

Actually, I don't know the exact reason. I had been stuck on this problem for a few days, even with the firewalls on all machines disabled from the very start. I used to deploy Hadoop, Spark and so on by extracting source tarballs. Fortunately, using an edge node seems to be a good way to access cluster resources.

Saeed.Barghi · ‎07-22-2015

Hi, can you please post the exact order of your imports? I am having the same issue. Thanks in advance.

srowen · ‎06-14-2015

This concerns version 1.x by the way. The config elements in question are here: https://github.com/cloudera/oryx/blob/master/common/src/main/resources/reference.conf#L136

srowen · ‎04-10-2015

This is just Java's locking library; it's not specific to the project. This is a lock that supports many readers at one time but only one writer at a time (and no readers while a writer holds the write lock). You have to acquire the write lock to mutate the shared state, and you also need to acquire the read lock to read it, but doing so won't exclude other readers.
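A minimal sketch of the read/write locking pattern described above, using `java.util.concurrent.locks.ReentrantReadWriteLock` from the JDK. The `SharedCounts` class and its methods are hypothetical, made up for illustration; they are not taken from the project.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example class: shared state guarded by a read/write lock.
final class SharedCounts {

    private final Map<String, Integer> counts = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Readers take the read lock; many readers may hold it at the same time.
    int get(String key) {
        lock.readLock().lock();
        try {
            return counts.getOrDefault(key, 0);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writers take the write lock; it excludes both readers and other writers.
    void increment(String key) {
        lock.writeLock().lock();
        try {
            counts.merge(key, 1, Integer::sum);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Releasing the lock in a `finally` block ensures it is not left held if the guarded code throws.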

srowen · ‎02-28-2015

Have you set it to start a generation based on the amount of input received? That could be triggering the new computation. That said, are you sure it only has part of the input? It's possible the zipped file sizes aren't that comparable.

Yes, you simply don't have enough memory allocated to your JVM. Your system memory doesn't matter if you haven't let the JVM use much of it. This is in local mode, right? You need to use -Xmx to give it more heap.

Yes, it will use different tmp directories for different jobs. That's normal.
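To confirm how much heap the JVM was actually given, as opposed to how much memory the machine has, a small check like the following can help. This is only a sketch: the `HeapReport` class is hypothetical, and the `-Xmx4g` value in the usage note is an arbitrary example, not a recommendation for this workload.

```java
// Hypothetical helper: reports the maximum heap this JVM will try to use.
final class HeapReport {

    // Runtime.maxMemory() returns the limit in bytes; convert to megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```

Running it as `java -Xmx4g HeapReport` and again without `-Xmx` shows the difference between an explicit heap setting and the JVM default.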

tarekabouzeid91 · ‎02-09-2015

Can you please post the specific dependencies you added?

gmortiz17 · ‎01-23-2015

The problem was indeed in the packaging. I fixed it by including the maven-assembly-plugin.

    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>2.5.3</version>
      <configuration>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

srowen · ‎12-14-2014

Hm, yes, that should not be how it works. If it hasn't decayed or been removed, it will stick around forever. If you reach 0, the user-item pair will be removed. Negative values still have a meaning, so they are not removed; it's a question of how small the absolute value is. Yes, these are removed, including users that don't have any items.
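The pruning rule described above can be illustrated roughly as follows. This is not the project's code: the `DecaySketch` class, the nested-map layout, and the `decayFactor` and `zeroThreshold` parameters are all assumptions made for illustration. It only shows the idea that removal depends on the absolute value, so negative strengths survive unless they are close to zero, and that users left with no items are dropped.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch of decaying user-item strengths and pruning near-zero ones.
final class DecaySketch {

    // strengths maps user -> (item -> strength); the maps must be mutable.
    static void decayAndPrune(Map<String, Map<String, Double>> strengths,
                              double decayFactor,
                              double zeroThreshold) {
        Iterator<Map.Entry<String, Map<String, Double>>> users =
            strengths.entrySet().iterator();
        while (users.hasNext()) {
            Map<String, Double> items = users.next().getValue();
            // Shrink every strength toward zero by the (assumed) decay factor.
            items.replaceAll((item, value) -> value * decayFactor);
            // Remove pairs whose absolute value is at or below the threshold;
            // larger negative values are kept because they still carry meaning.
            items.values().removeIf(value -> Math.abs(value) <= zeroThreshold);
            // A user with no remaining items is removed as well.
            if (items.isEmpty()) {
                users.remove();
            }
        }
    }
}
```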

christinan · ‎12-11-2014

Hi Sean, just to let you know the outcome of this: all of my tests yesterday with Hadoop, with various parameters, on the one-month searches dataset, went fine. I will not continue testing this on the whole big dataset, as for the moment it looks like Hadoop is out of the picture, since I managed to get hold of a machine with 512GB of RAM which proved up to the challenge of running Oryx in memory. The dataset is 421MB, with roughly 20 million records, and it took just a few minutes to go through 29 iterations, so well done! It seemed like a big portion of the time was spent writing the model (this is an SSD machine). (I will continue by looking at recommendation response times, how they are affected when I ingest users, etc.) Thank you for the help with the bugs and all the explanations along the way.

Member Since	‎07-29-2013 08:58 AM
Last Visited	‎02-06-2015 02:06 PM
Posts	366
Kudos received	62

Cloudera Community

Re: CDH 5.6

Re: How to use Oryx 1 to detect spam email

Re: Spark program in eclipse

Re: Graphx in latest CDH

Re: Maturity ORYX

Re: Cloudera Spark 1.4 Support in CDH 5.4

Re: Run Oryx on a machine that is not part of the ...

Re: error: object sql is not a member of package o...

Re: Oryx: API method unavailable until model has b...

Re: Retrieve and modify latent feature vectors on ...

Re: Questions on several API end points and model

Re: org.apache.spark.examples.streaming.JavaKafkaW...

Re: java.lang.NoClassDefFoundError: org/json/simpl...

Re: Sliding windows and generations

Re: Oryx ALS: Hadoop computation yields MAP 0.00x,...