Member since
11-27-2014
32
Posts
0
Kudos Received
0
Solutions
03-31-2015
02:12 AM
1 Kudo
No, there's no built-in way to do this. You could do it manually by creating a job that writes new, merged data and then moves it into place where the old data was.
12-14-2014
01:33 PM
Hm, yes, that should not be how it works. If a value hasn't decayed or been removed, it will stick around forever. If it reaches 0, the user-item pair will be removed. Negative values still have a meaning, so they are not removed just for being negative; it's a question of how small the absolute value is. Yes, these are removed, including users that don't have any items.
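To make the removal rule concrete, here is a minimal sketch of the behavior described above, not Oryx's actual code. The function name `decay_and_prune` and the parameters `decay_factor` and `zero_threshold` are hypothetical names chosen for illustration, and the nested-dict layout is an assumption.

```python
def decay_and_prune(strengths, decay_factor, zero_threshold):
    """Decay every user-item strength, then drop near-zero pairs and empty users.

    All names here are hypothetical; this only illustrates the rule in the post.
    """
    pruned = {}
    for user, items in strengths.items():
        kept = {}
        for item, value in items.items():
            decayed = value * decay_factor
            # Compare the absolute value: a clearly negative strength still
            # carries meaning, so only values close to 0 are dropped.
            if abs(decayed) > zero_threshold:
                kept[item] = decayed
        # A user left with no items is removed as well.
        if kept:
            pruned[user] = kept
    return pruned


strengths = {
    "u1": {"a": 1.0, "b": -0.8, "c": 0.001},
    "u2": {"d": 0.0005},
}
result = decay_and_prune(strengths, decay_factor=0.5, zero_threshold=0.001)
print(result)  # {'u1': {'a': 0.5, 'b': -0.4}}
```

In this example the negative value for item "b" survives because its absolute value is still large, while "u2" disappears entirely once its only item decays below the threshold.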
12-11-2014
10:01 AM
Hi Sean, just to let you know the outcome of this: all of my tests yesterday with Hadoop, with various parameters, on the one-month searches dataset went fine. I will not continue testing this on the whole big dataset, since for the moment it looks like Hadoop is out of the picture: I managed to get hold of a machine with 512GB of RAM, which proved up to the challenge of running Oryx in memory. The dataset is 421MB, with roughly 20 million records, and it took just a few minutes to go through 29 iterations, so well done! It seemed like a big portion of the time was spent writing the model (this is an SSD machine). (I will continue by looking at recommendation response times, how they are affected when I ingest users, etc.) Thank you for the help with the bugs and all the explanations along the way.
11-28-2014
08:27 AM
Hadoop still has config files for sure. They can end up wherever you want them to. I thought they were still at $HADOOP_HOME/conf in the vanilla Hadoop tarball, but I took a look at 2.5.2 and they are in fact at $HADOOP_HOME/etc/hadoop. In any event, if they're at /usr/local/hadoop/etc/hadoop in your installation, then that's what you set $HADOOP_CONF_DIR to: wherever they really are. This is one of Hadoop's standard environment variables. If you're up and running, then this is working. Yes, that sounds like about what you do to install Snappy. They are libs that should be present on the cluster machines.
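If it helps to check where the config files really are, here is a small sketch that looks in the locations mentioned above. The helper name `find_hadoop_conf_dir` is made up for this example, and using the presence of `core-site.xml` as the test for "this is a Hadoop config directory" is an assumption, not something Hadoop itself prescribes.

```python
import os
from pathlib import Path


def find_hadoop_conf_dir(env=None):
    """Return the first candidate directory that contains core-site.xml, or None.

    Hypothetical helper: checks $HADOOP_CONF_DIR first, then the two layouts
    mentioned in the post, $HADOOP_HOME/etc/hadoop and $HADOOP_HOME/conf.
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get("HADOOP_CONF_DIR"):
        candidates.append(Path(env["HADOOP_CONF_DIR"]))
    if env.get("HADOOP_HOME"):
        home = Path(env["HADOOP_HOME"])
        candidates.append(home / "etc" / "hadoop")  # layout seen in 2.5.2
        candidates.append(home / "conf")            # older layout
    for candidate in candidates:
        # Assumption: a real config directory holds core-site.xml.
        if (candidate / "core-site.xml").is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_hadoop_conf_dir())
```

Whatever directory this finds on your machine is the value to export as $HADOOP_CONF_DIR.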