Member since
07-18-2014
74
Posts
0
Kudos Received
0
Solutions
11-03-2014
10:26 AM
Sean, the maximum length of a GET request is implementation dependent. So my more specific question is: on the server side of Oryx, is there such a limitation? I guess so, but I want to confirm. Thanks.
11-03-2014
08:57 AM
Sean, thanks for your reply. An additional (and related) question about scoring: I tried to understand the scores returned for a user ID (that is, from the /recommend/ endpoint) without re-scoring. I get a list of (item ID, predicted score) pairs. I think the score is computed from XY^T, and the range is about 0 to 1. My question is how we can interpret these values (in addition to the ranking), assuming good training (tuned training parameters). Does a score lower than 0.5 imply a low probability that the item matches the user's preference, so that a "random" recommendation could just as well be used when the score is below 0.5?
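To make the XY^T reading above concrete, here is a minimal sketch that assumes each score is simply the dot product of a user's latent vector (a row of X) and an item's latent vector (a row of Y). The vectors, item IDs, and the `recommend` helper are made up for illustration; this is not Oryx code, and whether Oryx computes scores exactly this way is an assumption taken from the post.

```python
# Hypothetical toy factors; real values would come from a trained ALS model.
user_vector = [0.5, 0.2, 0.1]
item_vectors = {
    "item-A": [0.9, 0.1, 0.3],
    "item-B": [0.1, 0.2, 0.0],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(user_vec, items, how_many=10):
    """Return (item_id, score) pairs sorted by descending score."""
    scored = [(item_id, dot(user_vec, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:how_many]


if __name__ == "__main__":
    for item_id, score in recommend(user_vector, item_vectors):
        print(item_id, round(score, 3))
```

Under this reading the score is an unnormalized dot product, so nothing in the arithmetic itself forces it into the 0 to 1 range or makes 0.5 a special cut-off.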
11-02-2014
08:28 AM
Hi Sean, on the RecommendServlet page, we noticed that it responds to a GET request to /recommend/[userID](?howMany=n)(&offset=o)(&considerKnownItems=true|false)(&rescorerParams=...)... We are designing rescorer classes to plug in. For example, we would pass a list of item IDs in rescorerParams, so that the recommendation only considers the passed item IDs as candidate results. The problem is that the list of item IDs could be very long and could easily hit the maximum length of an HTTP GET request. Are there any suggestions? Is there an HTTP POST servlet for /recommend/? Thanks.
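To illustrate the concern, here is a rough sketch that builds such a URL and checks its length. The comma-separated encoding of item IDs in rescorerParams, the 8000-character limit, and the helper names are assumptions made for illustration; the real limit depends on the server and any proxies in between.

```python
from urllib.parse import quote

# Assumed limit for illustration only; actual limits vary by server and proxy.
ASSUMED_MAX_URL_LENGTH = 8000


def build_recommend_url(base_url, user_id, item_ids, how_many=10):
    """Build a /recommend/ URL, passing item IDs as one comma-separated value.

    The comma-separated format for rescorerParams is a hypothetical convention.
    """
    params = quote(",".join(item_ids), safe="")
    user = quote(user_id, safe="")
    return f"{base_url}/recommend/{user}?howMany={how_many}&rescorerParams={params}"


def fits_in_get(url, max_length=ASSUMED_MAX_URL_LENGTH):
    """Return True if the URL is no longer than the assumed limit."""
    return len(url) <= max_length


if __name__ == "__main__":
    many_ids = [f"item-{i}" for i in range(5000)]
    url = build_recommend_url("http://localhost:8091", "user-1", many_ids)
    print(len(url), fits_in_get(url))
```

With a few thousand item IDs the URL quickly exceeds typical limits, which is why a POST variant or a server-side lookup of the candidate list (for example, passing a short list name instead of the IDs themselves) would be needed.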
10-25-2014
04:19 PM
Sean, we are investigating ways to retrieve and modify the latent feature vectors on the fly. One practical case is resolving the "cold start" problem. For example, given a new item without any user-item associations, we want to approximate the item's latent vector. One idea is to take the new item and compare its similarity (attribute-based similarity, not latent feature similarity) to other items, then use the latent vectors of the k nearest items to approximate the new item's latent vector. Basically, it's a k-NN based approach. There could be other approaches. Anyway, let's say the new item's latent vector is estimated somehow. Then we want to insert this new entry into the existing item latent vectors (the Y matrix). I understand there is no API endpoint to do that. However, is it feasible to work around this, say, by using the Java jars directly (computation/serving jar) to get Y, modify the Y matrix, and save it back? Any suggestions are welcome. Thanks. Jason
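A minimal sketch of the k-NN idea described above, assuming cosine similarity on attribute vectors and a similarity-weighted average of the neighbours' latent vectors. All names and data here are hypothetical, and this does not touch Oryx itself; it only shows how the estimate could be computed before any attempt to insert it into Y.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two attribute vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def approximate_latent_vector(new_attrs, item_attrs, item_latent, k=2):
    """Estimate a latent vector for a new item from its k most similar items.

    item_attrs:  item ID -> attribute vector (used only to find neighbours)
    item_latent: item ID -> latent vector (a row of Y)
    """
    sims = [(cosine_similarity(new_attrs, attrs), item_id)
            for item_id, attrs in item_attrs.items()]
    # Keep the k most similar items, ignoring non-positive similarities.
    neighbours = [(s, i) for s, i in sorted(sims, reverse=True)[:k] if s > 0]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        raise ValueError("no similar items found for the new item")
    dim = len(next(iter(item_latent.values())))
    estimate = [0.0] * dim
    for sim, item_id in neighbours:
        weight = sim / total  # weights sum to 1
        for idx, value in enumerate(item_latent[item_id]):
            estimate[idx] += weight * value
    return estimate


if __name__ == "__main__":
    attrs = {"a": [1, 0], "b": [1, 0], "c": [0, 1]}
    latent = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [5.0, 5.0]}
    print(approximate_latent_vector([1, 0], attrs, latent, k=2))
```

The weighted average keeps the estimate inside the range spanned by the neighbours' vectors, which seems a reasonable default when nothing else is known about the new item.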
10-10-2014
08:45 AM
Sean, some follow-up questions on model generations. Let's say we have data arriving in the system at about 100,000 associations (user, item, preference) every hour. Based on my reading of the documentation and code, my understanding of the data flow is:
(1) As soon as the data arrives, it is temporarily put into /tmp/Oryx...
(2) Meanwhile, the serving layer approximates recommendations based on this new data.
(3) Once the accumulated data reaches the "writes-between-upload" amount, it is moved to inbound.
(4) If there is new data in inbound, the computation layer is triggered to create a new generation.
Questions:
(a) Please correct the data flow above if it is wrong.
(b) About (4), I think it is related to the time-threshold and data-threshold settings in the configuration file. However, these two parameters are not clearly explained in the documentation. Can you explain more? For example, how can I set the computation to rebuild the model daily?
(c) About the generations created by the computation layer: I think each generation uses a full snapshot of the data at the moment the ALS computation is triggered. So, basically, the data used for generation 00000 is a subset of the data used for generation 00001 (not considering data removal). Is that right?
Thanks.
10-04-2014
09:30 AM
Is this controlled by "writes-between-upload" in the file reference.conf? Also, some additional questions related to the model in general: (1) Is it possible to delete a model (say, from the API) without restarting Oryx? (2) Related to (1), is it possible to force a model rebuild using different parameters on the fly (say, through the API or a Java call) without restarting Oryx? Thanks. Jason
10-03-2014
05:13 PM
Hi Sean, I tried to import a CSV file (150K of data with 3000 entries) from the Oryx UI at /ingest (browse to a file and upload). I can see the data saved into /tmp/Oryx (as a .csv.gz file). However, in the server console, the upload of the data never seems to happen. The console messages are as follows:
---------------------------------
Fri Oct 03 16:47:53 PDT 2014 INFO Initializing ProtocolHandler ["http-nio-8091"]
Fri Oct 03 16:47:53 PDT 2014 INFO Using a shared selector for servlet write/read
Fri Oct 03 16:47:53 PDT 2014 INFO Starting service Tomcat
Fri Oct 03 16:47:53 PDT 2014 INFO Starting Servlet Engine: Apache Tomcat/7.0.55
Fri Oct 03 16:47:53 PDT 2014 INFO Serving Layer console available at http://192.168.2.6:8091
Fri Oct 03 16:47:53 PDT 2014 WARNING Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Fri Oct 03 16:47:53 PDT 2014 INFO Namespace prefix: file:
Fri Oct 03 16:47:53 PDT 2014 INFO No available generation, nothing to do
Fri Oct 03 16:47:53 PDT 2014 INFO No available generation, nothing to do
Fri Oct 03 16:47:53 PDT 2014 INFO Starting ProtocolHandler ["http-nio-8091"]
Fri Oct 03 16:54:53 PDT 2014 INFO No available generation, nothing to do
Fri Oct 03 17:01:53 PDT 2014 INFO No available generation, nothing to do
Fri Oct 03 17:08:53 PDT 2014 INFO No available generation, nothing to do
---------------------------------
When I press Ctrl+C to terminate the process, I then see the data upload messages (below). Any idea why?
^CFri Oct 03 17:10:51 PDT 2014 INFO Caught signal INT (2)
Fri Oct 03 17:10:51 PDT 2014 INFO Pausing ProtocolHandler ["http-nio-8091"]
Fri Oct 03 17:10:52 PDT 2014 INFO Stopping service Tomcat
Fri Oct 03 17:10:52 PDT 2014 INFO Uploading /tmp/Oryx/oryx-append-2428716983456732298.csv.gz to /Users/............/00000/inbound/oryx-append-2428716983456732298.csv.gz
Fri Oct 03 17:10:52 PDT 2014 INFO Uploaded to /....../00000/inbound/oryx-append-2428716983456732298.csv.gz
Fri Oct 03 17:10:52 PDT 2014 INFO Stopping ProtocolHandler ["http-nio-8091"]
Fri Oct 03 17:10:53 PDT 2014 INFO Destroying ProtocolHandler ["http-nio-8091"]
---------------------------------
Labels: Apache Hadoop
09-19-2014
10:06 PM
(1) I tried to invoke "/refresh". The HTTP status code is 200, but I have not noticed any change in the Oryx console messages reflecting a rebuild of the model matrices. Shouldn't I see some ALS iteration messages in the console? How do I know the model is refreshing? Can a refresh take new ALS parameters to rebuild the model matrices? (2) Is there a way to clear all training data and reset the model?
09-17-2014
08:37 AM
Great news. Thanks a lot for your swift response! Yes, I would love to give it a try and will let you know (it may take a couple of days). Thanks again!