Member since
09-17-2013
63
Posts
5
Kudos Received
0
Solutions
07-20-2015
01:36 PM
I have MapReduce code in which I use multipleOutputs.write to write the output under my own file naming convention. I used the following line: multipleOutputs.write(new Text(line), NullWritable.get(), "srini-file"); But I am getting a file named srini-file-m-00000 with 0 records in it. However, when I use context.write, the output comes out properly. Please let me know if there is any important point I missed with MultipleOutputs.
Labels: MapReduce
07-19-2015
03:38 AM
I am checking some concepts in the Definitive Guide and could not figure out this small piece of logic. When you convert a VIntWritable to a byte array and convert that byte array to a hex string, there is an additional '8f' at the start of the value. For example: for a normal IntWritable, the hexadecimal representation of 172 is 000000ac; for VIntWritable, the hexadecimal string for 172 is 8fac; for VIntWritable, the hexadecimal string for -172 is 87ab. I am confused. Could you please elaborate a bit?
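The leading byte in these strings looks like a marker that carries the sign and the number of payload bytes, with small values (-112 to 127) stored in a single byte and no marker at all. The following is a minimal standalone sketch of that scheme, written from my understanding of how Hadoop's variable-length encoding behaves; the class name VIntSketch and its helper methods are made up for illustration and are not part of Hadoop.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper class (not a Hadoop API): re-implements the
// variable-length integer layout as I understand it, for illustration only.
public class VIntSketch {

    // Encodes a value as a marker byte followed by big-endian payload bytes.
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value >= -112 && value <= 127) {
            out.write((int) value);      // small values fit in one byte, no marker
            return out.toByteArray();
        }
        int len = -112;                  // marker base for positive values
        if (value < 0) {
            value = ~value;              // store the one's complement of negatives
            len = -120;                  // marker base for negative values
        }
        for (long tmp = value; tmp != 0; tmp >>= 8) {
            len--;                       // one step down per payload byte
        }
        out.write(len);                  // first byte: sign and length marker
        int count = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = count; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.write((int) ((value >> shift) & 0xFF));
        }
        return out.toByteArray();
    }

    // Renders bytes as a lowercase hex string.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encode(172)));   // prints 8fac
        System.out.println(toHex(encode(-172)));  // prints 87ab
    }
}
```

Under this reading, 172 needs one payload byte, so the marker is -113 (0x8f) followed by 0xac. For -172 the one's complement is 171 (0xab), which also needs one byte, so the marker is -121 (0x87) followed by 0xab.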
07-16-2015
06:54 AM
Can you try just opening the VM Player and using the Open option in the top menu? It will open a file browser window; go to the VMware file path and select the .vmx file.
07-13-2015
10:55 PM
Apologies if I haven't put the question properly. I have a combined file format, which returns the file name as key and the file content as value. I customized the Mapper class's run method so that it runs the map method only if the file meets specific conditions; let's say it calls the map method only if the file content is greater than 200 KB. If 200 files are sent as input, 200 mappers will start, and if only 100 files meet the criteria and run the map method, we will still have 200 output files in the output folder. Is there a way to ensure that no output file is created if the file does not have any data? Or, the other way around, to create files only when there is data for them?
10-29-2014
11:31 AM
The dataset has 100k records, but they correspond to 943 users and 1682 items. The job creates vectors from the data and performs the similarity measure using them. Yes, it is minute considering how far distributed programming can scale, but I am interested to know how to alter the behaviour.
10-29-2014
10:15 AM
Oh, is that so? But how can we alter this behaviour to get recommendations for all users?
10-29-2014
07:32 AM
Hi, I ran the distributed recommender using the RecommenderJob class and provided MovieLens data as input. The data has 943 users, but recommendations come out for only somewhere around 500+ users. Am I making a mistake anywhere? I provided the CSV file with item id, user id and preference value as input.
10-29-2014
06:51 AM
What is the distributed version of UserBasedRecommender? I have checked the list of distributed recommender jobs for collaborative filtering. I could see the RecommenderJob class, which is in the org.apache.mahout.cf.taste.hadoop.item package. It expects the data as user id, item id, preference value and takes the similarity class as input as well. Does this work for both user-based and item-based recommendations, depending on the similarity class input?
Labels: Apache Hadoop
10-22-2014
04:45 AM
Thanks Sean. I want one more suggestion from you. I want to provide recommendations based on user profile and item data, considering various features. For example, suppose a user purchases and rates a book that is in French and in the thriller genre. Then, out of the recommendations I get, I need to boost French and thriller books first. I am thinking of a few options: one is clustering-based recommendation, which clusters data according to genre, language and so on; the second is to plug in a search engine after the recommendations. I will be glad if you can suggest a way ahead. Also, does the ALS factorizer on implicit data perform recommendation based on ratings and user features as well?
10-22-2014
01:06 AM
I have worked with IDRescorer and the recommender in standalone mode. But is there a way to achieve a similar process in distributed mode as well? The similarity classes in distributed mode work in a different manner: every one of them extends VectorSimilarityMeasure, and there is no recommend method as such.