About Srini_D

Srini_D · ‎07-20-2015

I have MapReduce code in which I use multipleOutputs.write to write the output using my own file naming convention. I used the following line: multipleOutputs.write(new Text(line), NullWritable.get(), "srini-file"); But I am getting a file named srini-file-m-00000 with 0 records in it. However, when I use context.write, the output comes out properly. Please let me know if there is any important point I missed about MultipleOutputs.

Srini_D · ‎07-19-2015

I am checking some concepts in the Definitive Guide and could not figure out this small piece of logic. When you convert a VIntWritable to a byte array and convert that byte array to a string, there is an additional '8f' at the start of any value. For example: for a normal IntWritable, the hexadecimal representation of 172 is 000000ac; for a VIntWritable, the hexadecimal string for 172 is 8fac; for a VIntWritable, the hexadecimal string for -172 is 87ab. I am confused. Could you please elaborate a bit?
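The prefix byte can be reproduced outside Hadoop. The following is a minimal sketch, assuming the variable-length scheme used by Hadoop's WritableUtils.writeVLong (values from -112 to 127 take a single byte; otherwise a leading byte encodes the sign and the number of bytes that follow). The class name VIntHexSketch and its methods are hypothetical, written only for illustration, and are not part of Hadoop.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not a Hadoop class: re-implements the assumed
// variable-length encoding so the hex strings can be inspected.
public class VIntHexSketch {

    // Encodes a value with the assumed VInt/VLong scheme.
    public static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value >= -112 && value <= 127) {
            out.write((int) value);          // small values fit in one byte
            return out.toByteArray();
        }
        int len = -112;                      // marker base for positive values
        if (value < 0) {
            value ^= -1L;                    // store the one's complement
            len = -120;                      // marker base for negative values
        }
        long tmp = value;
        while (tmp != 0) {                   // one step per payload byte
            tmp >>= 8;
            len--;
        }
        out.write(len);                      // leading byte: sign and length
        int count = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = count; idx != 0; idx--) {
            int shift = (idx - 1) * 8;       // most significant byte first
            out.write((int) ((value >> shift) & 0xFF));
        }
        return out.toByteArray();
    }

    // Renders bytes as a lowercase hex string.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encode(172)));   // prints 8fac
        System.out.println(toHex(encode(-172)));  // prints 87ab
        System.out.println(toHex(encode(100)));   // prints 64
    }
}
```

Under that assumption, 8f is the byte -113, which marks a positive value stored in one following byte (ac), and 87 is the byte -121, which marks a negative value whose one's complement (171, hex ab) is stored in one following byte.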

Srini_D · ‎07-16-2015

Can you try just opening VMware Player and then using the Open option in the top menu? It will open a browser window; go to the VMware file path and select the .vmx file.

Srini_D · ‎07-13-2015

Apologies if I haven't put the question properly. I have a combined file format that returns the file name as the key and the file content as the value. I customized the Mapper class's run method so that it calls the map method only if the file meets specific conditions; let's say it calls the map method only if the file content is larger than 200 KB. If 200 files are sent as input, 200 mappers will start, and if only 100 files meet the criteria and run the map method, we will still have 200 output files in the output folder. Is there a way to ensure that no output file is created if the file does not have any data? Or, the other way around, to create files only if there is data for them?

Srini_D · ‎10-29-2014

The dataset has 100k records, but they correspond to 943 users and 1682 items. It creates vectors from the data and performs the similarity measure using them. Yes, it's minute considering how far distributed programming can scale, but I am interested to know how to alter the behaviour.

Srini_D · ‎10-29-2014

Oh, is that so? But how can we alter this behaviour to get recommendations for all users?

Srini_D · ‎10-29-2014

Hi, I ran the distributed recommender using the RecommenderJob class and provided MovieLens data as input. The data has 943 users, but the recommendations come out for only somewhere around 500+ users. Am I making a mistake anywhere? I provided the CSV file with item id, user id and preference value as input.

Srini_D · ‎10-29-2014

What is the distributed version of UserBasedRecommender? I have checked the list of distributed recommender jobs for collaborative filtering. I could see the RecommenderJob class, which is in the org.apache.mahout.cf.taste.hadoop.item package. It expects the data as user id, item id, preference value and takes the similarity class as input as well. Does it work for both user-based and item-based recommendations, depending on the similarity class input?

Srini_D · ‎10-22-2014

Thanks Sean. I would like one more suggestion from you. I want to provide recommendations based on the user profile and item data, while also considering various features. For example, suppose a user purchases and rates a book that is in French and of the thriller genre. Out of the recommendations I get, I need to boost French and thriller books first. I am considering a few options: one is clustering-based recommendation, which clusters data according to genre, language and so on; the second is to plug in a search engine after the recommendations. I will be glad if you can suggest a way ahead. Also, does the ALS factorizer on implicit data perform recommendation based on ratings and user features as well?

Srini_D · ‎10-22-2014

I have worked with IDRescorer and the recommender in standalone mode. But is there a way to achieve a similar process in distributed mode as well? The similarity classes in distributed mode work in a different manner, as every one of them extends VectorSimilarityMeasure and there is no recommend method as such.

Member Since	‎09-17-2013 08:36 PM
Last Visited	‎09-18-2019 01:00 PM
Posts	63
Kudos received	5

Cloudera Community

MultipleOutputs writing zero records to output fil...

Why HexaDecimal String of any VIntWritable has 8f ...

Re: Unable to start Cloudera QuickStart VM with VM...

how to suppress mapper output files if the output ...

Re: Distributed Recommender not giving recommendat...

Re: Distributed Recommender not giving recommendat...

Distributed Recommender not giving recommendations...

Distributed recommenderjob for user-based recommen...

Re: Mahout: How to use IDRescorer in Distributed ...

Mahout: How to use IDRescorer in Distributed mode...