New Contributor
Posts: 2
Registered: ‎01-17-2014

Distributed cache files retrieval

Hi there,

  We are using Hadoop 2.0.0-cdh4.4.0 at our company, and I am trying to use the distributed cache feature. Retrieving files from the cache in a mapper or reducer involves one of these methods:

  context.getLocalCacheFiles();
  context.getCacheFiles();
  DistributedCache.getLocalCacheFiles();
  DistributedCache.getCacheFiles();

  Each of them returns a Path[] or a URI[]. Sometimes you need to store more than one file, and then you need to be able to tell which one is which. Example:

  job.addCacheFiles(new Path("/dir/setA.txt"));
  job.addCacheFiles(new Path("/dir/setB.txt"));
 
  URI[] uris = context.getCacheFiles();
  //uris[0] - setA or setB?
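
  One possible way to avoid relying on the array order is to look each entry up by name instead of by index. The sketch below is only an illustration and uses nothing but java.net.URI, so it does not call any Hadoop API. The class CacheFileIndex and its methods nameOf and byName are hypothetical names made up for this example. It assumes that a file is identified either by a URI fragment (such as "#setA", if one was given when the file was added) or, failing that, by the last segment of its path. Whether the fragment is preserved by the cache on this particular CDH version is an assumption that would need checking.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): indexes cache URIs by name
// so that callers do not depend on the order of the returned array.
final class CacheFileIndex {

    // Name of a cache entry: the URI fragment if present,
    // otherwise the last segment of the path.
    static String nameOf(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        if (path == null) {
            // Opaque URIs have no path; fall back to the full string.
            return uri.toString();
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Builds a name -> URI map; a null array yields an empty map.
    static Map<String, URI> byName(URI[] uris) {
        Map<String, URI> index = new LinkedHashMap<>();
        if (uris != null) {
            for (URI uri : uris) {
                index.put(nameOf(uri), uri);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Stand-in for the array returned by context.getCacheFiles();
        // the order is deliberately different from the order of addition.
        URI[] uris = {
            URI.create("/dir/setB.txt"),
            URI.create("/dir/setA.txt#setA")
        };
        Map<String, URI> index = byName(uris);
        System.out.println(index.get("setA"));      // entry found by fragment
        System.out.println(index.get("setB.txt"));  // entry found by file name
    }
}
```

  In a mapper or reducer, the array passed to byName would be whatever context.getCacheFiles() returns, and the lookup key would be the file name or fragment chosen when the file was added.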

  Thank you in advance!

  Jakub

Edit: I have also found that job.addCacheFiles, which is the only non-deprecated method for adding files to the cache, gives me a NoSuchMethodException on the server, even though the server's CDH version and the Maven dependencies are both 2.0.0-cdh4.4.0 and Maven builds the project without errors. For now I am going to read the files directly from HDFS.

Posts: 416
Topics: 51
Kudos: 86
Solutions: 49
Registered: ‎06-26-2013

Re: Distributed cache files retrieval

I have moved this thread to the MapReduce board in the hope that someone here can assist you.

New Contributor
Posts: 2
Registered: ‎01-17-2014

Re: Distributed cache files retrieval

Thank you. I hope there is someone using this feature and willing to help at the same time :)
