Support Questions


Running MapReduce on an HBase-exported table throws "Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'"

Rising Star

I have taken an HBase table backup using the HBase Export utility tool.

All the data was transferred into HDFS correctly, in SequenceFile format.

Now I want to run a MapReduce job to read the key/value pairs from the exported file, but I am getting the exception below:

 
    java.lang.Exception: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
    Caused by: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1964)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:478)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:671)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)

Here is my driver code

    package SEQ;

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class SeqDriver extends Configured implements Tool {

        public static void main(String[] args) throws Exception {
            int exitCode = ToolRunner.run(new SeqDriver(), args);
            System.exit(exitCode);
        }

        public int run(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.printf("Usage: %s needs two arguments   files\n",
                        getClass().getSimpleName());
                return -1;
            }
            String outputPath = args[1];

            FileSystem hfs = FileSystem.get(getConf());
            Job job = new Job();
            job.setJarByClass(SeqDriver.class);
            job.setJobName("SequenceFileReader");

            HDFSUtil.removeHdfsSubDirIfExists(hfs, new Path(outputPath), true);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.setOutputKeyClass(ImmutableBytesWritable.class);
            job.setOutputValueClass(Result.class);

            job.setInputFormatClass(SequenceFileInputFormat.class);

            job.setMapperClass(MySeqMapper.class);

            job.setNumReduceTasks(0);
            int returnValue = job.waitForCompletion(true) ? 0 : 1;

            if (job.isSuccessful()) {
                System.out.println("Job was successful");
            } else {
                System.out.println("Job was not successful");
            }

            return returnValue;
        }
    }

Here is my mapper code

    package SEQ;

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MySeqMapper extends Mapper<ImmutableBytesWritable, Result, Text, Text> {

        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
        }
    }
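
The goal is eventually a map body along these lines. This is just a sketch of what I want to end up with; the MySeqDumpMapper name and the "family:qualifier=value" text output are chosen only for illustration:

    package SEQ;

    import java.io.IOException;

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Illustrative mapper: emits "rowkey" -> "family:qualifier=value" for every
    // cell of each exported Result.
    public class MySeqDumpMapper extends Mapper<ImmutableBytesWritable, Result, Text, Text> {

        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            String rowKey = Bytes.toString(row.get());
            for (Cell cell : value.rawCells()) {
                String column = Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
                        + Bytes.toString(CellUtil.cloneQualifier(cell));
                String cellValue = Bytes.toString(CellUtil.cloneValue(cell));
                context.write(new Text(rowKey), new Text(column + "=" + cellValue));
            }
        }
    }
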
1 ACCEPTED SOLUTION

Rising Star

So I will answer my own question. Here is what was needed to make it work: the Export utility writes HBase Result objects as the values of the SequenceFile, and Hadoop does not know how to deserialize them on its own, so we need to help it. Set the io.serializations property on the configuration the job uses (hbaseConf below) to include the HBase serializations:

    hbaseConf.setStrings("io.serializations", new String[]{
            hbaseConf.get("io.serializations"),
            MutationSerialization.class.getName(),
            ResultSerialization.class.getName()});
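
Here hbaseConf is the Hadoop Configuration the Job is built from. As a rough sketch only (adjust it to your own driver), the same fix wired into the run() method of a Tool/Configured driver like the one above could look like this, using fully qualified class names as strings so nothing extra has to be imported:

    // Sketch: register the HBase serializations on the configuration the Job is
    // created from, so the SequenceFile reader can deserialize the Result values.
    Configuration conf = getConf();
    conf.setStrings("io.serializations",
            conf.get("io.serializations"),
            "org.apache.hadoop.hbase.mapreduce.MutationSerialization",
            "org.apache.hadoop.hbase.mapreduce.ResultSerialization");

    Job job = Job.getInstance(conf, "SequenceFileReader");

The important part is that the property is set on the same Configuration object the Job is created from; creating the job with a plain new Job() gives it a fresh configuration and the setting is lost.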


4 REPLIES

Master Mentor

@sudarshan kumar

Can you please share the whole "io.serializations" property configuration from "core-site.xml"? It looks like it is not set properly.

- I remember one such issue where the "io.serializations" property definition had a <final>true</final> in it, which was causing the problem. Please check whether you have a similar issue, and try removing the <final>true</final> line if you find it inside your "io.serializations" property definition.

Or the value of this property might not be set properly.
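
You can also print the effective value your job actually sees (merged from core-default.xml and core-site.xml); for example, a quick one-line check inside the driver's run() method:

    // Quick check: print the io.serializations value the job configuration resolves to.
    System.out.println("io.serializations = " + getConf().get("io.serializations"));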


Rising Star

Hi Jay,

Thanks for responding.

I don't have such a property in core-site.xml.

Here are the details as well:

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://quickstart.cloudera:8020</value>
  </property>
  <!-- OOZIE proxy user setting -->
  <property>
    <name>hadoop.proxyuser.oozie.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.oozie.groups</name>
    <value>*</value>
  </property>
  <!-- HTTPFS proxy user setting -->
  <property>
    <name>hadoop.proxyuser.httpfs.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.httpfs.groups</name>
    <value>*</value>
  </property>
  <!-- Llama proxy user setting -->
  <property>
    <name>hadoop.proxyuser.llama.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.llama.groups</name>
    <value>*</value>
  </property>
  <!-- Hue proxy user setting -->
  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>
</configuration>


New Contributor

Can you please share sample Java code for reading a Hadoop sequence file that has hbase.io.ImmutableBytesWritable as the key class and hbase.client.Result as the value class?

I need to read it from an input stream that can read from HDFS, and I would like to write it into an output stream. My input stream shows that the file can be read from HDFS, but I cannot parse it, so I need to build a parser for it.
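
Building on the accepted answer above, a minimal self-contained sketch of such a reader could look like the following. The ExportedTableReader class name and the path handling are placeholders, and it assumes the HBase client jars plus your cluster's core-site.xml/hdfs-site.xml are on the classpath; it registers ResultSerialization and then iterates one exported part file with SequenceFile.Reader:

    package SEQ;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.SequenceFile;

    public class ExportedTableReader {

        public static void main(String[] args) throws Exception {
            // args[0] is the HDFS path of one exported part file
            // (the Export output directory plus the part file name).
            Configuration conf = new Configuration();

            // Without this, SequenceFile.Reader fails with
            // "Could not find a deserializer for the Value class ... Result".
            conf.setStrings("io.serializations",
                    conf.get("io.serializations"),
                    "org.apache.hadoop.hbase.mapreduce.ResultSerialization");

            try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
                    SequenceFile.Reader.file(new Path(args[0])))) {
                Object key = null;
                Object value = null;
                // next()/getCurrentValue() use the serializations registered above.
                while ((key = reader.next(key)) != null) {
                    value = reader.getCurrentValue(value);
                    ImmutableBytesWritable rowKey = (ImmutableBytesWritable) key;
                    Result result = (Result) value;
                    System.out.println("row=" + Bytes.toString(rowKey.get()));
                    for (Cell cell : result.rawCells()) {
                        System.out.println("  "
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + " = "
                                + Bytes.toString(CellUtil.cloneValue(cell)));
                    }
                }
            }
        }
    }

Writing to an output stream instead of the console is then just a matter of replacing the println calls with writes to your own OutputStream.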