Support Questions

learninghuman · 03-15-2016

Environment: HDP 2.3 Sandbox

Problem: I have created a table in Hive with just 2 columns. Now I want to read it in my MR code using the HCatalog integration. The MR job fails to read the table from the MySQL metastore. For some reason it uses Derby instead, and hence it fails with a "table not found" message.

Job Client code:

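import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

// HCatalogMapper and HCatalogReducer are the job's own classes (not shown here).
public class HCatalogMRJob extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();
        String inputTableName = args[0];
        String outputTableName = args[1];
        String dbName = null; // null selects the "default" database
        // Job.getInstance replaces the deprecated Job constructor.
        // The job works on its own copy of conf.
        Job job = Job.getInstance(conf, "HCatalogMRJob");
        HCatInputFormat.setInput(job, dbName, inputTableName);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(HCatalogMRJob.class);
        job.setMapperClass(HCatalogMapper.class);
        job.setReducerClass(HCatalogReducer.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
        // Read the schema from the job's configuration, which is where
        // setOutput stored the output table information.
        HCatSchema s = HCatOutputFormat.getTableSchema(job.getConfiguration());
        System.err.println("INFO: output schema explicitly set for writing:" + s);
        HCatOutputFormat.setSchema(job, s);
        job.setOutputFormatClass(HCatOutputFormat.class);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new HCatalogMRJob(), args);
        System.exit(exitCode);
    }
}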

Job Run Command:

hadoop jar mr-hcat.jar input_table out_table

Before running this command, I added the necessary HCatalog and Hive JARs to the classpath using the HADOOP_CLASSPATH variable.

Question:

How do I make the job pick up hive-site.xml correctly?

I tried adding it to the classpath through the same HADOOP_CLASSPATH variable mentioned above, but the job still fails.
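One way to narrow this kind of problem down is to check whether hive-site.xml is visible as a classpath resource at all, since that is how Hive-style configuration files are normally located. The following is only a minimal diagnostic sketch using the plain JDK: the class name HiveSiteCheck and the method locate are made up for illustration and are not part of Hadoop or HCatalog.

```java
import java.net.URL;

// Hypothetical helper (not part of Hadoop/HCatalog): reports where a
// resource such as hive-site.xml would be loaded from, if anywhere.
public class HiveSiteCheck {

    // Returns the URL of the named resource on the classpath,
    // or null if the resource is not visible to the class loader.
    static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hive-site.xml");
        if (url == null) {
            System.out.println("hive-site.xml is NOT visible on the classpath");
        } else {
            System.out.println("hive-site.xml found at " + url);
        }
        // Print the effective classpath to see which entries the JVM received.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

The same locate call could also be dropped temporarily into the run method of the job client, so that the check runs with exactly the classpath the job uses.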

learninghuman · 03-16-2016

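Issue resolved by putting the directory /etc/hive/conf on the classpath instead of /etc/hive/conf/*. A classpath wildcard expands only to the JAR files in a directory, so with /etc/hive/conf/* the hive-site.xml file was never visible and the job fell back to the default embedded Derby metastore.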
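The difference between the two classpath forms comes from the JVM's class path wildcard rule: an entry ending in /* stands for the .jar and .JAR files in that directory and nothing else, while a plain directory entry makes every file in it loadable as a resource. The sketch below is a simplified model of that rule for illustration only, not the JVM's or Hadoop's actual code; the class name ClasspathWildcardDemo and its methods are hypothetical, and it assumes Unix-style paths.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of how a class path entry is interpreted.
public class ClasspathWildcardDemo {

    // A wildcard entry only picks up files whose names end in .jar or .JAR.
    static boolean matchesWildcard(String fileName) {
        return fileName.endsWith(".jar") || fileName.endsWith(".JAR");
    }

    // Expands one class path entry: "dir/*" becomes the JAR files in dir,
    // any other entry (a directory or a single JAR) is kept as it is.
    static List<String> expand(String entry) {
        List<String> result = new ArrayList<>();
        if (entry.endsWith("/*")) {
            File dir = new File(entry.substring(0, entry.length() - 2));
            File[] files = dir.listFiles();
            if (files != null) {
                for (File f : files) {
                    if (f.isFile() && matchesWildcard(f.getName())) {
                        result.add(f.getPath());
                    }
                }
            }
        } else {
            result.add(entry);
        }
        return result;
    }

    public static void main(String[] args) {
        String confDir = args.length > 0 ? args[0] : "/etc/hive/conf";
        // The directory itself stays on the class path, so hive-site.xml is reachable.
        System.out.println("directory entry: " + expand(confDir));
        // Only JAR files survive the wildcard; hive-site.xml is never included.
        System.out.println("wildcard entry:  " + expand(confDir + "/*"));
    }
}
```

Under this model, a configuration directory that holds only XML and properties files contributes nothing when referenced with the wildcard, which matches the behaviour described in this thread.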

MapReduce and HCatalog integration fails to use MySQL metastore