Created 03-15-2016 09:26 PM
Environment: HDP 2.3 Sandbox
Problem: I have created a table in Hive with just two columns, and I want to read it in my MapReduce code using the HCatalog integration. The MR job fails to read the table from the MySQL metastore. For some reason it uses Derby instead, and so it fails with a "table not found" message.
Job Client code:
public class HCatalogMRJob extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();
        String inputTableName = args[0];
        String outputTableName = args[1];
        String dbName = null;

        Job job = new Job(conf, "HCatalogMRJob");
        HCatInputFormat.setInput(job, dbName, inputTableName);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(HCatalogMRJob.class);
        job.setMapperClass(HCatalogMapper.class);
        job.setReducerClass(HCatalogReducer.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);

        HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
        HCatSchema s = HCatOutputFormat.getTableSchema(conf);
        System.err.println("INFO: output schema explicitly set for writing:" + s);
        HCatOutputFormat.setSchema(job, s);
        job.setOutputFormatClass(HCatOutputFormat.class);

        return (job.waitForCompletion(true) ? 0 : 1);
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new HCatalogMRJob(), args);
        System.exit(exitCode);
    }
}
Job Run Command:
hadoop jar mr-hcat.jar input_table out_table
Before running this command, I added the necessary HCatalog and Hive jars to the classpath using the HADOOP_CLASSPATH variable.
Question:
Now, how do I make the job use hive-site.xml correctly?
I tried adding it to the classpath using the same HADOOP_CLASSPATH variable as mentioned above, but it still fails.
Created 03-16-2016 08:26 PM
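Issue resolved by adding the directory /etc/hive/conf to the classpath instead of /etc/hive/conf/*. A classpath entry ending in /* expands only to the jar files in that directory, so hive-site.xml was never picked up. With the directory itself on the classpath, the file is found as a resource.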
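For anyone hitting the same thing, a quick way to confirm whether hive-site.xml is actually visible is to ask the class loader for it. The following is only a minimal sketch: the class name HiveSiteCheck and its helper methods are made up for illustration and are not part of Hadoop or HCatalog. It assumes you run it with the same classpath the job client uses.

```java
import java.net.URL;

// Hypothetical helper (not part of Hadoop/Hive): reports whether a
// resource such as hive-site.xml can be loaded from the classpath.
class HiveSiteCheck {

    // Returns the URL the resource was loaded from, or null if it is not visible.
    public static URL locate(String resource) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = HiveSiteCheck.class.getClassLoader();
        }
        return cl.getResource(resource);
    }

    // Builds a one-line, human-readable result for the given resource name.
    public static String describe(String resource) {
        URL url = locate(resource);
        if (url == null) {
            return resource + " is NOT on the classpath";
        }
        return resource + " found at " + url;
    }

    public static void main(String[] args) {
        // Defaults to hive-site.xml; another resource name can be passed as an argument.
        String resource = args.length > 0 ? args[0] : "hive-site.xml";
        System.out.println(describe(resource));
    }
}
```

If the check reports that the file is not on the classpath, the job client has no hive-site.xml to read, which would be consistent with it falling back to a local Derby metastore as described in the question.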