MapReduce application that does a lookup of HBase table without extending the TableMapper/TableReducer class

Rising Star

Hi,

I wrote an MR application that looks up an HBase table without extending the TableMapper/TableReducer classes.

I did not extend TableMapper/TableReducer because the HBase table is neither a source nor a destination for my job; instead, I want to look up rows in a table based on the content of my raw input.

The Hadoop cluster I am running on is a secure cluster, which is what prevents my application from running and makes it throw security-related errors.

Below is the sample code I wrote:

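import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MyMapper extends Mapper<LongWritable, Text, Text, Text> {

    private static final Logger LOG = LoggerFactory.getLogger(MyMapper.class);

    @Override
    public void setup(Context context) throws IOException, InterruptedException {

        System.out.println("Entered Setup method in Mapper.");
        LOG.info("Entered Setup method in Mapper.");

        // Picks up hbase-default.xml and hbase-site.xml from the task's classpath.
        Configuration config = HBaseConfiguration.create();

        // try-with-resources closes the table and the connection even if the lookup fails.
        try (Connection connection = ConnectionFactory.createConnection(config);
             Table table = connection.getTable(TableName.valueOf(<HBASE_Table>))) {

            Get get = new Get(Bytes.toBytes(<row_key>));
            get.addColumn(Bytes.toBytes(<col_family_name>), Bytes.toBytes(<qualifier_name>));

            Result result = table.get(get);

            byte[] latitude = result.getValue(Bytes.toBytes(<col_family_name>),
                    Bytes.toBytes(<qualifier_name>));

            // Print the fetched cell value, not the qualifier name.
            System.out.println(Bytes.toString(latitude));
        }

        System.out.println("completed Setup method in Mapper.");
        LOG.info("completed Setup method in Mapper.");

    }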

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

//// dummy code goes here

    }


}

Below is the exception I am getting:

 WARN [main-SendThread(localhost.localdomain:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)

Can someone please help me fix this issue?

2 REPLIES

Re: MapReduce application that does a lookup of HBase table without extending the TableMapper/TableReducer class

New Contributor

Is your ZooKeeper running? Is hbase-site.xml on the classpath of the mapper?
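The `localhost.localdomain:2181` address in the log is consistent with the client falling back to default settings, which is what would happen if hbase-site.xml were not visible to the task. One way to check the classpath question from inside a task is to ask the class loader directly. The sketch below uses only the JDK; `ClasspathProbe` and its method names are hypothetical helpers made up for illustration, not part of HBase or Hadoop.

```java
import java.net.URL;

/** Hypothetical diagnostic helper (an assumption for illustration, not an HBase/Hadoop class). */
final class ClasspathProbe {

    private ClasspathProbe() {}

    /** Returns the URL of the resource as seen by the current thread's class loader, or null. */
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader if no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    /** True if the named resource can be found on the classpath. */
    static boolean isVisible(String resourceName) {
        return locate(resourceName) != null;
    }

    /** Human-readable summary that can be written to the task log. */
    static String describe(String resourceName) {
        URL url = locate(resourceName);
        return url == null
                ? resourceName + " is NOT on the classpath"
                : resourceName + " found at " + url;
    }

    public static void main(String[] args) {
        System.out.println(describe("hbase-site.xml"));
    }
}
```

Calling `ClasspathProbe.describe("hbase-site.xml")` at the top of the mapper's `setup` method and logging the result should show in the task logs whether the file is reachable from the node where the task runs.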

Re: MapReduce application that does a lookup of HBase table without extending the TableMapper/TableReducer class

Guru

The HBase client running inside the mapper has to be able to connect to HBase with Kerberos authentication. Normally, outside of an MR job, the client application runs within a login context where the keytabs are deployed and kinit has been run.

When the HBase client runs inside a task (mapper or reducer) in MR or Spark, the client still has to authenticate. Since these tasks run on a different node and in a different process, the TGT that the application originally obtained is not available there. TableInputFormat solves this case automatically by obtaining "delegation tokens" and distributing them together with the MR job. Luckily, token distribution is already built into the MR framework, so it will be taken care of.

You can call:

	TableMapReduceUtil.initCredentials(job) 

before you submit your job, so that the Connection you create on the mapper side automatically picks up the tokens and can connect to HBase. Note that you still have to run kinit or log in with a keytab in the application that submits the MR job.
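For illustration, a job driver along these lines might look roughly like the sketch below. The class name `LookupJobDriver`, the job name, and the use of `args[0]`/`args[1]` as input and output paths are assumptions made for the example; `MyMapper` is the mapper from the question. It also assumes the submitting user already holds a valid Kerberos ticket and that hbase-site.xml is on the submitter's classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class; names and paths are placeholders for illustration.
public class LookupJobDriver {

    public static void main(String[] args) throws Exception {
        // Start from an HBase-aware configuration so the job carries the
        // HBase settings found on the submitter's classpath.
        Configuration conf = HBaseConfiguration.create();

        Job job = Job.getInstance(conf, "hbase-lookup");
        job.setJarByClass(LookupJobDriver.class);
        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Plain file input and output: HBase is only used for lookups.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Obtain HBase delegation tokens and attach them to the job
        // so that tasks can authenticate without a TGT.
        TableMapReduceUtil.initCredentials(job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

On the mapper side, building the client configuration with `HBaseConfiguration.create(context.getConfiguration())` instead of the no-argument `create()` is one way to make the task reuse the settings shipped with the job rather than rely only on whatever is on the task's local classpath.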