<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question The command to add the Mapreduce job for Lily HBase NRT Indexer Service is failing in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/The-command-to-add-the-Mapreduce-job-for-Lily-HBase-NRT/m-p/21834#M3834</link>
    <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Last week I was experimenting with the Lily HBase indexer service. I downloaded the Cloudera QuickStart VM and configured the morphline, and the indexing seemed to work well. However, when I manually install HBase, SolrCloud and the Lily indexer service - bear in mind that all of these are downloaded from the Cloudera download page - I get the error below.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two VMs set up as follows:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;VM1: hbase1, hbase2 &amp;amp; hbase3 running ZooKeeper, Hadoop, HBase, MapReduce &amp;amp; YARN&lt;/LI&gt;&lt;LI&gt;VM2: running Solr &amp;amp; the Lily indexer&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The command to add the MapReduce job:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;hadoop --config /etc/hadoop/conf jar /usr/lib/hbase-solr/tools/hbase-indexer-mr-1.5-cdh5.2.0-job.jar --conf /etc/hbase/conf/hbase-site.xml -Dmapred.child.java.opts=-Xmx500m --log4j /etc/hbase-solr/conf/log4j.properties --hbase-indexer-zk hbase1:2181,hbase2:2181,hbase3:2181 --hbase-indexer-file /etc/hbase-solr/conf/morphline-indexer-mapper.xml --hbase-indexer-name portalaudit --zk-host hbase1:2181,hbase2:2181,hbase3:2181/solr --collection portal-audit --go-live&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This prints a lot of output, but when it gets to staging the JARs, Lily looks for them under hdfs://. There are a handful of posts on the Internet describing the same problem, but none of them have decent answers. The only one with a possible answer suggests uploading the JARs into HDFS, but that feels wrong: it is a workaround that will probably break at some point.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The exception when adding the MapReduce job:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO zookeeper.ClientCnxn: EventThread shut down&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO hadoop.ForkedMapReduceIndexerTool: Indexing data into 1 reducers&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-root/mapred/staging/root850902700/.staging/job_local850902700_0001&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://3xNodeHA/usr/lib/hadoop/lib/guava-11.0.2.jar&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ignoring the above error results in a failure to index the relevant fields in Solr. Any help is very much appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ayache&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 09:13:50 GMT</pubDate>
    <dc:creator>akhettar</dc:creator>
    <dc:date>2022-09-16T09:13:50Z</dc:date>
    <item>
      <title>The command to add the Mapreduce job for Lily HBase NRT Indexer Service is failing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/The-command-to-add-the-Mapreduce-job-for-Lily-HBase-NRT/m-p/21834#M3834</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Last week I was experimenting with the Lily HBase indexer service. I downloaded the Cloudera QuickStart VM and configured the morphline, and the indexing seemed to work well. However, when I manually install HBase, SolrCloud and the Lily indexer service - bear in mind that all of these are downloaded from the Cloudera download page - I get the error below.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two VMs set up as follows:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;VM1: hbase1, hbase2 &amp;amp; hbase3 running ZooKeeper, Hadoop, HBase, MapReduce &amp;amp; YARN&lt;/LI&gt;&lt;LI&gt;VM2: running Solr &amp;amp; the Lily indexer&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The command to add the MapReduce job:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;hadoop --config /etc/hadoop/conf jar /usr/lib/hbase-solr/tools/hbase-indexer-mr-1.5-cdh5.2.0-job.jar --conf /etc/hbase/conf/hbase-site.xml -Dmapred.child.java.opts=-Xmx500m --log4j /etc/hbase-solr/conf/log4j.properties --hbase-indexer-zk hbase1:2181,hbase2:2181,hbase3:2181 --hbase-indexer-file /etc/hbase-solr/conf/morphline-indexer-mapper.xml --hbase-indexer-name portalaudit --zk-host hbase1:2181,hbase2:2181,hbase3:2181/solr --collection portal-audit --go-live&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This prints a lot of output, but when it gets to staging the JARs, Lily looks for them under hdfs://. There are a handful of posts on the Internet describing the same problem, but none of them have decent answers. The only one with a possible answer suggests uploading the JARs into HDFS, but that feels wrong: it is a workaround that will probably break at some point.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The exception when adding the MapReduce job:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO zookeeper.ClientCnxn: EventThread shut down&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO hadoop.ForkedMapReduceIndexerTool: Indexing data into 1 reducers&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;14/11/19 16:12:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-root/mapred/staging/root850902700/.staging/job_local850902700_0001&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://3xNodeHA/usr/lib/hadoop/lib/guava-11.0.2.jar&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ignoring the above error results in a failure to index the relevant fields in Solr. Any help is very much appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ayache&lt;/P&gt;</description>
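      <!-- A hedged reading of the trace above, a sketch rather than a confirmed fix: the staging
           path "file:/tmp/hadoop-root/..." and the job id "job_local850902700_0001" show the client
           submitted through the LocalJobRunner, while fs.defaultFS still points at hdfs://3xNodeHA,
           so scheme-less dependency JAR paths such as /usr/lib/hadoop/lib/guava-11.0.2.jar get
           qualified against HDFS rather than the local filesystem, producing the
           FileNotFoundException. A minimal sketch of the client-side mapred-site.xml under
           /etc/hadoop/conf that would make the submitter target the YARN cluster instead of local
           mode (mapreduce.framework.name is the standard Hadoop 2 key; the ResourceManager address
           belongs in yarn-site.xml and is assumed to be configured there):

           <configuration>
             <property>
               <name>mapreduce.framework.name</name>
               <value>yarn</value>
             </property>
           </configuration>

           In YARN mode the JobSubmitter uploads the local JARs to the HDFS staging directory
           itself, which would avoid copying them into HDFS by hand.
      -->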
      <pubDate>Fri, 16 Sep 2022 09:13:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/The-command-to-add-the-Mapreduce-job-for-Lily-HBase-NRT/m-p/21834#M3834</guid>
      <dc:creator>akhettar</dc:creator>
      <dc:date>2022-09-16T09:13:50Z</dc:date>
    </item>
    <item>
      <title>Re: The command to add the Mapreduce job for Lily HBase NRT Indexer Service is failing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/The-command-to-add-the-Mapreduce-job-for-Lily-HBase-NRT/m-p/21911#M3835</link>
      <description>&lt;P&gt;It turned out that the Hadoop job above is only required for batch indexing, so it is not needed for now. All seems to be working fine now.&lt;/P&gt;</description>
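      <!-- A hedged elaboration of the answer above: the MapReduceIndexerTool job is only needed to
           batch-index rows that already exist in HBase. For NRT indexing, the Lily indexer daemon
           follows the HBase replication stream once an indexer is registered, along the lines of
           the sketch below (short option forms as in the Lily HBase indexer tutorial; the hostnames,
           file paths, and names echo this thread and are otherwise assumptions):

           hbase-indexer add-indexer -n portalaudit \
             -c /etc/hbase-solr/conf/morphline-indexer-mapper.xml \
             -cp solr.zk=hbase1:2181,hbase2:2181,hbase3:2181/solr \
             -cp solr.collection=portal-audit \
             -z hbase1:2181,hbase2:2181,hbase3:2181

           This assumes hbase.replication is enabled in hbase-site.xml and that the indexed column
           families have REPLICATION_SCOPE set to 1.
      -->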
      <pubDate>Fri, 21 Nov 2014 16:00:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/The-command-to-add-the-Mapreduce-job-for-Lily-HBase-NRT/m-p/21911#M3835</guid>
      <dc:creator>akhettar</dc:creator>
      <dc:date>2014-11-21T16:00:00Z</dc:date>
    </item>
  </channel>
</rss>

