
The command to add the MapReduce job for the Lily HBase NRT Indexer Service is failing

Explorer

Hi

 

Last week I was experimenting with the Lily HBase Indexer service. I downloaded the Cloudera QuickStart VM and configured the morphline, and the indexing seemed to work well. However, when I manually install HBase, SolrCloud and the Lily Indexer service - bear in mind that all of these are downloaded from the Cloudera downloads page - I get the error below.
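
For context, the SolrCloud collection referenced in the command further down was created roughly along these lines (the standard solrctl workflow from the CDH docs; the config directory path here is just illustrative):

solrctl instancedir --generate $HOME/portal-audit-config
# edit the generated conf/schema.xml to declare the fields the morphline maps
solrctl instancedir --create portal-audit $HOME/portal-audit-config
solrctl collection --create portal-audit -s 1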

 

I have two VMs set up as follows:

  • VM1: hbase1, hbase2 & hbase3, running ZooKeeper, Hadoop, HBase, MapReduce & YARN
  • VM2: running Solr & the Lily Indexer

 

The command to add the MapReduce job:

hadoop --config /etc/hadoop/conf jar /usr/lib/hbase-solr/tools/hbase-indexer-mr-1.5-cdh5.2.0-job.jar \
  --conf /etc/hbase/conf/hbase-site.xml \
  -Dmapred.child.java.opts=-Xmx500m \
  --log4j /etc/hbase-solr/conf/log4j.properties \
  --hbase-indexer-zk hbase1:2181,hbase2:2181,hbase3:2181 \
  --hbase-indexer-file /etc/hbase-solr/conf/morphline-indexer-mapper.xml \
  --hbase-indexer-name portalaudit \
  --zk-host hbase1:2181,hbase2:2181,hbase3:2181/solr \
  --collection portal-audit \
  --go-live
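
To rule out the indexer registration itself, it can be listed with the hbase-indexer CLI (treat the exact flag spelling as approximate and check your version's help output):

hbase-indexer list-indexers --zookeeper hbase1:2181,hbase2:2181,hbase3:2181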

 

This spits out a lot of output, but when it gets to staging the JARs, Lily looks for them under hdfs://. There are a handful of posts on the Internet describing the same problem, but none of them have decent answers. The only one with a possible answer suggests uploading the JARs into HDFS, but that feels wrong - it is a complete workaround that will probably break at some point.
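
For what it's worth, the log below shows a local staging directory (file:/tmp/...) while the JAR lookup goes to hdfs://, so the client appears to be mixing the local and HDFS filesystems. Two quick checks with standard Hadoop commands make the mismatch visible:

# what filesystem the client resolves unqualified paths against
hdfs getconf -confKey fs.defaultFS
# the JAR exists on the local disk, not in HDFS
ls -l /usr/lib/hadoop/lib/guava-11.0.2.jar
hadoop fs -ls /usr/lib/hadoop/lib/guava-11.0.2.jar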

The exception when adding the MapReduce job:

14/11/19 16:12:49 INFO zookeeper.ClientCnxn: EventThread shut down
14/11/19 16:12:49 INFO hadoop.ForkedMapReduceIndexerTool: Indexing data into 1 reducers
14/11/19 16:12:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
14/11/19 16:12:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-root/mapred/staging/root850902700/.staging/job_local850902700_0001
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://3xNodeHA/usr/lib/hadoop/lib/guava-11.0.2.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)

 

Ignoring the above error results in a failure to index the relevant fields in Solr. Any help is very much appreciated.

 

Thanks

 

Ayache

 

 

1 ACCEPTED SOLUTION

Explorer

It turned out that the Hadoop MapReduce job above is only required for batch indexing, so it is not needed for NRT indexing. Everything seems to be working fine now.
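
For anyone landing here later: for NRT indexing it is enough to register the indexer with the Lily service, along these lines (long-form flags as documented for the hbase-indexer CLI; double-check against your version):

hbase-indexer add-indexer \
  --name portalaudit \
  --indexer-conf /etc/hbase-solr/conf/morphline-indexer-mapper.xml \
  --connection-param solr.zk=hbase1:2181,hbase2:2181,hbase3:2181/solr \
  --connection-param solr.collection=portal-audit \
  --zookeeper hbase1:2181,hbase2:2181,hbase3:2181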

