02-12-2014 01:08 PM
I have a large pile of web pages in HBase that I'm trying to index into Cloudera Search following the online docs. I'm running the job like so:

hadoop jar /usr/lib/hbase-solr/tools/hbase-indexer-mr-1.3-search-1.1.0-job.jar --hbase-table-name clueweb12 --zk-host 192.168.0.1/solr --collection cw12 --morphline-file morphlines.conf --hbase-indexer-file morphline-hbase-mapper.xml --reducers 0

... and this runs just fine: documents are indexed following the morphline spec I gave it. Except it's running everything as a local job on the machine I launched the job from. In other words, there are no mappers anywhere else on my cluster, and the log messages come from INFO mapred.LocalJobRunner. At this rate it'll take several months 😉

The cluster is otherwise working fine... MR and MRv2 jobs work, HDFS is all OK, HBase is fine, Solr is fine, all on CDH4.5.

I get an odd error message, but it doesn't stop the job:

14/02/12 09:35:32 ERROR mapreduce.TableInputFormatBase: Cannot resolve the host name for /192.168.0.7 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '7.0.168.192.in-addr.arpa'

I don't know if this is a red herring or not. It shouldn't be happening... everything is using static IPs in /etc/hosts. And as I said, everything else is working; it's just that this particular jar won't run in parallel.

How do I figure out why this job won't go MR?

Thanks,
Ian
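
For reference, the name in that error message is just the reverse-lookup (PTR) form of 192.168.0.7. The javax.naming exception suggests the lookup is a direct DNS query, which may not consult /etc/hosts at all, though that is an assumption and not something confirmed in this thread. Below is a minimal Python sketch that reproduces the PTR name and, separately, asks the operating system's resolver for the same address so the two results can be compared. The function names are hypothetical helpers, not part of any Hadoop or HBase tool.

```python
import ipaddress
import socket


def reverse_name(ip: str) -> str:
    """Return the DNS PTR query name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


def system_reverse_lookup(ip: str):
    """Ask the OS resolver for a host name; return None if nothing is found.

    On a typical Linux setup this path can be answered from /etc/hosts,
    unlike a query sent straight to a DNS server.
    """
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


if __name__ == "__main__":
    # Same address as in the error message above.
    print(reverse_name("192.168.0.7"))
```

Calling system_reverse_lookup("192.168.0.7") on the node that launches the job would show whether the OS resolver knows the host even though the DNS query fails. Either way, this only addresses the error message, not necessarily the reason the job runs under LocalJobRunner.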