Indexing HBase Avro data with Lily

I'm trying to extract data from HBase and index it in Solr with the Lily HBase Indexer, but when the morphline tries to read the Avro data, I get this error:

 

7851 [main] TRACE org.kitesdk.morphline.stdlib.LogTraceBuilder$LogTrace  - beforeNotify: {lifecycle=[START_SESSION]}
7852 [main] TRACE org.kitesdk.morphline.stdlib.Pipe  - beforeProcess: {_attachment_body=[keyvalues={1515075385173-eYMfKWoYUV-0/tweets:payload/1515075488251/Put/vlen=1416/seqid=0}], _attachment_mimetype=[application/java-hbase-result]}
7852 [main] TRACE com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells  - beforeProcess: {_attachment_body=[keyvalues={1515075385173-eYMfKWoYUV-0/tweets:payload/1515075488251/Put/vlen=1416/seqid=0}], _attachment_mimetype=[application/java-hbase-result]}
7853 [main] TRACE org.kitesdk.morphline.avro.ReadAvroContainerBuilder$ReadAvroContainer  - beforeProcess: {_attachment_body=[[B@382d71c7]}
Exception in thread "main" java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: java.lang.IllegalArgumentException
        at com.ngdata.hbaseindexer.mr.IndexerDryRun.run(IndexerDryRun.java:140)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:95)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:89)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:79)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:73)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: java.lang.IllegalArgumentException
        at org.kitesdk.morphline.base.FaultTolerance.handleException(FaultTolerance.java:73)
        at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:248)
        at com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper.map(MorphlineResultToSolrMapper.java:145)
        at com.ngdata.hbaseindexer.indexer.Indexer$ColumnBasedIndexer.calculateIndexUpdates(Indexer.java:355)
        at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:144)
        at com.ngdata.hbaseindexer.mr.IndexerDryRun.run(IndexerDryRun.java:136)
        ... 11 more
Caused by: java.lang.IllegalArgumentException
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:334)
        at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:288)
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
        at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
        at org.kitesdk.morphline.avro.ReadAvroContainerBuilder$ReadAvroContainer.doProcess(ReadAvroContainerBuilder.java:125)
        at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:96)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at org.kitesdk.morphline.base.Connector.process(Connector.java:64)
        at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:186)
        at com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells.doProcess(ExtractHBaseCellsBuilder.java:86)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:186)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:242)
        ... 15 more

Morphline.conf:

SOLR_LOCATOR : {
  collection : hbase-collection-twitter

  zkHost : "127.0.0.1:2181/solr"

}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "com.ngdata.**", "org.apache.solr.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "tweets:payload"
              outputField : "_attachment_body"
              type : "byte[]"
              source : value
            }
          ]
        }
      }

      { readAvroContainer { } }

      {
        extractAvroPaths {
          flatten : false
          paths : {
            id : /id
            text : /text
            user_friends_count : /user_friends_count
            user_location : /user_location
            user_description : /user_description
            user_statuses_count : /user_statuses_count
            user_followers_count : /user_followers_count
            user_name : /user_name
            user_screen_name : /user_screen_name
            created_at : /created_at
            retweet_count : /retweet_count
            retweeted : /retweeted
            in_reply_to_user_id : /in_reply_to_user_id
            source : /source
            in_reply_to_status_id : /in_reply_to_status_id
            media_url_https : /media_url_https
            expanded_url : /expanded_url
          }
        }
      }

      {
        sanitizeUnknownSolrFields {
          solrLocator : ${SOLR_LOCATOR}
        }
      }

      { logTrace { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]

I don't know why I get the error above. I also tried reading the same Avro data from Flume and indexing it into Solr with the same configuration, but without extractHBaseCells, and that works.
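
The innermost frames of the trace are ByteBuffer.allocate called from BinaryDecoder.readBytes inside DataFileStream.initialize, which suggests that a length read while parsing the Avro container header came out negative. One way to narrow this down might be to dump the raw bytes of a single tweets:payload cell and walk the header by hand. Below is a minimal Python sketch of that check, assuming the cell value has already been exported as a bytes object (how it is fetched from HBase is not shown). The helper names read_long, read_bytes and inspect_avro_header are hypothetical and exist only for this illustration; the layout they parse (4-byte magic "Obj" followed by 0x01, then a map of string keys to bytes values with zigzag varint lengths) follows the Avro object container file format.

```python
AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(buf, pos):
    """Decode one zigzag varint-encoded Avro long starting at offset pos."""
    shift = 0
    raw = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            break
        shift += 7
    # Undo zigzag encoding: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
    return (raw >> 1) ^ -(raw & 1), pos


def read_bytes(buf, pos):
    """Read a length-prefixed byte string; reject negative or oversized lengths."""
    length, pos = read_long(buf, pos)
    if length < 0 or pos + length > len(buf):
        raise ValueError(f"bad length {length} before offset {pos}")
    return buf[pos:pos + length], pos + length


def inspect_avro_header(buf):
    """Return the header metadata map of an Avro container, or raise ValueError."""
    if buf[:4] != AVRO_MAGIC:
        raise ValueError("missing Avro container magic")
    pos = 4
    meta = {}
    while True:
        count, pos = read_long(buf, pos)
        if count == 0:          # a zero count terminates the map
            break
        if count < 0:           # negative count: a block byte size follows
            count = -count
            _, pos = read_long(buf, pos)
        for _ in range(count):
            key, pos = read_bytes(buf, pos)
            value, pos = read_bytes(buf, pos)
            meta[key.decode("utf-8")] = value
    return meta
```

If inspect_avro_header raises on the HBase cell bytes but succeeds on the bytes that Flume delivers, the payload was probably altered or written in a different form on its way into HBase (for example, passed through a string conversion, or stored as a bare Avro record without the container header). That is only a hypothesis to test, not a confirmed cause.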