
Indexing HBase Avro data with Lily

I'm trying to extract data from HBase and index it in Solr, but when the morphline reads the Avro data, I get this error:

7851 [main] TRACE org.kitesdk.morphline.stdlib.LogTraceBuilder$LogTrace  - beforeNotify: {lifecycle=[START_SESSION]}
7852 [main] TRACE org.kitesdk.morphline.stdlib.Pipe  - beforeProcess: {_attachment_body=[keyvalues={1515075385173-eYMfKWoYUV-0/tweets:payload/1515075488251/Put/vlen=1416/seqid=0}], _attachment_mimetype=[application/java-hbase-result]}
7852 [main] TRACE com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells  - beforeProcess: {_attachment_body=[keyvalues={1515075385173-eYMfKWoYUV-0/tweets:payload/1515075488251/Put/vlen=1416/seqid=0}], _attachment_mimetype=[application/java-hbase-result]}
7853 [main] TRACE org.kitesdk.morphline.avro.ReadAvroContainerBuilder$ReadAvroContainer  - beforeProcess: {_attachment_body=[[B@382d71c7]}
Exception in thread "main" java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: java.lang.IllegalArgumentException
        at com.ngdata.hbaseindexer.mr.IndexerDryRun.run(IndexerDryRun.java:140)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:95)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:89)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:79)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:73)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: java.lang.IllegalArgumentException
        at org.kitesdk.morphline.base.FaultTolerance.handleException(FaultTolerance.java:73)
        at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:248)
        at com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper.map(MorphlineResultToSolrMapper.java:145)
        at com.ngdata.hbaseindexer.indexer.Indexer$ColumnBasedIndexer.calculateIndexUpdates(Indexer.java:355)
        at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:144)
        at com.ngdata.hbaseindexer.mr.IndexerDryRun.run(IndexerDryRun.java:136)
        ... 11 more
Caused by: java.lang.IllegalArgumentException
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:334)
        at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:288)
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
        at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
        at org.kitesdk.morphline.avro.ReadAvroContainerBuilder$ReadAvroContainer.doProcess(ReadAvroContainerBuilder.java:125)
        at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:96)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at org.kitesdk.morphline.base.Connector.process(Connector.java:64)
        at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:186)
        at com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells.doProcess(ExtractHBaseCellsBuilder.java:86)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:186)
        at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:161)
        at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:242)
        ... 15 more

Morphline.conf:

SOLR_LOCATOR : {
  collection : hbase-collection-twitter

  zkHost : "127.0.0.1:2181/solr"

}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "com.ngdata.**", "org.apache.solr.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "tweets:payload"
              outputField : "_attachment_body"
              type : "byte[]"
              source : value
            }
          ]
        }
      }

      { readAvroContainer { } }

      {
        extractAvroPaths {
          flatten : false
          paths : {
            id : /id
            text : /text
            user_friends_count : /user_friends_count
            user_location : /user_location
            user_description : /user_description
            user_statuses_count : /user_statuses_count
            user_followers_count : /user_followers_count
            user_name : /user_name
            user_screen_name : /user_screen_name
            created_at : /created_at
            retweet_count : /retweet_count
            retweeted : /retweeted
            in_reply_to_user_id : /in_reply_to_user_id
            source : /source
            in_reply_to_status_id : /in_reply_to_status_id
            media_url_https : /media_url_https
            expanded_url : /expanded_url
          }
        }
      }

      {
        sanitizeUnknownSolrFields {
          solrLocator : ${SOLR_LOCATOR}
        }
      }

      { logTrace { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]

I don't know why I get the error above. When I receive the same Avro data through Flume and index it into Solr with the same configuration, but without extractHBaseCells, it works.
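
One check that might help narrow this down: readAvroContainer parses Avro object container files, and every such file starts with the four magic bytes 'O', 'b', 'j', 0x01. Below is a minimal sketch, assuming the value of tweets:payload has been dumped to raw bytes. The function name is hypothetical and not part of Kite or the HBase Indexer. It rules only one possibility in or out and does not explain the exception by itself.

```python
# First four bytes of every Avro object container file, per the Avro spec.
AVRO_CONTAINER_MAGIC = b"Obj\x01"


def looks_like_avro_container(payload: bytes) -> bool:
    """Return True if payload starts with the Avro container-file magic.

    Hypothetical helper for illustration only; it inspects the first
    four bytes and does not validate the rest of the header.
    """
    return payload[:4] == AVRO_CONTAINER_MAGIC
```

If this returns False for the cell value, the payload is probably a bare binary-encoded datum rather than a container, and the Kite readAvro command (which takes an explicit writer schema) may be a better fit than readAvroContainer. If it returns True, the problem is more likely in the header metadata or in how the bytes reach the parser.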
