Support Questions

What is the FIELDS TERMINATED BY value for an HBase export sequence file in Hive?

Master Guru

I am exporting an HBase table to HDFS and want to run a Hive query on the exported data.

This is the export command I run:

bin/hbase org.apache.hadoop.hbase.mapreduce.Export table_name file:///tmp/db_dump/

Now I want to create a Hive external table over it. Does anyone have an example? I need to build the correct Hive CREATE TABLE statement. Also, what should FIELDS TERMINATED BY be? That is the most important part for me.

1 ACCEPTED SOLUTION

Super Guru

HBase exports are written in SequenceFile format, so you don't need to specify field and line terminators. Have you tried creating the table using "STORED AS SEQUENCEFILE" with the location of the HBase export?
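
For example, a minimal sketch of that DDL, assuming the export files end up under /tmp/db_dump/ in HDFS (the table name and the single string column below are placeholders, not taken from your actual export):

-- hypothetical table/column names; point LOCATION at the export directory
CREATE EXTERNAL TABLE hbase_export_raw (value STRING)
STORED AS SEQUENCEFILE
LOCATION '/tmp/db_dump/';

Keep in mind that LOCATION has to be a path Hive can reach (normally HDFS); the file:///tmp/db_dump/ path in your export command is local to the machine that ran the export, so the files would need to be copied into HDFS first.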

Another option is to use something like http://www.exmachinatech.net/projects/forqlift/ to convert the sequence file to text.

4 REPLIES

Master Guru

@Michael Young good point. Couldn't see the forest for the trees 🙂 thanks.

Hi,

how did it work for you?

Hi Michael, I dumped the HBase data using the native Export utility. When I create a Hive table on top of it with "STORED AS SEQUENCEFILE", the table gets created, but I am not able to query it. I keep getting the error pasted below:

Bad status for request TFetchResultsReq(fetchType=0, operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='\x05L\xb8&\x04xO\xd6\xb9\x8e\xea,\xfb\x9b\x0e\x00', guid='\xbf\x9b\xffX\x10\x9dO~\xbd{b~R=BD')), orientation=4, maxRows=100):

TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.", sqlState=None, statusCode=3), results=None, hasMoreRows=None)

infoMessages (server-side stack trace):
*org.apache.hive.service.cli.HiveSQLException:java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:14:13
org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:366
org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:277
org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:753
org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:438
org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:686
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538
org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39
org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor:process:HadoopThriftAuthBridge.java:746
org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286
java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142
java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617
java.lang.Thread:run:Thread.java:745
*java.io.IOException:java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:18:4
org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:508
org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:415
org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:138
org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1798
org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:361
*java.io.IOException:Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:26:8
org.apache.hadoop.io.SequenceFile$Reader:init:SequenceFile.java:2040
org.apache.hadoop.io.SequenceFile$Reader:initialize:SequenceFile.java:1878
org.apache.hadoop.io.SequenceFile$Reader:<init>:SequenceFile.java:1827
org.apache.hadoop.io.SequenceFile$Reader:<init>:SequenceFile.java:1841
org.apache.hadoop.mapred.SequenceFileRecordReader:<init>:SequenceFileRecordReader.java:49
org.apache.hadoop.mapred.SequenceFileInputFormat:getRecordReader:SequenceFileInputFormat.java:64
org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit:getRecordReader:FetchOperator.java:674
org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:324
org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:446
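
For context, this error is expected when a plain SequenceFile table is pointed at an Export dump: the Export utility writes the values as serialized org.apache.hadoop.hbase.client.Result objects, and Hive's generic SequenceFile reader has no deserializer registered for that class, which is exactly what the 'io.serializations' hint in the message refers to. Below is a rough Java sketch, not a Hive-side fix, showing how such a file can be read outside Hive by registering HBase's ResultSerialization; the class name and the default part-file path are made up, and it assumes the HBase client and MapReduce jars are on the classpath.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.ResultSerialization;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.SequenceFile;

public class DumpHBaseExport {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // Register the Result deserializer the Export job used when writing the file;
    // without it, opening the SequenceFile fails exactly like the Hive error above.
    conf.setStrings("io.serializations",
        conf.get("io.serializations"),
        ResultSerialization.class.getName());

    // Hypothetical path to one part file of the export.
    Path part = new Path(args.length > 0 ? args[0] : "file:///tmp/db_dump/part-m-00000");

    try (SequenceFile.Reader reader =
             new SequenceFile.Reader(conf, SequenceFile.Reader.file(part))) {
      Object key = null;    // ImmutableBytesWritable row key
      Object value = null;  // org.apache.hadoop.hbase.client.Result
      while ((key = reader.next(key)) != null) {
        value = reader.getCurrentValue(value);
        ImmutableBytesWritable rowKey = (ImmutableBytesWritable) key;
        Result result = (Result) value;
        System.out.println(Bytes.toStringBinary(rowKey.get())
            + " -> " + result.size() + " cells");
      }
    }
  }
}

To get the data queryable in Hive, the practical routes are still the ones above: convert the dump to text (the forqlift suggestion), or use a small job like this to rewrite the Result values into a delimited text file that an ordinary Hive table can read.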