What field terminator does an HBase export sequence file use for Hive?

avatar
Master Guru

I am exporting an HBase table to HDFS, and I want to run Hive queries on the exported data.

I run the export like this:

bin/hbase org.apache.hadoop.hbase.mapreduce.Export table_name file:///tmp/db_dump/

Now I want to create a Hive external table over the export. Does anyone have an example? I need to build the correct Hive CREATE TABLE statement. Also, what should FIELDS TERMINATED BY be? That is the most important part for me.

1 ACCEPTED SOLUTION

avatar
Super Guru

HBase exports are written in sequence file format. You don't need to specify field and line terminators for a sequence file. Have you tried creating the table using "STORED AS SEQUENCEFILE" with the location of the HBase export?

Another option is to use something like http://www.exmachinatech.net/projects/forqlift/ to convert the sequence file to text.
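
As a minimal sketch of the suggestion above, the shell function below prints a CREATE EXTERNAL TABLE statement that uses STORED AS SEQUENCEFILE and no FIELDS TERMINATED BY clause. The function name, the table name, and the single `raw_value` column are hypothetical placeholders, not something given in this thread. Creating the table is only the first step: reading rows still depends on Hive being able to deserialize the value class stored in the export, and a later reply in this thread reports an error at that point.

```shell
#!/bin/sh
# Print Hive DDL for an external table over a directory of sequence files.
# All names below are hypothetical placeholders for illustration.
hive_seqfile_ddl() {
    table_name="$1"   # Hive table name to create (assumption)
    location="$2"     # directory written by the HBase Export job
    cat <<EOF
CREATE EXTERNAL TABLE ${table_name} (
  raw_value STRING
)
STORED AS SEQUENCEFILE
LOCATION '${location}';
EOF
}
```

One way to use it is to write the statement to a file and run it with `hive -f`, for example `hive_seqfile_ddl hbase_dump 'file:///tmp/db_dump/' > create.hql` followed by `hive -f create.hql`. The location must be readable by Hive itself, so a `file:///` path only works where Hive reads from that same local filesystem.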


4 REPLIES

avatar
Master Guru

@Michael Young Good point. I couldn't see the forest for the trees 🙂 Thanks.

avatar

Hi,

How did it work for you?

avatar

Hi Michael, I dumped HBase data using the native Export utility. When I create a Hive table on top of it with "STORED AS SEQUENCEFILE", the table is created, but I cannot query it. I keep getting the error pasted below.

TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.",

  • Bad status for request TFetchResultsReq(fetchType=0, operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='\x05L\xb8&\x04xO\xd6\xb9\x8e\xea,\xfb\x9b\x0e\x00', guid='\xbf\x9b\xffX\x10\x9dO~\xbd{b~R=BD')), orientation=4, maxRows=100): TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.", sqlState=None, infoMessages=["*org.apache.hive.service.cli.HiveSQLException:java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:14:13", 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:366', 'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:277', 'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:753', 'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:438', 'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:686', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor:process:HadoopThriftAuthBridge.java:746', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617', 'java.lang.Thread:run:Thread.java:745', "*java.io.IOException:java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:18:4", 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:508', 'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:415', 'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:138', 'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1798', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:361', "*java.io.IOException:Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.:26:8", 'org.apache.hadoop.io.SequenceFile$Reader:init:SequenceFile.java:2040', 'org.apache.hadoop.io.SequenceFile$Reader:initialize:SequenceFile.java:1878', 'org.apache.hadoop.io.SequenceFile$Reader:<init>:SequenceFile.java:1827', 'org.apache.hadoop.io.SequenceFile$Reader:<init>:SequenceFile.java:1841', 'org.apache.hadoop.mapred.SequenceFileRecordReader:<init>:SequenceFileRecordReader.java:49', 'org.apache.hadoop.mapred.SequenceFileInputFormat:getRecordReader:SequenceFileInputFormat.java:64', 'org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit:getRecordReader:FetchOperator.java:674', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:324', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:446'], statusCode=3), results=None, hasMoreRows=None)
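
The error above names the value class stored in the export (`org.apache.hadoop.hbase.client.Result`) and the Hadoop property `io.serializations`. As a rough diagnostic sketch, the function below lists the key and value class names recorded near the start of a sequence file, which is where the format stores them. The function name is hypothetical, and the approach is a heuristic: it scans printable text rather than parsing the header, so a length-prefix byte that happens to be a letter could end up glued to a name.

```shell
#!/bin/sh
# Heuristic: print the first two dotted Java class names found near the
# start of a sequence file (normally the key class, then the value class).
seqfile_classes() {
    # Turn non-printable bytes into newlines, then extract dotted names.
    head -c 512 "$1" \
        | LC_ALL=C tr -c '[:print:]' '\n' \
        | grep -oE '[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*)+' \
        | head -n 2
}
```

It could be run as `seqfile_classes /tmp/db_dump/part-m-00000`, where the part file name is an assumption about what the Export job wrote. HBase ships a serializer for `Result` (`org.apache.hadoop.hbase.mapreduce.ResultSerialization`), but whether registering it under `io.serializations` in a Hive session is enough to make the table queryable is not established in this thread.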