What is the HBase export SequenceFile's "terminated by" for Hive?


I am exporting an HBase table to HDFS so that I can run a Hive query against the result.

I run the export like this:

bin/hbase org.apache.hadoop.hbase.mapreduce.Export table_name file:///tmp/db_dump/

Now I want to create a Hive external table over the export. Does anyone have an example? I need to build the correct Hive CREATE TABLE statement. Also, what should the FIELDS TERMINATED BY clause be? That is the most important part for me.

1 ACCEPTED SOLUTION

HBase exports are written in SequenceFile format, so you don't need to specify field and line terminators. Have you tried creating the table with "STORED AS SEQUENCEFILE" and a LOCATION pointing at the HBase export?

Another option is to use something like http://www.exmachinatech.net/projects/forqlift/ to convert the sequence file to text.
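For reference, a minimal sketch of what such a table definition could look like. The table name, column layout, and location are assumptions based on the export command above, not something from this thread; and as the later replies show, Hive may still fail to deserialize the HBase Result values, so this DDL alone is not guaranteed to make the data queryable:

```sql
-- Hedged sketch: external table over an HBase Export output directory.
-- 'mytable_export', the columns, and the path are assumptions.
CREATE EXTERNAL TABLE mytable_export (
  rowkey STRING,
  value  STRING
)
STORED AS SEQUENCEFILE
LOCATION '/tmp/db_dump/';
```

Note that Hive's SequenceFile reader ignores the file's key and deserializes only the value, so whether this works depends entirely on whether Hive has a deserializer for the value class the export was written with.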


4 REPLIES



@Michael Young good point. Couldn't see the forest for the trees 🙂 thanks.

Hi, how did it work out for you?

Hi Michael, I dumped the HBase data using the native Export utility. When I create a Hive table on top of it with "STORED AS SEQUENCEFILE", the table is created, but I cannot query it. I keep getting the error pasted below:

TFetchResultsResp(status=TStatus(errorCode=0, errorMessage="java.io.IOException: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.", sqlState=None, statusCode=3), results=None, hasMoreRows=None)

Relevant frames from the server-side stack trace (trimmed; the same IOException message repeats at each wrapping layer):

org.apache.hadoop.io.SequenceFile$Reader:init:SequenceFile.java:2040
org.apache.hadoop.io.SequenceFile$Reader:initialize:SequenceFile.java:1878
org.apache.hadoop.mapred.SequenceFileRecordReader:&lt;init&gt;:SequenceFileRecordReader.java:49
org.apache.hadoop.mapred.SequenceFileInputFormat:getRecordReader:SequenceFileInputFormat.java:64
org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:324
org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:446
org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:138
org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1798
org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:361
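The error says Hive's SequenceFile reader found a value class it has no deserializer for. One way to confirm what the export actually contains is to peek at the SequenceFile header, where the key and value class names are stored in the first bytes. A minimal Python sketch, assuming the default header layout (3-byte "SEQ" magic, a version byte, then each class name prefixed by a Hadoop vint length, which is a single byte for names this short):

```python
# Hedged sketch: read the header of a Hadoop SequenceFile to see which key and
# value classes it was written with.

def parse_seqfile_header(data: bytes) -> tuple:
    """Return (key_class, value_class) from the start of a SequenceFile."""
    if data[:3] != b"SEQ":
        raise ValueError("not a SequenceFile")
    pos = 4  # skip magic (3 bytes) + version byte
    names = []
    for _ in range(2):
        length = data[pos]  # single-byte vint; holds for short class names
        pos += 1
        names.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    return names[0], names[1]

# Synthetic header mimicking what an HBase Export part file starts with:
key = b"org.apache.hadoop.hbase.io.ImmutableBytesWritable"
val = b"org.apache.hadoop.hbase.client.Result"
header = b"SEQ\x06" + bytes([len(key)]) + key + bytes([len(val)]) + val
print(parse_seqfile_header(header))
```

Run against the first few hundred bytes of a real part file (e.g. a hypothetical /tmp/db_dump/part-m-00000), this would report the value class org.apache.hadoop.hbase.client.Result, matching the error message: Hive ships no deserializer for HBase Result objects, which is why "STORED AS SEQUENCEFILE" alone is not enough here.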