Support Questions
Find answers, ask questions, and share your expertise

table import failed - MetaStore Tables


Explorer

Hi,

 

I downloaded and installed CM5 successfully. I'm running into an issue when I create a table and import the contents of a CSV file into it through the Metastore Manager.

 

The table structure is created, but the data is not imported. I also tried the import with the overwrite-data option enabled, but it still failed.

 

Log Contents

=========

 

15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO parse.ParseDriver: Parsing command: CREATE TABLE `default.A123`
(
`BuildingID` tinyint ,
`BuildingMgr` string ,
`BuildingAge` tinyint ,
`HVACproduct` string ,
`Country` string )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
15/01/10 11:21:54 INFO parse.ParseDriver: Parse Completed
15/01/10 11:21:54 INFO log.PerfLogger: </PERFLOG method=parse start=1420906914928 end=1420906914931 duration=3 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
15/01/10 11:21:54 INFO parse.SemanticAnalyzer: Creating table default.A123 position=13
15/01/10 11:21:54 INFO ql.Driver: Semantic Analysis Completed
15/01/10 11:21:54 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1420906914931 end=1420906914935 duration=4 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
15/01/10 11:21:54 INFO ql.Driver: EXPLAIN output for queryid hive_20150110112121_72d8abfe-57cf-49da-bde3-e12f950302bf : ABSTRACT SYNTAX TREE:

TOK_CREATETABLE
TOK_TABNAME
default.A123
TOK_LIKETABLE
TOK_TABCOLLIST
TOK_TABCOL
BuildingID
TOK_TINYINT
TOK_TABCOL
BuildingMgr
TOK_STRING
TOK_TABCOL
BuildingAge
TOK_TINYINT
TOK_TABCOL
HVACproduct
TOK_STRING
TOK_TABCOL
Country
TOK_STRING
TOK_TABLEROWFORMAT
TOK_SERDEPROPS
TOK_TABLEROWFORMATFIELD
','


STAGE DEPENDENCIES:
Stage-0 is a root stage [DDL]

STAGE PLANS:
Stage: Stage-0
Create Table Operator:
Create Table
columns: buildingid tinyint, buildingmgr string, buildingage tinyint, hvacproduct string, country string
field delimiter: ,
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
serde name: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.A123


15/01/10 11:21:54 INFO log.PerfLogger: </PERFLOG method=compile start=1420906914928 end=1420906914941 duration=13 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO log.PerfLogger: <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:54 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/01/10 11:21:54 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master.cluster.com:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@466555f
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=acquireReadWriteLocks start=1420906914943 end=1420906915280 duration=337 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO ql.Driver: Starting command: CREATE TABLE `default.A123`
(
`BuildingID` tinyint ,
`BuildingMgr` string ,
`BuildingAge` tinyint ,
`HVACproduct` string ,
`Country` string )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1420906914943 end=1420906915281 duration=338 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=runTasks start=1420906915281 end=1420906915502 duration=221 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: <PERFLOG method=PostHook.com.cloudera.navigator.audit.hive.HiveExecHookContext from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=PostHook.com.cloudera.navigator.audit.hive.HiveExecHookContext start=1420906915502 end=1420906915503 duration=1 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1420906915281 end=1420906915503 duration=222 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO ql.Driver: OK
15/01/10 11:21:55 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO ZooKeeperHiveLockManager: about to release lock for default
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1420906915504 end=1420906915574 duration=70 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 11:21:55 INFO log.PerfLogger: </PERFLOG method=Driver.run start=1420906914942 end=1420906915574 duration=632 from=org.apache.hadoop.hive.ql.Driver>

8 REPLIES

Re: table import failed - MetaStore Tables

These logs only cover the CREATE TABLE, not the load. Did you get any more logs? They should be on the /logs page of Hue.

You can also try to load the data directly from the table page in the Metastore app to get more information.

Romain


Re: table import failed - MetaStore Tables

Explorer

Hi, thanks for the reply. When I initially created the table, I also checked the option to load the data. I then followed your instructions and the data is now in the table. But why is it not importing the data in the first place?

 

15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO parse.ParseDriver: Parsing command: LOAD DATA INPATH '/user/hdfs/building.csv' INTO TABLE `default.build`
15/01/10 20:59:01 INFO parse.ParseDriver: Parse Completed
15/01/10 20:59:01 INFO log.PerfLogger: </PERFLOG method=parse start=1420941541329 end=1420941541332 duration=3 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO ql.Driver: Semantic Analysis Completed
15/01/10 20:59:01 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1420941541332 end=1420941541364 duration=32 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
15/01/10 20:59:01 INFO ql.Driver: EXPLAIN output for queryid hive_20150110205959_7c66b89d-706f-4dd0-b3ba-8f3cd3668017 : ABSTRACT SYNTAX TREE:

TOK_LOAD
'/user/hdfs/building.csv'
TOK_TAB
TOK_TABNAME
default.build


STAGE DEPENDENCIES:
Stage-0 is a root stage [MOVE]
Stage-1 depends on stages: Stage-0 [STATS]

STAGE PLANS:
Stage: Stage-0
Move Operator
tables:
replace: false
source: hdfs://master.cluster.com:8020/user/hdfs/building.csv
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
properties:
bucket_count -1
columns buildingid,buildingmgr,buildingage,hvacproduct,country
columns.comments
columns.types int:string:int:string:string
field.delim ,
file.inputformat org.apache.hadoop.mapred.TextInputFormat
file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
location hdfs://master.cluster.com:8020/user/hive/warehouse/build
name default.build
serialization.ddl struct build { i32 buildingid, string buildingmgr, i32 buildingage, string hvacproduct, string country}
serialization.format ,
serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
transient_lastDdlTime 1420906411
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.build

Stage: Stage-1
Stats-Aggr Operator


15/01/10 20:59:01 INFO log.PerfLogger: </PERFLOG method=compile start=1420941541329 end=1420941541375 duration=46 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO lockmgr.DummyTxnManager: Creating lock manager of type org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/01/10 20:59:01 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master.cluster.com:2181 sessionTimeout=600000 watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@3e4e92e3
15/01/10 20:59:01 INFO log.PerfLogger: </PERFLOG method=acquireReadWriteLocks start=1420941541377 end=1420941541539 duration=162 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO ql.Driver: Starting command: LOAD DATA INPATH '/user/hdfs/building.csv' INTO TABLE `default.build`
15/01/10 20:59:01 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1420941541377 end=1420941541542 duration=165 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO log.PerfLogger: <PERFLOG method=task.MOVE.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:01 INFO exec.Task: Loading data to table default.build from hdfs://master.cluster.com:8020/user/hdfs/building.csv
15/01/10 20:59:01 INFO metadata.Hive: Renaming src: hdfs://master.cluster.com:8020/user/hdfs/building.csv, dest: hdfs://master.cluster.com:8020/user/hive/warehouse/build/building.csv, Status:true
15/01/10 20:59:02 INFO log.PerfLogger: <PERFLOG method=task.STATS.Stage-1 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:02 INFO exec.StatsTask: Executing stats task
15/01/10 20:59:03 INFO exec.Task: Table default.build stats: [numFiles=1, numRows=0, totalSize=544, rawDataSize=0]
15/01/10 20:59:03 INFO log.PerfLogger: </PERFLOG method=runTasks start=1420941541542 end=1420941543089 duration=1547 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO log.PerfLogger: <PERFLOG method=PostHook.com.cloudera.navigator.audit.hive.HiveExecHookContext from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO log.PerfLogger: </PERFLOG method=PostHook.com.cloudera.navigator.audit.hive.HiveExecHookContext start=1420941543090 end=1420941543091 duration=1 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1420941541539 end=1420941543091 duration=1552 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO ql.Driver: OK
15/01/10 20:59:03 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO ZooKeeperHiveLockManager: about to release lock for default/build
15/01/10 20:59:03 INFO ZooKeeperHiveLockManager: about to release lock for default
15/01/10 20:59:03 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1420941543092 end=1420941543129 duration=37 from=org.apache.hadoop.hive.ql.Driver>
15/01/10 20:59:03 INFO log.PerfLogger: </PERFLOG method=Driver.run start=1420941541377 end=1420941543130 duration=1753 from=org.apache.hadoop.hive.ql.Driver>


Re: table import failed - MetaStore Tables

We would need the logs from the run of the initial create + load data; the problem could be in Hue, Hive, or the HDFS configuration, or a bug.

Romain


Re: table import failed - MetaStore Tables

Explorer

Can you let me know which directories to look in, so that I can give you the correct information?

I installed from CM5.

Re: table import failed - MetaStore Tables

Explorer

Hue Log - 

 

[12/Jan/2015 20:48:39 -0800] upload ERROR Left-over upload file is not cleaned up: /user/hue/HVAC1.csv.tmp
[12/Jan/2015 20:49:01 -0800] base ERROR Internal Server Error: /beeswax/create/auto_load/default
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/build/env/lib/python2.6/site-packages/Django-1.4.5-py2.6.egg/django/core/handlers
response = callback(request, *callback_args, **callback_kwargs)
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/apps/beeswax/src/beeswax/create_table.py", line 471, in load_after_create
remove_header(request.fs, path)
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/desktop/libs/hadoop/src/hadoop/fs/fsutils.py", line 53, in remove_header
_do_overwrite(fs, path, copy_data)
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/desktop/libs/hadoop/src/hadoop/fs/fsutils.py", line 70, in _do_overwrite
copy_data(path_dest)
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/desktop/libs/hadoop/src/hadoop/fs/fsutils.py", line 51, in copy_data
fs.copyfile(path, path_dest, skip_header=True)
File "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py", line 564, in copyfile
n = data.index('\n')
ValueError: substring not found
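The traceback above ends in Hue's WebHDFS `copyfile` with `skip_header=True`: it reads a chunk of the uploaded file and looks for the first newline so it can strip the header row. In Python, `str.index` raises `ValueError` when the substring is absent (unlike `str.find`, which returns -1), so a chunk containing no newline at all produces exactly this error. A minimal sketch of the failure mode (not the actual Hue code; the function name is illustrative):

```python
def skip_header(chunk):
    """Return the chunk with its first line removed.

    Mirrors the failing call in webhdfs.copyfile: if the chunk read from
    the uploaded file contains no '\n' at all, str.index raises
    ValueError("substring not found") instead of returning -1.
    """
    n = chunk.index('\n')  # raises ValueError when '\n' is absent
    return chunk[n + 1:]

print(skip_header("Date,Time\n6/1/13,0:00:01"))  # -> 6/1/13,0:00:01
# skip_header("Date,Time")  # would raise ValueError: substring not found
```

So one plausible trigger is a file (or first chunk of a file) whose header handling never finds a newline, e.g. unusual line endings or a single-line file.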

 

 

 

CSV File - 

Date,Time,TargetTemp,ActualTemp,System,SystemAge,BuildingID

6/1/13,0:00:01,66,58,13,20,4

6/2/13,1:00:01,69,68,3,20,17

6/3/13,2:00:01,70,73,17,20,18

6/4/13,3:00:01,67,63,2,23,15


Re: table import failed - MetaStore Tables

I can't recreate this error with

Date,Time,TargetTemp,ActualTemp,System,SystemAge,BuildingID
6/1/13,0:00:01,66,58,13,20,4
6/2/13,1:00:01,69,68,3,20,17
6/3/13,2:00:01,70,73,17,20,18
6/4/13,3:00:01,67,63,2,23,15

Do you have any extra newlines in the file?

I do see a small problem, though, where the data is loaded but Hue does not redirect to the table page (so you need to open the /metastore/tables/ page manually).

Romain


Re: table import failed - MetaStore Tables

Explorer

Hi,

 

I don't have any extra newlines.


Re: table import failed - MetaStore Tables

Could you share the file? I can't reproduce the error when I copy-paste the CSV you posted.

Romain
