08-21-2017
02:40 AM
Has this been solved? Or has anyone seen a workaround?
08-20-2017
08:28 AM
Hi, I recently ran into an issue where we need to use a multi-character field delimiter. In Hive, using the org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe SerDe works great.

Data sample:

mandt,description,systemid
090,no comma 01,10
090,this is a, test,10
090,we can see~1,d,10
090,comma,commacomma,,10
090,no comma 02,10

Table created:

CREATE EXTERNAL TABLE `amt_multi`(
`mandt` varchar(3) COMMENT 'from deserializer',
`description` varchar(200) COMMENT 'from deserializer',
`systemid` int COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
'field.delim'='<|>',
'line.delim'='/n')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://hdfsha1/DEV/Raw_STAGING/Stg_GIS/multi'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='false',
'numFiles'='0',
'numRows'='-1',
'rawDataSize'='-1',
'skip.header.line.count'='1',
'totalSize'='0',
'transient_lastDdlTime'='1503183208')

But when querying this same table from Impala, Impala throws an error:

AnalysisException: Failed to load metadata for table: 'amt_multi'
CAUSED BY: TableLoadingException: Failed to load metadata for table: amt_multi
CAUSED BY: InvalidStorageDescriptorException: Invalid delimiter: '<|>'. Delimiter must be specified as a single character or as a decimal value in the range [-128:127]

So my question is: can Impala support a multi-character delimiter for text data? And if so, how does one do this?

Thanks
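
Since the error says the delimiter must be a single character, one possible workaround (a sketch only, not something confirmed in this thread) is to rewrite the files so that '<|>' becomes a single character, and then point a separate table with a single-character delimiter at the rewritten files. The Python below illustrates the rewrite step. The replacement character '\x01', the function names, and the sample rows (the data sample above, assuming its fields are really separated by '<|>') are all assumptions made for illustration.

```python
# Sketch: rewrite a multi-character field delimiter to one single character.
# MULTI_DELIM is the table's 'field.delim'; SINGLE_DELIM is an assumed choice.
MULTI_DELIM = "<|>"
SINGLE_DELIM = "\x01"  # Ctrl-A; assumed not to occur anywhere in the data


def convert_line(line: str) -> str:
    """Replace every multi-character delimiter in one record with SINGLE_DELIM."""
    if SINGLE_DELIM in line:
        # Refuse to produce ambiguous output if the replacement already occurs.
        raise ValueError("replacement delimiter already present in record")
    return line.replace(MULTI_DELIM, SINGLE_DELIM)


def convert_lines(lines):
    """Convert an iterable of records, yielding the rewritten records."""
    for line in lines:
        yield convert_line(line)


# Hypothetical rows: the poster's sample, assuming '<|>' separates the fields.
sample = [
    "090<|>no comma 01<|>10",
    "090<|>this is a, test<|>10",
]
converted = list(convert_lines(sample))
fields = [row.split(SINGLE_DELIM) for row in converted]
```

Commas inside the description column are left untouched, because only the exact '<|>' sequence is replaced. Whether this is practical depends on file sizes and on being sure that the chosen replacement character never appears in the source data.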
Labels:
- Apache Impala
- HDFS