Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

I am making hive external table from CSV file delimiter as "|" but I am getting error can you help me out of this?

avatar
Rising Star

sample file(1 line):-

DIF|2005||CC|Customer Connect|

Table creation:-

 STORED AS TEXTFILE
 location 'hdfs:///user/test';

what to do solve this Error?

6 REPLIES 6

avatar
@mike pal

What is the error you are receiving ? For any text file separated by 'I' you can use following properties while creating Hive table

STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

and remove

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ( "separatorChar" = " ", "quoteChar" = '"', "escapeChar" = "\\" )

That should do the trick.

avatar
Rising Star

> ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'

> STORED AS INPUTFORMAT

> 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

> STORED AS TEXTFILE

> location 'hdfs:///sandbox/vpanshe';

FAILED: ParseException line 59:1 missing EOF at 'STORED' near ''org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat''

avatar
@mike pal

Can you post your complete create table statement ?

avatar

@mike pal Can you try following statement ?

CREATE EXTERNAL TABLE IF NOT EXISTS db.tablename( SOURCE_ID VARCHAR(30) , SOURCE_ID_TYPE VARCHAR(30) , SOURCE_NAME VARCHAR(30) , DEVICE_ID_1 VARCHAR(30) , DEVICE_ID_TYPE_1 VARCHAR(30) , DEVICE_ID_2 VARCHAR(30) , DEVICE_ID_TYPE_2 VARCHAR(30) , EVENT_TYPE VARCHAR(30) , EVENT_NAME VARCHAR(100) , EVENT_IDENTIFIER VARCHAR(30) , OCCURRENCE_TIME TIMESTAMP , DETECTION_TIME TIMESTAMP , REQUEST_ID VARCHAR(256) , TRANSACTION_ID VARCHAR(100) , HOSTNAME VARCHAR(30) , CATEGORY VARCHAR(30) , CHANNEL_LANG_TYPE VARCHAR(30) , ACCESS_ID VARCHAR(30) , ACCESS_TYPE VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_1 VARCHAR(30) , INVOLVED_PARTY_ID_1 VARCHAR(30) , CARD_NUMBER_1 VARCHAR(30) , INVOLVED_PARTY_TYPE_1 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_1 VARCHAR(30) , AUTHENTICATION_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_1 VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_2 VARCHAR(30) , INVOLVED_PARTY_ID_2 VARCHAR(30) , CARD_NUMBER_2 VARCHAR(30) , INVOLVED_PARTY_TYPE_2 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_2 VARCHAR(30) , AUTHENTICATION_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_2 VARCHAR(30) , CARD_NUMBER VARCHAR(30) , CUSTOMER_ID VARCHAR(30) , CUSTOMER_NAME VARCHAR(100) , CREDENTIAL_TYPE VARCHAR(30) , AUTHENTICATION_METHOD_TYPE VARCHAR(30) , AUTHENTICATION_TIMESTAMP TIMESTAMP , ELECTRONIC_DELIVERY_DEVICE_TYPE VARCHAR(30) , ELECTRONIC_DELIVERY_DEVICE_ID VARCHAR(30) , OPERATOR_ID VARCHAR(8) , OPERATOR_NAME VARCHAR(30) , BRANCH_NUMBER VARCHAR(8) , SOURCE_SYSTEM_ID VARCHAR(30) , SOURCE_SYSTEM_CODE VARCHAR(30) , SOURCE_SYSTEM_NAME VARCHAR(60) , RANK_NUMBER VARCHAR(30)

)

ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'

STORED AS TEXTFILE

location 'hdfs:///user/test';

avatar
Master Guru

Remove the new line characters in the end. I.e. merge the last three lines. I know it sounds stupid but Hive seems to have a bug sometimes that throws EOF errors.

I could run the following:

CREATE EXTERNAL TABLE IF NOT EXISTS tablename( SOURCE_ID VARCHAR(30) , SOURCE_ID_TYPE VARCHAR(30) , SOURCE_NAME VARCHAR(30) , DEVICE_ID_1 VARCHAR(30) , DEVICE_ID_TYPE_1 VARCHAR(30) , DEVICE_ID_2 VARCHAR(30) , DEVICE_ID_TYPE_2 VARCHAR(30) , EVENT_TYPE VARCHAR(30) , EVENT_NAME VARCHAR(100) , EVENT_IDENTIFIER VARCHAR(30) , OCCURRENCE_TIME TIMESTAMP , DETECTION_TIME TIMESTAMP , REQUEST_ID VARCHAR(256) , TRANSACTION_ID VARCHAR(100) , HOSTNAME VARCHAR(30) , CATEGORY VARCHAR(30) , CHANNEL_LANG_TYPE VARCHAR(30) , ACCESS_ID VARCHAR(30) , ACCESS_TYPE VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_1 VARCHAR(30) , INVOLVED_PARTY_ID_1 VARCHAR(30) , CARD_NUMBER_1 VARCHAR(30) , INVOLVED_PARTY_TYPE_1 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_1 VARCHAR(30) , AUTHENTICATION_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_1 VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_2 VARCHAR(30) , INVOLVED_PARTY_ID_2 VARCHAR(30) , CARD_NUMBER_2 VARCHAR(30) , INVOLVED_PARTY_TYPE_2 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_2 VARCHAR(30) , AUTHENTICATION_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_2 VARCHAR(30) , CARD_NUMBER VARCHAR(30) , CUSTOMER_ID VARCHAR(30) , CUSTOMER_NAME VARCHAR(100) , CREDENTIAL_TYPE VARCHAR(30) , AUTHENTICATION_METHOD_TYPE VARCHAR(30) , AUTHENTICATION_TIMESTAMP TIMESTAMP , ELECTRONIC_DELIVERY_DEVICE_TYPE VARCHAR(30) , ELECTRONIC_DELIVERY_DEVICE_ID VARCHAR(30) , OPERATOR_ID VARCHAR(8) , OPERATOR_NAME VARCHAR(30) , BRANCH_NUMBER VARCHAR(8) , SOURCE_SYSTEM_ID VARCHAR(30) , SOURCE_SYSTEM_CODE VARCHAR(30) , SOURCE_SYSTEM_NAME VARCHAR(60) , RANK_NUMBER VARCHAR(30)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'   STORED AS TEXTFILE location 'hdfs:///user/test';

avatar
Contributor

@mike pal.. there shouldnt be any need of specifying INPUTFORMAT and OUTFORMAT, you can simply avoid this extra work and just use STORED AS TEXTFILE to expose the text file in HIVE. in most cases TEXTFILE is the default file format, unless the configuration parameterhive.default.fileformat has a different setting...