Created 03-03-2016 04:30 PM
sample file(1 line):-
DIF|2005||CC|Customer Connect|
Table creation:-
STORED AS TEXTFILE location 'hdfs:///user/test';
what to do solve this Error?
Created 03-03-2016 04:48 PM
What is the error you are receiving ? For any text file separated by 'I' you can use following properties while creating Hive table
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
and remove
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( "separatorChar" = " ", "quoteChar" = '"', "escapeChar" = "\\" )
That should do the trick.
Created 03-03-2016 04:56 PM
> ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> STORED AS TEXTFILE
> location 'hdfs:///sandbox/vpanshe';
FAILED: ParseException line 59:1 missing EOF at 'STORED' near ''org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat''
Created 03-03-2016 07:10 PM
Can you post your complete create table statement ?
Created 03-03-2016 08:26 PM
@mike pal Can you try following statement ?
CREATE EXTERNAL TABLE IF NOT EXISTS db.tablename( SOURCE_ID VARCHAR(30) , SOURCE_ID_TYPE VARCHAR(30) , SOURCE_NAME VARCHAR(30) , DEVICE_ID_1 VARCHAR(30) , DEVICE_ID_TYPE_1 VARCHAR(30) , DEVICE_ID_2 VARCHAR(30) , DEVICE_ID_TYPE_2 VARCHAR(30) , EVENT_TYPE VARCHAR(30) , EVENT_NAME VARCHAR(100) , EVENT_IDENTIFIER VARCHAR(30) , OCCURRENCE_TIME TIMESTAMP , DETECTION_TIME TIMESTAMP , REQUEST_ID VARCHAR(256) , TRANSACTION_ID VARCHAR(100) , HOSTNAME VARCHAR(30) , CATEGORY VARCHAR(30) , CHANNEL_LANG_TYPE VARCHAR(30) , ACCESS_ID VARCHAR(30) , ACCESS_TYPE VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_1 VARCHAR(30) , INVOLVED_PARTY_ID_1 VARCHAR(30) , CARD_NUMBER_1 VARCHAR(30) , INVOLVED_PARTY_TYPE_1 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_1 VARCHAR(30) , AUTHENTICATION_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_1 VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_2 VARCHAR(30) , INVOLVED_PARTY_ID_2 VARCHAR(30) , CARD_NUMBER_2 VARCHAR(30) , INVOLVED_PARTY_TYPE_2 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_2 VARCHAR(30) , AUTHENTICATION_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_2 VARCHAR(30) , CARD_NUMBER VARCHAR(30) , CUSTOMER_ID VARCHAR(30) , CUSTOMER_NAME VARCHAR(100) , CREDENTIAL_TYPE VARCHAR(30) , AUTHENTICATION_METHOD_TYPE VARCHAR(30) , AUTHENTICATION_TIMESTAMP TIMESTAMP , ELECTRONIC_DELIVERY_DEVICE_TYPE VARCHAR(30) , ELECTRONIC_DELIVERY_DEVICE_ID VARCHAR(30) , OPERATOR_ID VARCHAR(8) , OPERATOR_NAME VARCHAR(30) , BRANCH_NUMBER VARCHAR(8) , SOURCE_SYSTEM_ID VARCHAR(30) , SOURCE_SYSTEM_CODE VARCHAR(30) , SOURCE_SYSTEM_NAME VARCHAR(60) , RANK_NUMBER VARCHAR(30)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
location 'hdfs:///user/test';
Created 03-03-2016 09:01 PM
Remove the new line characters in the end. I.e. merge the last three lines. I know it sounds stupid but Hive seems to have a bug sometimes that throws EOF errors.
I could run the following:
CREATE EXTERNAL TABLE IF NOT EXISTS tablename( SOURCE_ID VARCHAR(30) , SOURCE_ID_TYPE VARCHAR(30) , SOURCE_NAME VARCHAR(30) , DEVICE_ID_1 VARCHAR(30) , DEVICE_ID_TYPE_1 VARCHAR(30) , DEVICE_ID_2 VARCHAR(30) , DEVICE_ID_TYPE_2 VARCHAR(30) , EVENT_TYPE VARCHAR(30) , EVENT_NAME VARCHAR(100) , EVENT_IDENTIFIER VARCHAR(30) , OCCURRENCE_TIME TIMESTAMP , DETECTION_TIME TIMESTAMP , REQUEST_ID VARCHAR(256) , TRANSACTION_ID VARCHAR(100) , HOSTNAME VARCHAR(30) , CATEGORY VARCHAR(30) , CHANNEL_LANG_TYPE VARCHAR(30) , ACCESS_ID VARCHAR(30) , ACCESS_TYPE VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_1 VARCHAR(30) , INVOLVED_PARTY_ID_1 VARCHAR(30) , CARD_NUMBER_1 VARCHAR(30) , INVOLVED_PARTY_TYPE_1 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_1 VARCHAR(30) , AUTHENTICATION_TYPE_1 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_1 VARCHAR(30) , MULTI_FACTOR_AUTHENTICATION_INDICATOR_2 VARCHAR(30) , INVOLVED_PARTY_ID_2 VARCHAR(30) , CARD_NUMBER_2 VARCHAR(30) , INVOLVED_PARTY_TYPE_2 VARCHAR(30) , CARD_ACCESS_FACILITY_ARRANGEMENT_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ROLE_TYPE_2 VARCHAR(30) , AUTHENTICATION_TYPE_2 VARCHAR(30) , INVOLVED_PARTY_ACTIVE_DIRECTORY_ID_2 VARCHAR(30) , CARD_NUMBER VARCHAR(30) , CUSTOMER_ID VARCHAR(30) , CUSTOMER_NAME VARCHAR(100) , CREDENTIAL_TYPE VARCHAR(30) , AUTHENTICATION_METHOD_TYPE VARCHAR(30) , AUTHENTICATION_TIMESTAMP TIMESTAMP , ELECTRONIC_DELIVERY_DEVICE_TYPE VARCHAR(30) , ELECTRONIC_DELIVERY_DEVICE_ID VARCHAR(30) , OPERATOR_ID VARCHAR(8) , OPERATOR_NAME VARCHAR(30) , BRANCH_NUMBER VARCHAR(8) , SOURCE_SYSTEM_ID VARCHAR(30) , SOURCE_SYSTEM_CODE VARCHAR(30) , SOURCE_SYSTEM_NAME VARCHAR(60) , RANK_NUMBER VARCHAR(30)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE location 'hdfs:///user/test';
Created 03-13-2016 03:59 AM
@mike pal.. there shouldnt be any need of specifying INPUTFORMAT and OUTFORMAT, you can simply avoid this extra work and just use STORED AS TEXTFILE to expose the text file in HIVE. in most cases TEXTFILE is the default file format, unless the configuration parameterhive.default.fileformat has a different setting...