@Nilesh Shrimant Try to create table in parquet format , and set this config set hive.fetch.task.conversion=more;
https://issues.apache.org/jira/browse/HIVE-11785
hive> create table repo (lvalue int, charstring string) stored as parquet;
OK
Time taken: 0.34 seconds
hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
Loading data to table default.repo
chgrp: changing ownership of 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not belong to hive
Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, rawDataSize=0]
OK
Time taken: 0.732 seconds
hive> set hive.fetch.task.conversion=more;
hive> select * from repo;
Option 2:
There is some info here: http://stackoverflow.com/questions/26339564/handling-newline-character-in-hive
Records in Hive are hard-coded to be terminated by the newline character (even though there is a LINES TERMINATED BY
clause, it is not implemented).
- Write a custom
InputFormat
that uses a RecordReader
that understands non-newline delimited records. Look at the code for LineReader
/LineRecordReader
and TextInputFormat
. - Use a format other than text/ASCII, like Parquet. I would recommend this regardless, as text is probably the worst format you can store data in anyway.