Support Questions

nilesh_shrimant · ‎02-01-2017

dchiguruvad · ‎02-01-2017

@Nilesh Shrimant Try to create table in parquet format , and set this config set hive.fetch.task.conversion=more;

https://issues.apache.org/jira/browse/HIVE-11785

hive> create table repo (lvalue int, charstring string) stored as parquet;
OK
Time taken: 0.34 seconds
hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
Loading data to table default.repo
chgrp: changing ownership of 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not belong to hive
Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, rawDataSize=0]
OK
Time taken: 0.732 seconds
hive> set hive.fetch.task.conversion=more;
hive> select * from repo;

Option 2:

There is some info here: http://stackoverflow.com/questions/26339564/handling-newline-character-in-hive

Records in Hive are hard-coded to be terminated by the newline character (even though there is a LINES TERMINATED BY clause, it is not implemented).

Write a custom InputFormat that uses a RecordReader that understands non-newline delimited records. Look at the code for LineReader/LineRecordReader and TextInputFormat.
Use a format other than text/ASCII, like Parquet. I would recommend this regardless, as text is probably the worst format you can store data in anyway.

Cloudera Community

Support Questions

How to extract Hive query output where we have new line character in a column?

How to Extract All Hive Tables DDL

Processing Files in Hive Using Native (Non-UTF8) C...

Write Spark HQL Query output to HDFS

Replace Line Feed character in Nifi

Extract text between two characters

Extract a column from csv

Optimizing Hive queries for ORC formatted tables

Hive insert query optimization

Hive metastore DB Connection verification from Com...

Spark (PySpark) to extract from SQL Server