import csv data into hive table orc format

Expert Contributor

Hello,

Is it possible to import data from a CSV file into a Hive table stored in ORC format?

Thanks

1 ACCEPTED SOLUTION

@alain TSAFACK

You can load the data from the CSV file into a temporary Hive table that has the same structure as the ORC table, then insert the data into the ORC table:

insert into table table_orc select * from table_textfile;
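
The full sequence could look roughly like the sketch below. It keeps the table names table_textfile and table_orc from the statement above; the three columns, the comma delimiter without quoted fields, and the HDFS path /tmp/data.csv are assumptions made only for illustration.

```sql
-- Staging table in plain text format (hypothetical columns).
CREATE TABLE table_textfile (
  id     INT,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Move the CSV file (hypothetical HDFS path) into the staging table.
LOAD DATA INPATH '/tmp/data.csv' INTO TABLE table_textfile;

-- Target table with the same structure, stored as ORC.
CREATE TABLE table_orc (
  id     INT,
  name   STRING,
  amount DOUBLE
)
STORED AS ORC;

-- Rewrite the staged rows into ORC.
INSERT INTO TABLE table_orc SELECT * FROM table_textfile;
```

If the CSV file starts with a header row, the table property skip.header.line.count on the staging table can be used to skip it.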

Thanks and Regards,

Sindhu

Expert Contributor

Thank you, but I would like to go directly from the CSV file to the Hive ORC table without creating the intermediate text-format table.

Thanks

Super Collaborator

Hi @alain

One more way:

3-Step Method

Step 1: Create an external table that points to an HDFS location and matches the schema of your CSV file. You can then drop the CSV file(s) into the external table's location.

Step 2: Create a managed Hive table in ORC format.

Step 3: Run an INSERT INTO the managed table with a SELECT from the external table. (Once the records are copied, delete the files from the external directory.)
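
The three steps could be sketched in HiveQL as follows. The table names csv_staging and sales_orc, the columns, and the HDFS directory /data/staging/csv_in are hypothetical, and the sketch assumes comma-separated files without quoted fields.

```sql
-- Step 1: external table over the directory where the CSV files are dropped.
CREATE EXTERNAL TABLE csv_staging (
  id     INT,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/staging/csv_in';

-- Step 2: managed table in ORC format with the same columns.
CREATE TABLE sales_orc (
  id     INT,
  name   STRING,
  amount DOUBLE
)
STORED AS ORC;

-- Step 3: copy the rows from the external table into the ORC table.
INSERT INTO TABLE sales_orc SELECT * FROM csv_staging;
```

Because the staging table is external, dropping it later leaves the CSV files in place, so the cleanup of the directory mentioned in Step 3 remains a separate action.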

This process can be automated with scripts run via Oozie or cron. I have used it for mass batch ingestion.

A more recent way of doing this is to use Apache NiFi with its Hive table processor, which makes life much simpler :). If you want to read about NiFi, please go to

http://hortonworks.com/products/hdf/

Thanks

Satish

Expert Contributor

@alain TSAFACK The Ambari Hive View provides this feature (Upload Table), which lets you upload a CSV file directly into an ORC Hive table. (Internally, it takes care of the two-step process of creating the ORC table.)

Explorer

When uploading a CSV file containing "\N", I get the string value "N" instead of NULL in Hive.

Can someone help me solve this?

https://github.com/ogrodnek/csv-serde/issues/15

Explorer

I have hard-coded a change in the class org.apache.hadoop.hive.serde2.OpenCSVSerde, but it doesn't take effect when I replace the old jar "/usr/hdp/current/hive-client/lib/hive-serde-1.2.1.2.3.0.0-2557.jar". What should I do to make the new jar work?

@Override
public Object deserialize(final Writable blob) throws SerDeException {
  Text rowText = (Text) blob;
  String text = rowText.toString().replace("\\N","\"\"");
  CSVReader csv = null;
  try {
    csv = newReader(new CharArrayReader(text.toCharArray()), separatorChar,
            quoteChar, escapeChar);