Import CSV data into a Hive table in ORC format

Expert Contributor

Hello,

Is it possible to import data from a CSV file into a Hive table in ORC format?

Thanks

1 ACCEPTED SOLUTION

@alain TSAFACK

You can load the data from the CSV file into a temporary Hive table with the same structure as the ORC table, then insert the data into the ORC table:

insert into table table_orc select * from table_textfile;
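For illustration, the whole staging approach might look like the HiveQL sketch below. Only the table names table_textfile and table_orc come from the answer; the two-column schema (id, name), the comma delimiter without quoted fields, and the local path /tmp/data.csv are assumptions to adapt to your own file.

```sql
-- Staging table in plain text format; the columns are hypothetical.
CREATE TABLE table_textfile (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load the CSV file into the staging table (the path is an assumption).
LOAD DATA LOCAL INPATH '/tmp/data.csv' INTO TABLE table_textfile;

-- Target table with the same columns, stored as ORC.
CREATE TABLE table_orc (
  id   INT,
  name STRING
)
STORED AS ORC;

-- Hive writes the selected rows out in ORC format during the insert.
INSERT INTO TABLE table_orc SELECT * FROM table_textfile;
```

Once the insert has finished and you have checked the row counts, the staging table can be dropped.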

Thanks and Regards,

Sindhu

6 REPLIES

Expert Contributor

Thank you, but I would like to go directly from the CSV file to the Hive ORC table, without creating the intermediate text-file table.

Thanks

Super Collaborator

Hi @alain

One more way:

3 Step Method

Step 1: Create an external table that points to an HDFS location and matches the schema of your CSV file. You can then drop the CSV file(s) into the external table's location.

Step 2: Create a managed Hive table in ORC format.

Step 3: Run an insert into the managed table that selects from the external table. (Once the records are copied, delete the files from the external directory.)
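The three steps could be sketched in HiveQL roughly as follows. The table names csv_staging and events_orc, the two-column schema, and the HDFS path /data/staging/csv are all hypothetical, and the sketch assumes simple comma-delimited files without quoted fields.

```sql
-- Step 1: external table over the HDFS directory that receives the CSV files.
-- The columns and the LOCATION path are assumptions.
CREATE EXTERNAL TABLE csv_staging (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/staging/csv';

-- Step 2: managed table with the same columns, stored as ORC.
CREATE TABLE events_orc (
  id   INT,
  name STRING
)
STORED AS ORC;

-- Step 3: copy the rows; Hive converts them to ORC while writing.
INSERT INTO TABLE events_orc SELECT * FROM csv_staging;
```

Because csv_staging is external, dropping it would leave the CSV files in place, which is why the files have to be deleted from the directory separately after each load.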

This process can be automated with scripts run via Oozie or cron. I have used it for large batch ingestion.

A more recent way of doing this is to use Apache NiFi with its Hive table processor, which makes life much simpler. If you want to read about NiFi, please go to

http://hortonworks.com/products/hdf/

Thanks

Satish

Expert Contributor

@alain TSAFACK The Ambari Hive View provides this feature (Upload Table), which lets you upload a CSV file directly into an ORC Hive table. (Internally it takes care of the two-step process to create the ORC table.)

Explorer

When uploading a CSV file containing "\N", I get the string value "N" instead of NULL in Hive.

Can someone help me solve this?

https://github.com/ogrodnek/csv-serde/issues/15

Explorer

I hard-coded a change in the class org.apache.hadoop.hive.serde2.OpenCSVSerde, but it does not take effect when I replace the old jar "/usr/hdp/current/hive-client/lib/hive-serde-1.2.1.2.3.0.0-2557.jar". What should I do to make the new jar work?

@Override
public Object deserialize(final Writable blob) throws SerDeException {
  Text rowText = (Text) blob;
  String text = rowText.toString().replace("\\N","\"\"");
  CSVReader csv = null;
  try {
    csv = newReader(new CharArrayReader(text.toCharArray()), separatorChar,
            quoteChar, escapeChar);