Create Hive tables from CSV files

Contributor

Hi All,

I have been creating Hive tables from CSV files by manually copying the column names and pasting them into a Hive CREATE TABLE script.

However, I have at least 300 CSV files, and I don't want to repeat the same process 300 times. Is there any way to automatically generate the Hive table creation script, using the column headers as column names?

Thanks in advance

5 REPLIES

Re: Create Hive tables from CSV files

Contributor

@dhieru singh Do your CSV files all have the same format? If yes, you might consider using:

CREATE EXTERNAL TABLE table_name (
colA ...
colB..
) 
partitioned by .. 
location ... 

msck repair table table_name
alter table table_name add partition ...

Every time you have a new file, you just run:

msck repair table table_name
alter table table_name add partition ...

Re: Create Hive tables from CSV files

Contributor

@mel mendoza Thanks for the help. Unfortunately, all the CSV files have different columns.

Re: Create Hive tables from CSV files

Super Guru

@dhieru singh,

Another way is to use Ambari: click on HiveView, as shown in the screenshot below.

40627-ambari-view.png

Then click on UploadTable and, if your CSV file is on your local machine, click on Choose File.

40628-upload.png

If you want to take the column names from the headers, click on the gear symbol after the Filetype dropdown.

40629-header.png

The table will get all its column names from the CSV file headers.

Select the database where you want to create the table, and change the table name if you want to.

Then click on the UploadTable button located at the left of the screen.

Re: Create Hive tables from CSV files

Contributor

@Shu Thanks, let me try it and I will update you. Thanks again.

Re: Create Hive tables from CSV files

Contributor

I am facing the same issue right now. My current approach is to have a script read the first line of each of my CSV files and then transform that first-line text into a CREATE TABLE SQL statement.

I wonder whether there is some built-in solution for this in HDP.
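
For anyone taking the script route, here is a minimal Python sketch of the idea. It is not a built-in HDP tool, and it rests on several assumptions: every column is typed as STRING, the files are plain comma-delimited with no quoted fields containing commas, and each table gets its own HDFS directory holding just that CSV. The function names (to_identifier, create_table_sql, ddl_for_directory) and the paths are hypothetical; adjust them to your setup.

```python
import csv
import re
from pathlib import Path


def to_identifier(raw, fallback):
    """Turn a header cell or file name into a lowercase name using only [a-z0-9_]."""
    name = re.sub(r"[^0-9a-z]+", "_", raw.strip().lower()).strip("_")
    if not name:
        return fallback
    # Prefix names that would otherwise start with a digit.
    return f"c_{name}" if name[0].isdigit() else name


def create_table_sql(table, header, location):
    """Build a CREATE EXTERNAL TABLE statement with every column typed as STRING."""
    seen = {}
    columns = []
    for i, cell in enumerate(header, start=1):
        name = to_identifier(cell, f"col_{i}")
        # Disambiguate repeated header names: name, name_2, name_3, ...
        seen[name] = seen.get(name, 0) + 1
        if seen[name] > 1:
            name = f"{name}_{seen[name]}"
        columns.append(f"  `{name}` STRING")
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n"
        + ",\n".join(columns)
        + "\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'\n"
        # Assumption: the CSV keeps its header row, so Hive should skip it.
        "TBLPROPERTIES ('skip.header.line.count'='1');"
    )


def ddl_for_directory(csv_dir, hdfs_root):
    """Yield one statement per *.csv file, reading only the first line of each."""
    for path in sorted(Path(csv_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        if header:
            table = to_identifier(path.stem, "unnamed_table")
            yield create_table_sql(table, header, f"{hdfs_root}/{table}")
```

Printing each statement from ddl_for_directory("csv_files", "/data/csv") into a .hql file would give one script to review and run. If the data contains quoted fields with embedded commas, the delimited row format above will likely split them incorrectly; Hive's OpenCSVSerde may be a better fit in that case.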