Load several files into HIVE table

Expert Contributor

I'm trying to load many files into a single Hive table. Key details: I'm working with JSON files, and the table structure is:

CREATE EXTERNAL TABLE test1 (
  STATIONS ARRAY<STRING>,
  SCHEMESUSPENDED STRING,
  TIMELOAD TIMESTAMP
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/andres/hive/bixihistorical/';

I need to load around 50 files, all with the same structure. I have tried things like:

LOAD DATA INPATH '/user/andres/datasets/bixi2017/*.json' OVERWRITE INTO TABLE test1;

LOAD DATA INPATH '/user/andres/datasets/bixi2017/*' OVERWRITE INTO TABLE test1;

LOAD DATA INPATH '/user/andres/datasets/bixi2017/' OVERWRITE INTO TABLE test1;

None of the above has worked. Any idea how I should go about this? Thanks so much.

1 ACCEPTED SOLUTION

Expert Contributor

Hi guys, I'm so so .... Well, I just remembered that you can simply create an external table whose location is the folder where all the files with the same structure are stored. That way, all the records are loaded in one shot.

CREATE EXTERNAL TABLE bixi_his (
  STATIONS ARRAY<STRUCT<id:INT, s:STRING, n:STRING, st:STRING, b:STRING, su:STRING, m:STRING, lu:STRING, lc:STRING, bk:STRING, bl:STRING, la:FLOAT, lo:FLOAT, da:INT, dx:INT, ba:INT, bx:INT>>,
  SCHEMESUSPENDED STRING,
  TIMELOAD BIGINT
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/ingenieroandresangel/datasets/bixi2017/';
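
As a quick check that every file in the folder is actually being read, something like the following could be run against the table. This is only a sketch: it assumes the bixi_his table defined above, and the aliases source_file, row_count, stations_view and station are made up for illustration. INPUT__FILE__NAME is a Hive virtual column, and LATERAL VIEW explode() is the usual way to flatten an array column.

```sql
-- Count rows per source file; with ~50 files in the folder,
-- roughly 50 distinct file names would be expected here.
SELECT INPUT__FILE__NAME AS source_file,
       COUNT(*)          AS row_count
FROM bixi_his
GROUP BY INPUT__FILE__NAME;

-- Flatten the STATIONS array so that each station struct becomes its own row,
-- then pick a few struct fields by name.
SELECT b.timeload,
       station.id,
       station.la,
       station.lo
FROM bixi_his b
LATERAL VIEW explode(b.stations) stations_view AS station
LIMIT 10;
```

If the first query lists fewer files than expected, the LOCATION path is the first thing worth double-checking.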

thanks
