Support Questions
Find answers, ask questions, and share your expertise

How to insert data into this table

Solved Go to solution
Highlighted

Re: How to insert data into this table

Explorer

so is there no way to load the data automatically from the files coming into a particular directory in hdfs?

Highlighted

Re: How to insert data into this table

Expert Contributor

you can create external table with location & write the text files directly on that path.

eg., create external table staging1(id struct<tid:string,action:string,createdts:timestamp>, cid string, anumber string) row format delimited fields terminated by ',' collection items terminated by '|' stored as textfile LOCATION '/tmp/staging/';

All text files can be directly written at /tmp/staging/ by kafka or flume

If Kafka or flume will be able to generate Avro files, then you can skip the staging table & create external avro table & write avro files directly on to external table location.

Highlighted

Re: How to insert data into this table

Explorer

naresh, you are the man!!! thanks so much!!!

Highlighted

Re: How to insert data into this table

Explorer

any idea what is wrong with this :

CREATE EXTERNAL TABLE staging3

ROW FORMAT SERDE 'org.apache.hadoop.hive.serd2.avro.AvroSerDe'

STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'

OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'

TBLPROPERTIES ('avro.schema.url'='hdfs:///tmp/avroschemas/testtbl.json')

LOCATION '/tmp/staging';

I am getting :

FAILED: ParseException line 7:0 missing EOF at 'LOCATION' near ')'

Highlighted

Re: How to insert data into this table

Expert Contributor

@n c

its because that TBLPROPERTIES should the last one, use the below and it should help:

+++++++++++++++++

CREATE EXTERNAL TABLE staging3

ROW FORMAT SERDE 'org.apache.hadoop.hive.serd2.avro.AvroSerDe'

STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'

OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'

LOCATION '/tmp/staging'

TBLPROPERTIES ('avro.schema.url'='hdfs:///tmp/avroschemas/testtbl.json');

+++++++++++++++++

Highlighted

Re: How to insert data into this table

Explorer

so with a table created as above, how should the data be to be able to load it in? in what format that is? because i am not specifying any delimiters etc. appreciate the insights.

Highlighted

Re: How to insert data into this table

Explorer

@Mahesh Balakrishnan @Naresh P R @Shu

can i have feedback on how the data should be formatted to be loaded (load inpath command) into a table created as :

CREATE EXTERNAL TABLE staging3

ROW FORMAT SERDE 'org.apache.hadoop.hive.serd2.avro.AvroSerDe'

STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'

OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'

LOCATION '/tmp/staging'

TBLPROPERTIES ('avro.schema.url'='hdfs:///tmp/avroschemas/testtbl.json');

The schema is the same as described earlier. But there is no delimiter specified.

Appreciate if you could provide a sample piece of data for this.

Re: How to insert data into this table

Expert Contributor

There are multiple ways to populate avro tables.

1) Insert into avro_table values(<col1>,<col2>..,<colN>) -- This way hive will write avro files.

2) Generating avro files & copying directly to '/tmp/staging', You can read avro documentation to write avro files directly into hdfs path. Avro Reader/Writer APIs will take care of storing & retrieving records, we don't need to explicitly specify delimiters for avro files.

Don't have an account?