Bulk INSERT into Hive 2.x with ACID turned on

Hi, it is great to be able to perform CRUD operations on Hive. I have noticed that inserting a single row into Hive takes a few seconds on average. In a real-life scenario, whether streaming, OLTP or OLAP, data will be flowing in at a much faster rate. Is there a way to bulk insert data into Hive? Is there a concept of batching? In other words, how can we insert a bunch of records in one Tez job?


Expert Contributor

You can stage your data somewhere and use "INSERT INTO AcidTable SELECT * FROM ...". All of the staged rows are then written by a single statement rather than one job per row.
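
A rough sketch of that staging pattern in HiveQL follows. The table names (staging_events, acid_events), the columns and the HDFS path are made up for illustration. It also assumes the session is already configured for transactions (for example hive.support.concurrency=true and hive.txn.manager set to DbTxnManager), and that the target follows the Hive 2.x ACID rules of ORC storage, bucketing and transactional=true.

```sql
-- Hypothetical names throughout: staging_events, acid_events, /tmp/staging_events.
-- Plain-text staging table over files that have already landed in HDFS.
CREATE EXTERNAL TABLE staging_events (
  id INT,
  payload STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/staging_events';

-- ACID target: in Hive 2.x it must be ORC, bucketed and flagged transactional.
CREATE TABLE acid_events (
  id INT,
  payload STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- One statement and one transaction, however many rows are staged.
INSERT INTO TABLE acid_events
SELECT id, payload FROM staging_events;

-- For a handful of rows, a multi-row VALUES list is also a single statement.
INSERT INTO TABLE acid_events VALUES
  (1, 'first'),
  (2, 'second'),
  (3, 'third');
```

Either form pays the job start-up and transaction overhead once per statement instead of once per row, which is where most of the per-row latency comes from.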

If the data arrives as a stream, then Streaming Ingest may be appropriate. It has been integrated with Storm, Flume and NiFi.

The LOAD DATA statement is not supported for ACID tables in 2.x.

