Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

It is possible insert data into a table A from another B by subquery in HIVE?

Explorer

Hello,

I want execute the follow sql :

INSERT INTO TABLE db_h_gss.tb_h_teste_insert values( teste_2, teste_3, teste_1, PARTITION (cod_index=1) ) from SELECT CAST(teste_1 AS VARCHAR(200)), CAST(teste_2 AS VARCHAR(200)), CAST(teste_3 AS VARCHAR(200)), FROM db_h_gss.tb_h_teste_orig limit 10;

It's possible, insert with subquery? This command doesnt work, i tried many ways and none worked.

This is the create table (destination)

CREATE TABLE db_h_gss.tb_h_teste_insert (teste_1 VARCHAR(200), teste_2 VARCHAR(200), teste_3 VARCHAR(200) ) PARTITIONED BY(cod_index int ) STORED AS ORC;

Thanks

14 REPLIES 14

Yup possible. Remove the VALUES, remove the FROM:

Syntax is:

INSERT INTO TABLE xxx partiton ( xxx ) SELECT xxx;

You don't need to specify any columns or data types. It just needs to fit to your target table. ( all columns need to match, the only potential issue is the partition column. For example if you specify a specific partition you cannot have the partition column in the select clause, if you specify dynamic partitioning you need the partition columns )

You mixed up the from query and values parts.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Insertingdatai...

Explorer

Thanks Leonard, but, that is my problem, im looking a way to insert specifying the fields\columns in the target table.

It is to improve the display command.

The idea is to show that:

Column A in target table corresponds to column A in table orig

Thanks again.

The columns and orders just need to match. Not sure about the question.

Let's assume

Table A ( daily data )

col a, col b, col c, col dt

Table B ( fact table with subset of columns reordered )

col c, col a partitioned by dt

So if you know the date is only today you can just select the columns you need in the order you need them:

INSERT INTO B PARTITION (dt=1805 ) SELECT c,a from A;

or if you want dynamic partitoning you need to add the partition column at the end:

INSERT INTO B PARTITION (dt ) SELECT c,a,dt from A;

Explorer

I already did the insert, this command worked :

INSERT OVERWRITE TABLE db_h_gss.tb_h_teste_insert PARTITION (cod_index=1) SELECT CAST(teste_1 AS VARCHAR(200)), CAST(teste_2 AS VARCHAR(200)), CAST(teste_3 AS VARCHAR(200)) FROM db_h_gss.tb_h_teste_orig limit 100;

But, now, im looking another way to insert data in target table(tb_h_teste_insert), where I show the fields\columns in the target table and the fields\columns corresponding of the source table.

Thanks again.

"But, now, im looking another way to insert data in target table(tb_h_teste_insert), where I show the fields\columns in the target table and the fields\columns corresponding of the source table.

I didn't understand the question at all. What do you want to do exactly? Quizas mejor en espanol?

Expert Contributor

Explorer

i will try explain what i want.

case 1:

INSERT INTO TABLE db_h_gss.tb_h_teste_insert

PARTITION (cod_index=1)

SELECT

CAST(teste_1 AS VARCHAR(200)), CAST(teste_2 AS VARCHAR(200)), CAST(teste_3 AS VARCHAR(200))

FROM db_h_gss.tb_h_teste_orig;

This command works, but i cant see which column "teste_1"(tb_h_teste_orig) corresponds in target table(tb_h_teste_insert)

case 2 :

INSERT INTO TABLE db_h_gss.tb_h_teste_insert

values(

teste_1, teste_2, teste_3, PARTITION (cod_index=1)

)

from

SELECT

CAST(teste_1 AS VARCHAR(200)),

CAST(teste_2 AS VARCHAR(200)),

CAST(teste_3 AS VARCHAR(200))

FROM db_h_gss.tb_h_teste_orig ;

This command doesnt work, but, you can see that i'm trying insert in "teste_1"(target table) the column "teste_1" from tb_h_teste_orig.

Sorry my english, but my spanish is worse.

thanks again

haha, no worries. I still don't get the question, if you were missing some of the columns in the target table Eugene's link would help. But you don't. You have all of the columns of the target table so I do not get what you are trying to achieve. Your query already works no? If you need to change the order of columns you can just specify them in a different order?

So what exactly is better in your first query than in your second query apart from adding additional redundant information?

Explorer

Somehow, thats it i want, redundancy, only to improve the display command.

Im reading the Eugene's link right now.

Yeah I still don't get what you get out of it, you could just add a comment to it with the column names of the target table. But I don't see any functional enhancement.

Regarding Eugene's link I just tried it, essentially the trick is if your target table has more columns than your select and you want to set the remainder null you can now select a subset of columns in the target table. Like:

insert into test30 (id) select * from test31;

insert into test34 (id_target) select id from test31;

In this case any additional column in the target table is null.

However I just tried it with partitioned tables and it doesn't seem to work for me. Looks like a bug. You could open a support case if that doesn't work. However for your usecase where you have all columns in both tables I really don't see the point.

insert into table test32(id) partition(dt=3) select * from test31;

NoViableAltException(192@[])

at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41226)

at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40421)

...

Expert Contributor

it should be insert into table test32 partition(dt=3) (id) select * from test31;

ah thanks.

Explorer

Well, i give up, i will use the command that works....heheh

thanks to all...

New Contributor

Hello! The WITH HQL clause (inherited from SQL) may help you be a bit more explicit in your transformations and accomplish something similar to your case 2:

WITH TABLE1 
AS
(
	SELECT 
	cod_index,
	CAST(teste_1 AS VARCHAR(200)), 
	CAST(teste_2 AS VARCHAR(200)), 
	CAST(teste_3 AS VARCHAR(200))
	FROM db_h_gss.tb_h_teste_orig
)
INSERT INTO TABLE db_h_gss.tb_h_teste_insert PARTITION (cod_index = 1)
SELECT
	teste_1,
	teste_2,
	teste_3
FROM TABLE1 WHERE cod_index = 1;
Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.