Support Questions

zack_riesland · ‎11-16-2016

I am trying to accomplish something like this:

insert into table daily_counts

values

(

select count(*) from table_a,

select count(*) from table_b

)

...etc.

I know the syntax works for known values. How can I enter the result of a sub-query?

I currently accomplish this in java via jdbc, by running all the queries individually, and then parsing together the final insert statement.

But it MUST be possible with HiveQL, right?

bpreachuk · ‎11-16-2016

Hi @Zack Riesland. Here's how I normally do this. It's not specifically a subquery but accomplishes what you're looking for.

insert into table daily_counts
select count(*), 'table_a' from table_a
   UNION 
select count(*), 'table_b' from table_b
   UNION
select count(*), 'table_c' from table_c
...
;

View solution in original post

bpreachuk · ‎11-16-2016

Hi @Zack Riesland. Here's how I normally do this. It's not specifically a subquery but accomplishes what you're looking for.

insert into table daily_counts
select count(*), 'table_a' from table_a
   UNION 
select count(*), 'table_b' from table_b
   UNION
select count(*), 'table_c' from table_c
...
;

bpreachuk · ‎11-16-2016

Edited the post to fix syntax error. Now it runs just fine. 😉

zack_riesland · ‎11-18-2016

Thanks, @bpreachuk

Interestingly, I think we found a major Hive bug.

When I run a query like the one above, it appears to enter an infinite loop.

I even pared my data set down to 3 columns - counting 3 tables.

And it appears to spawn MR jobs indefinitely.

I worked around it. But someone should look at the UNION operation.

Happy Friday!

bpreachuk · ‎11-18-2016

Yikes! Have never seen that happen before, but I certainly have no reason to doubt you. Does it happen with hive.execution.engine=tez? If you could grab & sanitize your query/config details & post that as a Hive bug in Jira it would be greatly appreciated... we don't want that problem to bite anyone else.

Cloudera Community

Support Questions

How to insert individual rows into hive based on subqueries