Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

hive vectorization union all problem

avatar
New Contributor

I am running a map only query like "select name from personel where name='aaa'" on a orc+zlib table on the cluster and if I check the EXPLAIN results I can see execution mode is "Vectorized".

If I run a simple map only union all query like "select name from personel where name='aaa' union all select name from personel where name='bbb'" then vectorization is not active for this query in the EXPLAIN results and query duration gets longer.

What might be the reason this result and what is the relation between union all and vectorization?

2 REPLIES 2

avatar
@Ozhan Gulen

There are certain known issues with respect to union all and tez vectorization. Could you share the execution plan for union all for more insight on the issue?

avatar
New Contributor

I'm facing same problem.

HDP 2.6.4 and Hive 2.1 LLAP

set hive.vectorized.execution.enabled=true
DROP TABLE TEMP_01; 
CREATE TEMPORARY TABLE TEMP_01 AS (SELECT 'TEST' AS D1, 'BBB' AS M1); 
SELECT D1 , M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1 
UNION ALL 
SELECT D1 , 'XX' AS M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1

Hive 2.1 LLAP raise exception below as

   java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.ambari.view.hive20.resources.jobs.JobService.getOne(JobService.java:147)
	at sun.reflect.GeneratedMethodAccessor552.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
	at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
	at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
	at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1507)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authorization.AmbariAuthorizationFilter.doFilter(AmbariAuthorizationFilter.java:287)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authentication.AmbariDelegatingAuthenticationFilter.doFilter(AmbariDelegatingAuthenticationFilter.java:132)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authorization.AmbariUserAuthorizationFilter.doFilter(AmbariUserAuthorizationFilter.java:91)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:72)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:47)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.view.AmbariViewsMDCLoggingFilter.doFilter(AmbariViewsMDCLoggingFilter.java:54)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.view.ViewThrottleFilter.doFilter(ViewThrottleFilter.java:161)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:125)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:125)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:427)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:212)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:201)
	at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:150)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:370)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:264)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:250)
	at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:309)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:250)
	at org.apache.ambari.view.hive20.HiveJdbcConnectionDelegate.execute(HiveJdbcConnectionDelegate.java:49)
	at org.apache.ambari.view.hive20.actor.StatementExecutor.runStatement(StatementExecutor.java:91)
	at org.apache.ambari.view.hive20.actor.StatementExecutor.handleMessage(StatementExecutor.java:72)
	at org.apache.ambari.view.hive20.actor.HiveActor.onReceive(HiveActor.java:38)
	at akka.actor.UntypedActor$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:376)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:278)
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:312)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:517)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:504)
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1497)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1482)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1272)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$ReduceWorkVectorizationNodeProcessor.process(Vectorizer.java:1407)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
	at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeReduceWork(Vectorizer.java:1103)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertReduceWork(Vectorizer.java:966)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:490)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1482)
	at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:644)
	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:275)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11488)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:292)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:263)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:483)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:358)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1300)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1272)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
	... 15 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:151)
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:160)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeOperator(Vectorizer.java:2595)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1264)
	... 38 more
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException:null
	at sun.reflect.GeneratedConstructorAccessor161.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:147)
	... 41 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:The column KEY._col2:0._col0 is not in the vectorization context column map {KEY._col0=0, KEY._col1=1, KEY._col2=2}.
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:413)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:511)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:581)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getAggregatorExpression(VectorizationContext.java:2842)
	at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.<init>(VectorGroupByOperator.java:773)
	... 45 more

set hive.vectorized.execution.enabled=false
DROP TABLE TEMP_01; 
CREATE TEMPORARY TABLE TEMP_01 AS (SELECT 'TEST' AS D1, 'BBB' AS M1); 
SELECT D1 , M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1 
UNION ALL 
SELECT D1 , 'XX' AS M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1

Works well.

Please explain why occurs such different situations.