Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

hive vectorization union all problem

hive vectorization union all problem

New Contributor

I am running a map only query like "select name from personel where name='aaa'" on a orc+zlib table on the cluster and if I check the EXPLAIN results I can see execution mode is "Vectorized".

If I run a simple map only union all query like "select name from personel where name='aaa' union all select name from personel where name='bbb'" then vectorization is not active for this query in the EXPLAIN results and query duration gets longer.

What might be the reason this result and what is the relation between union all and vectorization?

2 REPLIES 2
Highlighted

Re: hive vectorization union all problem

@Ozhan Gulen

There are certain known issues with respect to union all and tez vectorization. Could you share the execution plan for union all for more insight on the issue?

Highlighted

Re: hive vectorization union all problem

New Contributor

I'm facing same problem.

HDP 2.6.4 and Hive 2.1 LLAP

set hive.vectorized.execution.enabled=true
DROP TABLE TEMP_01; 
CREATE TEMPORARY TABLE TEMP_01 AS (SELECT 'TEST' AS D1, 'BBB' AS M1); 
SELECT D1 , M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1 
UNION ALL 
SELECT D1 , 'XX' AS M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1

Hive 2.1 LLAP raise exception below as

   java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
java.lang.Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.ambari.view.hive20.resources.jobs.JobService.getOne(JobService.java:147)
	at sun.reflect.GeneratedMethodAccessor552.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
	at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
	at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
	at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1507)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authorization.AmbariAuthorizationFilter.doFilter(AmbariAuthorizationFilter.java:287)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authentication.AmbariDelegatingAuthenticationFilter.doFilter(AmbariDelegatingAuthenticationFilter.java:132)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authorization.AmbariUserAuthorizationFilter.doFilter(AmbariUserAuthorizationFilter.java:91)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:72)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:47)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.view.AmbariViewsMDCLoggingFilter.doFilter(AmbariViewsMDCLoggingFilter.java:54)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.view.ViewThrottleFilter.doFilter(ViewThrottleFilter.java:161)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:125)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:125)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:427)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:212)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:201)
	at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:150)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:370)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:264)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:250)
	at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:309)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:250)
	at org.apache.ambari.view.hive20.HiveJdbcConnectionDelegate.execute(HiveJdbcConnectionDelegate.java:49)
	at org.apache.ambari.view.hive20.actor.StatementExecutor.runStatement(StatementExecutor.java:91)
	at org.apache.ambari.view.hive20.actor.StatementExecutor.handleMessage(StatementExecutor.java:72)
	at org.apache.ambari.view.hive20.actor.HiveActor.onReceive(HiveActor.java:38)
	at akka.actor.UntypedActor$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:376)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:193)
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:278)
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:312)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:517)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:504)
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1497)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1482)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1272)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$ReduceWorkVectorizationNodeProcessor.process(Vectorizer.java:1407)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
	at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeReduceWork(Vectorizer.java:1103)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertReduceWork(Vectorizer.java:966)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:490)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
	at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1482)
	at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:644)
	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:275)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11488)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:292)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:263)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:483)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:358)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1300)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1272)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
	... 15 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:151)
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:160)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeOperator(Vectorizer.java:2595)
	at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1264)
	... 38 more
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException:null
	at sun.reflect.GeneratedConstructorAccessor161.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.ql.exec.OperatorFactory.getVectorOperator(OperatorFactory.java:147)
	... 41 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:The column KEY._col2:0._col0 is not in the vectorization context column map {KEY._col0=0, KEY._col1=1, KEY._col2=2}.
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:413)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:511)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:581)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getAggregatorExpression(VectorizationContext.java:2842)
	at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.<init>(VectorGroupByOperator.java:773)
	... 45 more

set hive.vectorized.execution.enabled=false
DROP TABLE TEMP_01; 
CREATE TEMPORARY TABLE TEMP_01 AS (SELECT 'TEST' AS D1, 'BBB' AS M1); 
SELECT D1 , M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1 
UNION ALL 
SELECT D1 , 'XX' AS M1 , COUNT(DISTINCT D1) AS M2 , COUNT(DISTINCT M1) AS M3 
FROM TEMP_01 GROUP BY D1 , M1

Works well.

Please explain why occurs such different situations.

Don't have an account?
Coming from Hortonworks? Activate your account here