- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Is it possible to create materialized view from a lateral view (UTDF) in hive ?
- Labels:
-
Apache Hive
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I want to create a materialized view with startdate or enddate split if there are on multiple day. The solution for do this is use lateral view, the select query is ok but when i try select with materialized view it's doesnt work. Has anyone tried this out or knows why it doesn't work ?
here my code :
create materialized view last_view
as
select
t.machine,
case when pe.i = 0
then t.startdate
else timestamp(date_add (t.startdate,pe.i))
end as startdate,
case when pe.i = datediff(t.enddate,t.startdate)
then t.enddate
else timestamp(date_add (t.startdate,pe.i+1))
end as enddate
from dwh_dev.factmachine2 t
lateral view posexplode(split(space(datediff(t.enddate,t.startdate)),' ')) pe as i,x ;
and the result:
java.sql.SQLException: Error while processing statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException
thanks in advances
Created 02-12-2020 12:50 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Ellyly can you please post the complete stack trace from the HiveServer2 logs?
Created 02-13-2020 01:02 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, here is the complete error :
INFO : Moving data to directory hdfs://lvdcluster/warehouse/tablespace/managed/hive/dwh_dev.db/last_view from hdfs://lvdcluster/warehouse/tablespace/managed/hive/dwh_dev.db/.hive-staging_hive_2020-02-13_09-58-05_548_760277875407322997-3/-ext-10001
INFO : Starting task [Stage-4:DDL] in serial mode
INFO : Starting task [Stage-3:STATS] in serial mode
INFO : Starting task [Stage-5:DDL] in serial mode
ERROR : FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.reexec.ReExecutionOverlayPlugin$LocalHook.run(ReExecutionOverlayPlugin.java:45)
at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296)
at org.apache.hadoop.hive.ql.HookRunner.runFailureHooks(HookRunner.java:283)
at org.apache.hadoop.hive.ql.Driver.invokeFailureHooks(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2434)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Created 02-13-2020 06:08 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Please also share the exact version of either HDP or CDH in use, this can help to determine the code that we need to check and see why we hit NPE.
Cheers
Eric
Created 02-14-2020 12:58 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Here the versions that I use for HDP and Hive:
HDP version : HDP-3.1.4.0 (complete version 3.1.4.0-315)
Hive version : Hive 3.1.0
Created on 02-14-2020 09:02 AM - edited 02-14-2020 09:12 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Ellyly the NPE you got is side effect of Exception you have in stage-3/4/5, can you please upload the Error stack you got before the NPE, I am assuming that you are running hive on tez?
I think ReExecOverlayPlugin failing with NPE and shallowing the exception at client-side (fix for this has been identified as Hive-19204 but you should be able to see the real cause in HS2 logs
Created 02-17-2020 08:21 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
thanks to your advice I found my file hiveserver2 Interactive.log and the error is in task 4-5. I noticed 2 things :
- First, unfortunately before the first NPE error in the file hiveserver2 Interactive.log it is another NPE that we find
- Secondly, in task 3 we have
- Cannot get a table snapshot for factmachine_mv
but I don't thinks that is the error's source.
HiveServeur2Interactive.log:
2020-02-17T15:11:13,525 INFO [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/ numRows 1231370 2020-02-17T15:11:13,525 INFO [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/ rawDataSize 552885130 2020-02-17T15:11:13,525 WARN [HiveServer2-Background-Pool: Thread-5750]: metadata.Hive (Hive.java:alterTable(778)) - Cannot get a table snapshot for factmachine_mv 2020-02-17T15:11:13,609 INFO [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (SessionState.java:printInfo(1227)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130] 2020-02-17T15:11:13,609 INFO [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (BasicStatsTask.java:aggregateStats(271)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130] 2020-02-17T15:11:13,620 INFO [HiveServer2-Background-Pool: Thread-5750]: mapred.FileInputFormat (FileInputFormat.java:listStatus(259)) - Total input files to process : 1 2020-02-17T15:11:14,881 INFO [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (Driver.java:launchTask(2710)) - Starting task [Stage-5:DDL] in serial mode 2020-02-17T15:11:14,902 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (CalcitePlanner.java:genLogicalPlan(385)) - Starting generating logical plan 2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12232)) - Completed phase 1 of Semantic Analysis 2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables 2020-02-17T15:11:14,904 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries 2020-02-17T15:11:14,905 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables 2020-02-17T15:11:14,935 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries 2020-02-17T15:11:14,936 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables 2020-02-17T15:11:14,936 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables 2020-02-17T15:11:14,944 INFO [HiveServer2-Background-Pool: Thread-5750]: ql.Context (Context.java:getMRScratchDir(551)) - New scratch dir is hdfs://lvdcluster/tmp/hive/hive/c0988fe9-1a5e-4a6a-9ddd-7b782d0cdba1/hive_2020-02-17_15-11-14_897_3869104234201454658-21 2020-02-17T15:11:14,945 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12237)) - Completed getting MetaData in Semantic Analysis 2020-02-17T15:11:14,945 INFO [HiveServer2-Background-Pool: Thread-5750]: parse.BaseSemanticAnalyzer (CalcitePlanner.java:canCBOHandleAst(875)) - Not invoking CBO because the statement has lateral views 2020-02-17T15:11:14,946 ERROR [HiveServer2-Background-Pool: Thread-5750]: exec.TaskRunner (TaskRunner.java:runSequential(108)) - Error in executeTask java.lang.NullPointerException: null at org.apache.calcite.plan.RelOptMaterialization.<init>(RelOptMaterialization.java:68) ~[calcite-core-1.16.0.3.1.4.0-315.jar:1.16.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.addMaterializedView(HiveMaterializedViewsRegistry.java:235) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.createMaterializedView(HiveMaterializedViewsRegistry.java:187) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.exec.MaterializedViewTask.execute(MaterializedViewTask.java:59) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2712) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2383) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.4.0-315.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112] 2020-02-17T15:11:14,947 INFO [HiveServer2-Background-Pool: Thread-5750]: reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false 2020-02-17T15:11:14,948 ERROR [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (SessionState.java:printError(1250)) - FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException at org.apache.hadoop.hive.ql.reexec.ReExecutionOverlayPlugin$LocalHook.run(ReExecutionOverlayPlugin.java:45) at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296) at org.apache.hadoop.hive.ql.HookRunner.runFailureHooks(HookRunner.java:283) at org.apache.hadoop.hive.ql.Driver.invokeFailureHooks(Driver.java:2664) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2434) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) )
For the moment, my solution is to use use a simple view. Because I noticed that even if the materilized view doesn't work, a simple view work with this query.