Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Is it possible to create materialized view from a lateral view (UTDF) in hive ?

avatar
Contributor

Hi,

 

I want to create a materialized view with startdate or enddate split if there are on multiple day. The solution for do this is use lateral view, the select query is ok but when i try select with materialized view it's doesnt work. Has anyone tried this out or knows why it doesn't work ?

 

here my code :

create materialized view last_view
as
select
t.machine,
case when pe.i = 0
then t.startdate
else timestamp(date_add (t.startdate,pe.i))
end as startdate,
case when pe.i = datediff(t.enddate,t.startdate)
then t.enddate
else timestamp(date_add (t.startdate,pe.i+1))
end as enddate

from dwh_dev.factmachine2 t
lateral view posexplode(split(space(datediff(t.enddate,t.startdate)),' ')) pe as i,x ;

 

and the result:

java.sql.SQLException: Error while processing statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException

 

thanks in advances

6 REPLIES 6

avatar
Super Guru

@Ellyly  can you please post the complete stack trace from the HiveServer2 logs?

avatar
Contributor

Hi, here is the complete error :

INFO : Moving data to directory hdfs://lvdcluster/warehouse/tablespace/managed/hive/dwh_dev.db/last_view from hdfs://lvdcluster/warehouse/tablespace/managed/hive/dwh_dev.db/.hive-staging_hive_2020-02-13_09-58-05_548_760277875407322997-3/-ext-10001
INFO : Starting task [Stage-4:DDL] in serial mode
INFO : Starting task [Stage-3:STATS] in serial mode
INFO : Starting task [Stage-5:DDL] in serial mode
ERROR : FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.reexec.ReExecutionOverlayPlugin$LocalHook.run(ReExecutionOverlayPlugin.java:45)
at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296)
at org.apache.hadoop.hive.ql.HookRunner.runFailureHooks(HookRunner.java:283)
at org.apache.hadoop.hive.ql.Driver.invokeFailureHooks(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2434)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

avatar
Super Guru
@Ellyly ,

Please also share the exact version of either HDP or CDH in use, this can help to determine the code that we need to check and see why we hit NPE.

Cheers
Eric

avatar
Contributor

Hi,

 

@EricL@rajkumar_singh 

Here the versions that I use for HDP and Hive:

HDP version : HDP-3.1.4.0 (complete version 3.1.4.0-315)

Hive version : Hive 3.1.0

 

 

avatar
Super Guru

@Ellyly  the NPE you got is side effect of Exception you have in stage-3/4/5, can you please upload the Error stack you got before the NPE, I am assuming that you are running hive on tez?

I think ReExecOverlayPlugin failing with NPE and shallowing the exception at client-side (fix for this has been identified as Hive-19204 but you should be able to see the real cause in HS2 logs 

avatar
Contributor

Hi,

 

@rajkumar_singh 

thanks to your advice I found my file hiveserver2 Interactive.log and the error is in task 4-5. I noticed 2 things :

- First, unfortunately before the first NPE error in the file hiveserver2 Interactive.log it is another NPE that we find

- Secondly, in task 3 we have 

 - Cannot get a table snapshot for factmachine_mv

but I don't thinks that is the error's source.

 

HiveServeur2Interactive.log:

 

2020-02-17T15:11:13,525 INFO  [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/	numRows	1231370
2020-02-17T15:11:13,525 INFO  [HiveServer2-Background-Pool: Thread-5750]: FileOperations (FSStatsAggregator.java:aggregateStats(101)) - Read stats for : dwh_dev.factmachine_mv/	rawDataSize	552885130
2020-02-17T15:11:13,525 WARN  [HiveServer2-Background-Pool: Thread-5750]: metadata.Hive (Hive.java:alterTable(778)) - Cannot get a table snapshot for factmachine_mv
2020-02-17T15:11:13,609 INFO  [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (SessionState.java:printInfo(1227)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130]
2020-02-17T15:11:13,609 INFO  [HiveServer2-Background-Pool: Thread-5750]: stats.BasicStatsTask (BasicStatsTask.java:aggregateStats(271)) - Table dwh_dev.factmachine_mv stats: [numFiles=2, numRows=1231370, totalSize=1483151, rawDataSize=552885130]
2020-02-17T15:11:13,620 INFO  [HiveServer2-Background-Pool: Thread-5750]: mapred.FileInputFormat (FileInputFormat.java:listStatus(259)) - Total input files to process : 1
2020-02-17T15:11:14,881 INFO  [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (Driver.java:launchTask(2710)) - Starting task [Stage-5:DDL] in serial mode
2020-02-17T15:11:14,902 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (CalcitePlanner.java:genLogicalPlan(385)) - Starting generating logical plan
2020-02-17T15:11:14,904 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12232)) - Completed phase 1 of Semantic Analysis
2020-02-17T15:11:14,904 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables
2020-02-17T15:11:14,904 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries
2020-02-17T15:11:14,905 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2113)) - Get metadata for source tables
2020-02-17T15:11:14,935 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2244)) - Get metadata for subqueries
2020-02-17T15:11:14,936 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables
2020-02-17T15:11:14,936 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(2268)) - Get metadata for destination tables
2020-02-17T15:11:14,944 INFO  [HiveServer2-Background-Pool: Thread-5750]: ql.Context (Context.java:getMRScratchDir(551)) - New scratch dir is hdfs://lvdcluster/tmp/hive/hive/c0988fe9-1a5e-4a6a-9ddd-7b782d0cdba1/hive_2020-02-17_15-11-14_897_3869104234201454658-21
2020-02-17T15:11:14,945 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(12237)) - Completed getting MetaData in Semantic Analysis
2020-02-17T15:11:14,945 INFO  [HiveServer2-Background-Pool: Thread-5750]: parse.BaseSemanticAnalyzer (CalcitePlanner.java:canCBOHandleAst(875)) - Not invoking CBO because the statement has lateral views
2020-02-17T15:11:14,946 ERROR [HiveServer2-Background-Pool: Thread-5750]: exec.TaskRunner (TaskRunner.java:runSequential(108)) - Error in executeTask
java.lang.NullPointerException: null
	at org.apache.calcite.plan.RelOptMaterialization.<init>(RelOptMaterialization.java:68) ~[calcite-core-1.16.0.3.1.4.0-315.jar:1.16.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.addMaterializedView(HiveMaterializedViewsRegistry.java:235) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.createMaterializedView(HiveMaterializedViewsRegistry.java:187) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.exec.MaterializedViewTask.execute(MaterializedViewTask.java:59) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2712) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2383) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112]
	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.4.0-315.jar:?]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
2020-02-17T15:11:14,947 INFO  [HiveServer2-Background-Pool: Thread-5750]: reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false
2020-02-17T15:11:14,948 ERROR [HiveServer2-Background-Pool: Thread-5750]: ql.Driver (SessionState.java:printError(1250)) - FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.HiveException(Error while invoking FailureHook. hooks: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.reexec.ReExecutionOverlayPlugin$LocalHook.run(ReExecutionOverlayPlugin.java:45)
	at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:296)
	at org.apache.hadoop.hive.ql.HookRunner.runFailureHooks(HookRunner.java:283)
	at org.apache.hadoop.hive.ql.Driver.invokeFailureHooks(Driver.java:2664)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2434)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
)

 

For the moment, my solution is to use use a simple view. Because I noticed that even if the materilized view doesn't work, a simple view work with this query.