hive druid mv incremental rebuild not work


Re: hive druid mv incremental rebuild not work

Mentor

@anvanna  @jitendra03 

That's expected and by design: an MV rebuild takes time proportional to the changes in the underlying tables, even when incremental, because it has to check each source table to determine whether anything changed. You should also look at tuning memory; remember that behind the scenes it's MapReduce doing the work, so you shouldn't expect it to run in seconds given all the different map and reduce steps involved.

 

You should have a good understanding of the MV rebuild mechanism, summarized below, and take it into consideration.

Hive supports incremental view maintenance, i.e., only refresh data that was affected by the changes in the original source tables. Incremental view maintenance will decrease the rebuild step execution time. In addition, it will preserve LLAP cache for existing data in the materialized view.

 

By default, Hive will attempt to rebuild a materialized view incrementally, falling back to full rebuild if it is not possible. The current implementation only supports incremental rebuild when there were INSERT operations over the source tables, while UPDATE and DELETE operations will force a full rebuild of the materialized view.

To execute incremental maintenance, the following conditions should be met:

  • The materialized view should only use transactional tables, either micromanaged or ACID.
  • If the materialized view definition contains a Group By clause, the materialized view should be stored in an ACID table, since it needs to support MERGE operation. For materialized view definitions consisting of Scan-Project-Filter-Join, this restriction does not exist.

A rebuild operation acquires an exclusive write lock over the materialized view, i.e., for a given materialized view, only one rebuild operation can be executed at a given time.
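
To make those conditions concrete, here is a minimal HiveQL sketch (table and view names are illustrative, not from this thread):

```sql
-- Source table must be transactional (ACID) for incremental rebuild
CREATE TABLE sales (id INT, amount DECIMAL(10,2))
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- An MV with a GROUP BY must itself be stored in an ACID table,
-- since the incremental rebuild is applied via MERGE
CREATE MATERIALIZED VIEW sales_agg
TBLPROPERTIES ('transactional'='true')
AS SELECT id, SUM(amount) AS total FROM sales GROUP BY id;

-- After INSERTs into sales, this rebuild can run incrementally;
-- after an UPDATE or DELETE it falls back to a full rebuild
INSERT INTO sales VALUES (1, 10.00);
ALTER MATERIALIZED VIEW sales_agg REBUILD;
```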

 

I would think you should trigger the rebuild during off-peak hours, when you have no users logged on and running queries in your DWH.

@jitendra03   I think the initial problem of the MV rebuild failure has been resolved; what we are discussing now is a performance problem. Since performance is a new issue unrelated to the question asked in the original subject, I kindly ask that you start a new question for performance-specific help, and I would be happy to assist there. Asking multiple questions in the same thread makes it harder to follow for other users of this community.


Hope that gives you a better picture of the behind-the-scenes tasks.

 

Happy hadooping

Re: hive druid mv incremental rebuild not work

Cloudera Employee

Hello,

 

I was working on materialized views handled by Druid, and I got the following error. I am using version 3.1.

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.) (state=08S01,code=1)
Aborting command set because "force" is false and command failed: "CREATE MATERIALIZED VIEW ${db_name}.mv_druid
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
SELECT *
FROM ${db_name}.src_1

 

Any updates on this type of error?

The HS2 log says "error committing transaction to druid metadata storage".

Re: hive druid mv incremental rebuild not work

Mentor

@jitendra03 

 

Can you share your hivemetastore.log? Out of curiosity, can you also share the failure or success of the create MV statement below?

 

CREATE MATERIALIZED VIEW test_druid_mv
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
SELECT *
FROM db_name.table

 

HTH

 

Re: hive druid mv incremental rebuild not work

Cloudera Employee

Hi @Shelton 

 

Here's the error and the logs from HS2. I wanted to mention that I could create an MV when there was no data in the source table; this error occurs when I try to rebuild the MV after loading data into the source table, or when creating an MV from a source table that already has data in it.

CREATE MATERIALIZED VIEW default.druid_mv
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS
SELECT *
FROM default.src_1;


Error:

ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.)
INFO : Completed executing command(queryId=hive_20191210154028_98660f0d-1022-4a0e-bc0c-bab09617d3d0); Time taken: 25.969 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.) (state=08S01,code=1)


Logs:

2019-12-10T15:40:54,601 INFO [HiveServer2-Background-Pool: Thread-14305]: reexec.ReOptimizePlugin (:()) - ReOptimization: retryPossible: false
2019-12-10T15:40:54,602 ERROR [HiveServer2-Background-Pool: Thread-14305]: ql.Driver (:()) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.)
2019-12-10T15:40:54,602 INFO [HiveServer2-Background-Pool: Thread-14305]: ql.Driver (:()) - Completed executing command(queryId=hive_20191210154028_98660f0d-1022-4a0e-bc0c-bab09617d3d0); Time taken: 25.969 seconds
2019-12-10T15:40:54,602 INFO [HiveServer2-Background-Pool: Thread-14305]: lockmgr.DbTxnManager (:()) - Stopped heartbeat for query: hive_20191210154028_98660f0d-1022-4a0e-bc0c-bab09617d3d0
2019-12-10T15:40:54,632 ERROR [HiveServer2-Background-Pool: Thread-14305]: operation.Operation (:()) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.)
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:348) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:228) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_112]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1103) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1108) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.DDLTask.createView(DDLTask.java:5065) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:433) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2712) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2383) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
... 13 more
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Transaction failed do to exception being thrown from within the callback. See cause for the original exception.
at org.apache.hadoop.hive.druid.DruidStorageHandler.commitInsertTable(DruidStorageHandler.java:930) ~[hive-druid-handler-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.druid.DruidStorageHandler.commitCreateTable(DruidStorageHandler.java:274) ~[hive-druid-handler-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:917) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:900) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at com.sun.proxy.$Proxy59.createTable(Unknown Source) ~[?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:3001) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at com.sun.proxy.$Proxy59.createTable(Unknown Source) ~[?:?]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1092) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1108) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.DDLTask.createView(DDLTask.java:5065) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:433) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2712) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2383) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2055) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1747) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
... 13 more

 

 

Thank you

Re: hive druid mv incremental rebuild not work

Explorer

Do you connect via 

HIVESERVER2 JDBC URL or HIVESERVER2 INTERACTIVE JDBC URL?
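
For context, the two connection strings typically look like this in Beeline (host and port values below are placeholders; the Interactive/LLAP endpoint usually listens on a different port, often 10500 in HDP, or is resolved via a ZooKeeper namespace):

```shell
# Regular HiveServer2
beeline -u "jdbc:hive2://hs2-host:10000/default"

# HiveServer2 Interactive (LLAP)
beeline -u "jdbc:hive2://hs2-host:10500/default"
```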

Re: hive druid mv incremental rebuild not work

Cloudera Employee

Hi @Shelton

I connected via 

Metadata storage connector URL from Druid, which I got from Ambari.

 

Thank you

Re: hive druid mv incremental rebuild not work

Mentor

@jitendra03 

Hive Interactive (LLAP) needs to be installed in order to interact with Druid, and this is memory intensive.

There are a couple of configs to check, such as LLAP sizing and setup; there is a very detailed document from Cloudera by @sergey with information about the Hive LLAP installation.

In particular, I recommend you read the Druid and Hive integration guide thoroughly; it documents a walkthrough and benchmarking, so a deep dive would give you a very good starting point.
You will need RAM.
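
As a hedged pointer only (verify the exact values against your own Ambari setup), the Hive-side Druid settings that usually matter when HS2 reports "error committing transaction to druid metadata storage" are the metadata-storage ones; the hostnames below are placeholders:

```
# Hive -> Druid metadata storage (must match Druid's own metadata DB)
hive.druid.metadata.db.type=mysql
hive.druid.metadata.uri=jdbc:mysql://druid-metadata-host:3306/druid
hive.druid.metadata.username=druid
hive.druid.metadata.password=<password>

# Hive -> Druid cluster endpoints
hive.druid.broker.address.default=druid-broker-host:8082
hive.druid.coordinator.address.default=druid-coordinator-host:8081
```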

 

 

Re: hive druid mv incremental rebuild not work

Cloudera Employee

@Shelton 

 

I did try on LLAP and I got the following error. The thing is, for anything that's backed/handled by Druid, I get the same error.

 

Error: java.io.IOException: org.apache.hive.druid.io.druid.java.util.common.RE: Failure getting results for query[TimeseriesQuery{dataSource='default.druid_test', querySegmentSpec=LegacySegmentSpec{intervals=[1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z]}, descending=false, virtualColumns=[], dimFilter=null, granularity='AllGranularity', aggregatorSpecs=[CountAggregatorFactory{name='$f0'}], postAggregatorSpecs=[], context={queryId=hive_20191211113348_a1f4d6da-4559-46bc-8ba6-d4a6dc250cea, skipEmptyBuckets=false}}] from locations[[Ljava.lang.String;@4d855cea] because of [java.util.concurrent.ExecutionException: org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected] (state=,code=0)

Re: hive druid mv incremental rebuild not work

Explorer

@jitendra03 

Could you share the hiveserver2Interactive.log? Is your Druid historical server installed?

Re: hive druid mv incremental rebuild not work

Mentor

@jitendra03 

Hi,

It's always a good idea to start a new thread (and maybe tag me), because the responses in this thread are possible solutions to the question asked by @anvanna, which could be different from the issue you are facing.

Please start a new thread, tag me, and remember to attach any logs, screenshots, and the versions involved.

Happy hadooping

 
