Created 11-14-2018 04:51 PM
Hi,
I am trying to capture failures when writing to a Hive table.
The scenario I am testing: I want to capture the data when Hive is down or my entire Hadoop cluster is down.
I am writing the retry and failure output from PutHiveStreaming to the local file system. I can see files written for retry but none for failure.
It looks like it never fails. I saw a suggestion to retry 3 or 4 times and then treat that as a failure, but in my case, when Hive is down, it should fail at the first instance.
In another scenario I tested folder permissions: I removed the permissions on the folder of the table PutHiveStreaming writes to. Even in this case it retries but never fails.
If I redirect retry back to PutHiveStreaming itself, can I configure it to retry three times and then fail?
Please suggest how to configure PutHiveStreaming to fail.
Regards,
~Sri
Created 11-15-2018 03:18 AM
You can probably use a retry loop in this case.
Loop:
Refer to this link for a retry loop implementation.
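For readers without access to the linked flow, the routing decision behind a retry loop can be sketched roughly as below. This is only an illustration in Python, not NiFi configuration: in NiFi the same idea is usually built from processors such as UpdateAttribute and RouteOnAttribute on the retry relationship. The attribute name `retry.count` and the limit of 3 are assumptions made for this example.

```python
# Hypothetical attribute name; NiFi does not define it, the flow author picks one.
RETRY_ATTR = "retry.count"
MAX_RETRIES = 3  # assumed limit, matching "retry thrice and fail"


def route_after_retry(attributes, max_retries=MAX_RETRIES):
    """Decide where a flowfile goes after it leaves the retry relationship.

    Returns (relationship, updated_attributes). The counter is stored as a
    string because flowfile attributes are strings.
    """
    count = int(attributes.get(RETRY_ATTR, "0")) + 1
    updated = dict(attributes)
    updated[RETRY_ATTR] = str(count)
    if count > max_retries:
        # Retry budget used up: treat the flowfile as failed.
        return "failure", updated
    # Still within budget: send it back to the processor.
    return "retry", updated


if __name__ == "__main__":
    attrs = {"filename": "batch-001.avro"}  # made-up flowfile attributes
    for _ in range(4):
        relationship, attrs = route_after_retry(attrs)
        print(relationship, attrs[RETRY_ATTR])
```

Setting `max_retries=0` makes the first arrival on retry count as a failure, which is closer to the "fail at first instance" behaviour asked for above.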
Created 11-15-2018 02:38 PM
Shu,
That is the workaround I am considering. My concern is that when Hive is down or there is no permission to write, the write will fail even if you retry 100 times, so I want it to fail at the first instance. Until the root cause is fixed, PutHiveStreaming will never succeed. Is this a bug in PutHiveStreaming, or have I not configured it properly?
Since failure is not working in the above scenarios, I am stopping the processor with an API call on the first instance of retry.
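A rough sketch of what such an API call could look like is below. It assumes an unsecured NiFi instance at `http://localhost:8080` (an assumption, adjust to your environment) and uses the REST API's processor resource: read the processor entity to get its current revision, then send a PUT with the state set to STOPPED. Secured clusters would additionally need a token or client certificate, which is not shown.

```python
import json
import urllib.request

NIFI_API = "http://localhost:8080/nifi-api"  # assumed base URL, unsecured NiFi


def build_stop_request(processor_id, revision_version):
    """Build the JSON body that asks NiFi to stop one processor."""
    return {
        "revision": {"version": revision_version},
        "component": {"id": processor_id, "state": "STOPPED"},
    }


def stop_processor(processor_id, base_url=NIFI_API):
    """Fetch the current revision, then PUT the STOPPED state back."""
    url = f"{base_url}/processors/{processor_id}"
    # NiFi rejects updates that carry a stale revision, so read it first.
    with urllib.request.urlopen(url) as resp:
        entity = json.load(resp)
    body = build_stop_request(processor_id, entity["revision"]["version"])
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `stop_processor("<processor id>")` with the id shown in the processor's settings would stop it; the function is only defined here, not invoked, because it needs a running NiFi.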
Thanks & Regards,
~Sri
Created on 11-15-2018 11:53 PM - edited 08-17-2019 05:06 PM
As per the PutHiveStreaming documentation below:
A FlowFile will be transferred to the failure relationship if the record could not be transmitted to Hive.
Created 11-27-2018 04:04 PM
Shu,
To test the above condition I brought Hive down and at the same time tried to ingest data using PutHiveStreaming.
It throws the errors below in nifi-app.log, but the flowfile never goes to failure or retry.
2018-11-27 15:10:42,146 ERROR [Timer-Driven Process Thread-8] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=80198e2c-18b2-3722-b3be-4d97c2b7cf6c] org.apache.nifi.processors.hive.PutHiveStreaming$Lambda$928/1889725558@38ef0670 failed to process due to org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient; rolling back session: org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onHiveRecordsError$1(PutHiveStreaming.java:640)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onHiveRecordError$2(PutHiveStreaming.java:647)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:148)
    at org.apache.nifi.processors.hive.PutHiveStreaming$1.process(PutHiveStreaming.java:838)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2207)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:791)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:657)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:657)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:91)
    at org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:85)
    at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:546)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.getMetaStoreClient(HiveEndPoint.java:448)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:274)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157)
Now I am handling failure and retry for PutHiveStreaming. I want to kill PutHiveStreaming as soon as a flowfile reaches failure/retry, but it is not reaching there.
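Since the flowfile is not reaching either relationship, one possible stopgap is to watch nifi-app.log for the "rolling back session" error instead. The sketch below only detects which PutHiveStreaming processor ids logged that error; the regular expression is modelled on the single log line quoted above and is an assumption about the general log format, not something taken from NiFi documentation.

```python
import re

# Pattern modelled on the ERROR line quoted in this post (assumed format).
ROLLBACK_PATTERN = re.compile(
    r"ERROR \[[^\]]+\] \S+ PutHiveStreaming\[id=(?P<id>[0-9a-f-]+)\]"
    r".*rolling back session"
)


def find_rolled_back_processors(log_lines):
    """Return the ids of PutHiveStreaming processors that rolled back a session.

    Ids are returned in order of first appearance, without duplicates.
    """
    ids = []
    for line in log_lines:
        match = ROLLBACK_PATTERN.search(line)
        if match and match.group("id") not in ids:
            ids.append(match.group("id"))
    return ids


if __name__ == "__main__":
    # The path is an assumption; use the logs directory of your NiFi install.
    with open("logs/nifi-app.log", encoding="utf-8", errors="replace") as log:
        for processor_id in find_rolled_back_processors(log):
            print(processor_id)
```

The ids it reports could then be handed to whatever mechanism is used to stop the processor.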
Regards,
~Sri