Member since: 02-20-2017
Posts: 12
Kudos Received: 0
Solutions: 0
01-23-2019
10:04 PM
@Shu, I am trying to upload the above template, but I am getting the error below:

Error: "Found bundle org.apache.nifi:nifi-update-attribute-nar:1.6.0 but does not support org.apache.nifi.processors.attributes.UpdateAttribute"

Could you please confirm whether we need the nifi-update-attribute-nar file?

In my requirement, I am joining 5 tables to retrieve incremental data based on record_create_date; data is populated into these tables every second. I need to retrieve the data incrementally, and the flow should remember the last record_create_date it successfully pulled. In the above example, if I query with e.joindate > '${stored.state}' and e.joindate > '${current.state}' (which holds the current time), it will never fetch new records, right?

Also, the DistributedMapCache client asks for a Server Hostname and Port; what should the server be here? And where do I set the last fetched date (joindate) into ${stored.state}?

Could you please clarify these doubts?

Thanks,
~Sri
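P.S. For reference, here is a rough sketch of the state-keeping pattern as I understand it (the processor names and the port are my assumptions from NiFi defaults, not from your template, so please correct me if this is wrong):

    # DistributedMapCacheServer (controller service) runs inside NiFi itself:
    #   Server Hostname = hostname of the NiFi node hosting the cache server
    #   Server Port     = 4557 (NiFi default)
    # The DistributedMapCacheClientService then points at that same host/port.

    # FetchDistributedMapCache reads the last stored date into ${stored.state},
    # so the incremental query becomes something like:
    SELECT ... FROM employee e JOIN ... WHERE e.joindate > '${stored.state}'

    # After a successful run, PutDistributedMapCache writes the new MAX(joindate)
    # back under the same cache key, so the next run picks up where this one ended.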
11-27-2018
04:04 PM
Shu, to test the above condition I brought Hive down and, at the same time, tried to ingest data using PutHiveStreaming. It throws the errors below in nifi-app.log, but the FlowFile never goes to failure or retry:

2018-11-27 15:10:42,146 ERROR [Timer-Driven Process Thread-8] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=80198e2c-18b2-3722-b3be-4d97c2b7cf6c] org.apache.nifi.processors.hive.PutHiveStreaming$Lambda$928/1889725558@38ef0670 failed to process due to org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient; rolling back session: org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.nifi.processor.exception.ProcessException: Error writing [org.apache.nifi.processors.hive.PutHiveStreaming$HiveStreamingRecord@2939c3df] to Hive Streaming transaction due to java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onHiveRecordsError$1(PutHiveStreaming.java:640)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onHiveRecordError$2(PutHiveStreaming.java:647)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:148)
    at org.apache.nifi.processors.hive.PutHiveStreaming$1.process(PutHiveStreaming.java:838)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2207)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:791)
    at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:657)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:657)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:91)
    at org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:85)
    at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:546)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.getMetaStoreClient(HiveEndPoint.java:448)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:274)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157)

Right now I am handling failure and retry for PutHiveStreaming. I want to stop PutHiveStreaming as soon as a FlowFile reaches failure/retry, but it never gets there.

Regards,
~Sri
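P.S. One thing I am double-checking (this is my assumption from the processor documentation, not something I have confirmed): the Rollback On Failure property changes where failed FlowFiles go. Roughly, with the Metastore URI as a placeholder:

    # PutHiveStreaming properties
    Hive Metastore URI    thrift://metastore-host:9083
    Database Name         default
    Table Name            my_table
    # Rollback On Failure = true  -> the session is rolled back and failed
    #                                FlowFiles stay in the incoming queue,
    #                                so nothing ever appears on failure/retry
    # Rollback On Failure = false -> failed FlowFiles route to failure/retry
    Rollback On Failure   false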
11-15-2018
02:38 PM
Shu, that is the workaround I am considering. My concern is that when Hive is down, or there is no permission to write, even 100 retries are going to fail; I want it to fail on the first attempt, because unless the root cause is fixed, PutHiveStreaming will never succeed. So my question is: is this a bug in PutHiveStreaming, or have I not configured it properly? Since failure is not triggered in the above scenarios, I am stopping the processor with an API call on the first instance of a retry.

Thanks & Regards,
~Sri
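P.S. In case it is useful to anyone else, the REST call I use to stop the processor looks roughly like this (host, port, processor id, and revision version are placeholders for my environment):

    # Read the processor's current revision first:
    #   curl http://nifi-host:8080/nifi-api/processors/<processor-id>
    # Then stop the processor, passing that revision version back:
    curl -X PUT \
      -H 'Content-Type: application/json' \
      -d '{"revision":{"version":1},"state":"STOPPED"}' \
      http://nifi-host:8080/nifi-api/processors/<processor-id>/run-status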
11-14-2018
04:51 PM
Hi, I am trying to capture failures when writing to a Hive table. The scenario I am testing: I want to capture the data when Hive is down, or when my entire Hadoop cluster is down. I am writing both the retry and failure relationships from PutHiveStreaming to the local file system, and I can see files written for retry but never for failure; it looks like it never fails. I saw a suggestion to retry 3-4 times and then treat that as a failure, but in my case, when Hive is down, it should fail on the first attempt. In another scenario I tested folder permissions: I removed the permissions on the folder of the table PutHiveStreaming writes to, and even then it retries but never fails. If I route retry back to PutHiveStreaming itself, can I configure it to retry three times and then fail? Please suggest how to configure PutHiveStreaming to fail.

Regards,
~Sri
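P.S. The retry-then-fail pattern I had in mind looks roughly like this, using standard processors and NiFi Expression Language (the attribute names are my own invention):

    # 1. Route PutHiveStreaming's retry relationship into an UpdateAttribute
    #    that increments a counter:
    retry.count = ${retry.count:replaceNull(0):plus(1)}

    # 2. A RouteOnAttribute then decides whether to give up:
    exceeded = ${retry.count:ge(3)}
    # route 'exceeded' to the failure-handling branch,
    # route 'unmatched' back into PutHiveStreaming for another attempt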
Labels:
- Apache Hive
- Apache NiFi
08-17-2018
08:26 PM
Abhinav, I am seeing the same error. How did you fix it?
04-10-2017
06:24 PM
Is there any solution for this issue? I recently upgraded Ambari from 2.2.0.0 to 2.4.2.0 and HDP from 2.3.4.0-3485 to 2.5.0.0. My Ambari server and agents are updated to the latest version, and the versions match, yet neither the Hosts nor the Versions information is displayed.
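One way to narrow down whether the data is missing or just not rendering would be to query the Ambari REST API directly (the hostname and credentials below are placeholders):

    # If this returns your hosts but the UI shows none, the problem is likely
    # in the web UI rather than in Ambari's data:
    curl -u admin:admin http://ambari-host:8080/api/v1/hosts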
04-10-2017
06:22 PM
Is there any solution for this issue? I recently upgraded Ambari from 2.2.0.0 to 2.4.2.0 and HDP from 2.3.4.0-3485 to 2.5.0.0. Neither the Hosts nor the Versions information is displayed.
02-20-2017
03:42 PM
Thank you, Michael, for your reply. I am not using HDFS to store my Solr indexes. While indexing to Solr, I am also archiving the same files to HDFS in parallel; that is what I meant when I said writing to HDFS has no issues. I have also tried committing frequently, as well as only at the end of the process, but either way I do not see much data indexed into Solr.
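For context, the commit settings I experimented with were along these lines in solrconfig.xml (the values here are illustrative, not my exact settings):

    <!-- Hard commit: flush to disk regularly without opening a new searcher -->
    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- Soft commit: make new documents visible to searches more often -->
    <autoSoftCommit>
      <maxTime>5000</maxTime>
    </autoSoftCommit>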
02-20-2017
02:43 PM
And I do not have any issue when I ingest a single large file.
02-20-2017
02:17 PM
Hi, I have a large set of small files, each around 7-10 KB in size; in total there are 350K files, about 6 GB. I have changed my Flume configuration with many options, but whatever the config change, Solr takes about 2 seconds to ingest each file.

agent.sources = SpoolDirSrc
agent.channels = FileChannel
agent.sinks = SolrSink

# Configure Source
agent.sources.SpoolDirSrc.channels = fileChannel
agent.sources.SpoolDirSrc.type = spooldir
agent.sources.SpoolDirSrc.spoolDir = /app/home/solr/final
agent.sources.SpoolDirSrc.basenameHeader = true
#agent.sources.SpoolDirSrc.batchSize = 100000
agent.sources.SpoolDirSrc.fileHeader = true
agent.sources.SpoolDirSrc.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder

# Use a channel that buffers events on disk
agent.channels.FileChannel.type = file
agent.channels.FileChannel.capacity = 1000
agent.channels.FileChannel.transactionCapacity = 1000
#agent.channels.FileChannel.transactionCapacity = 10000

# Configure Solr Sink
agent.sinks.SolrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.SolrSink.morphlineFile = /etc/flume/conf/morphline.conf
#agent.sinks.SolrSink.batchsize = 100000
#agent.sinks.SolrSink.batchDurationMillis = 5000
agent.sinks.SolrSink.channel = fileChannel
agent.sinks.SolrSink.morphlineId = morphline1
agent.sinks.SolrSink.tika.config = tikaConfig.xml
agent.sinks.SolrSink.rollCount = 0
agent.sinks.SolrSink.rollInterval = 0
agent.sinks.SolrSink.rollsize = 100000000
agent.sinks.SolrSink.idleTimeout = 0
agent.sinks.SolrSink.batchSize = 100000
agent.sinks.SolrSink.txnEventMax = 10000000
agent.sources.SpoolDirSrc.channels = FileChannel
agent.sinks.SolrSink.channel = FileChannel

My collection has 2 shards and 1 replica. I have no issues ingesting the same data into HDFS, only into Solr. Kindly let me know how I can make this better.

Regards,
~Sri
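P.S. The direction I am currently experimenting with is larger channel transactions and time-bounded sink batches, so the sink commits to Solr in batches instead of roughly once per file. A sketch of the changes (the values are illustrative, not tested recommendations):

    # Let the file channel hold more events and hand over larger transactions
    agent.channels.FileChannel.capacity = 100000
    agent.channels.FileChannel.transactionCapacity = 10000

    # MorphlineSolrSink: take up to batchSize events per transaction, or
    # whatever has arrived within batchDurationMillis, whichever comes first
    agent.sinks.SolrSink.batchSize = 1000
    agent.sinks.SolrSink.batchDurationMillis = 1000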
Labels:
- Apache Flume
- Apache Solr