Member since: 04-05-2016
Posts: 130
Kudos Received: 92
Solutions: 29
My Accepted Solutions
Views | Posted
---|---
2177 | 06-05-2018 01:00 AM
3094 | 04-10-2018 08:23 AM
3959 | 07-18-2017 02:16 AM
1474 | 07-11-2017 01:02 PM
1907 | 07-10-2017 02:10 AM
01-22-2019
12:13 AM
1 Kudo
Hi @Yasir Khokhar The EnforceOrder processor is not designed to order FlowFiles using epoch timestamps. It is designed to enforce order using continuous numbers, such as a sequence. Numbers with gaps, such as epoch timestamps, cannot be used with EnforceOrder. I think you can achieve what you expect with PriorityAttributePrioritizer instead. Please check the NiFi docs on prioritizers: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization
07-05-2018
11:37 PM
Hi @jgw.gooijer Is there any relationship between 'the flow of a month' and the file names of the FlowFiles that are listed by FTP and waiting at the Wait processor? If you can derive the year and month from a listed file name, using Attribute Expression Language, then you can configure the Wait/Notify processors to use a 'year-month' value as the signal identifier. This way, you don't have to close identifiers; keep them open after 'the flow of a month' has finished, so that any number of FlowFiles having the same 'year-month' can go through the Wait processor.
06-20-2018
12:01 PM
I don't have much experience with Apache Drill, but looking at the source code, it looks like it fails at instantiating the class specified by the 'hadoop.security.group.mapping' parameter. The parameter is usually defined in a Hadoop configuration file, 'core-default.xml'. Are you aware of any configuration file in which the 'hadoop.security.group.mapping' parameter could be null?
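If it helps, you can search the Hadoop client configuration on the machine running Drill for that parameter; a sketch, where the config directory is an assumption about your environment:

# Find where hadoop.security.group.mapping is (or isn't) set
grep -r 'hadoop.security.group.mapping' /etc/hadoop/conf/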
06-20-2018
07:56 AM
I don't have a Drill environment to test with, but I downloaded the Drill JDBC driver and tested whether I can load its Driver class with NiFi's DBCPConnectionPool. Since the Drill driver needs other dependency jars, I had to use the jar which contains all dependencies. Please try drill-jdbc-all-1.13.0.jar instead: apache-drill-1.13.0/jars/jdbc-driver/drill-jdbc-all-1.13.0.jar
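If you're not sure where that jar is in your installation, something like this will locate it (the install directory below is an assumption):

# Locate the all-in-one JDBC driver inside the Drill distribution
find /path/to/apache-drill-1.13.0 -name 'drill-jdbc-all-*.jar'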
06-20-2018
06:53 AM
@Tommy The error message has changed slightly; it seems the jar file is being used now, but it failed for a different reason. Would you please share the details of the error? I think nifi-app.log has a stack trace for the exception, which may be different from the one you posted before.
06-20-2018
05:29 AM
@Tommy Please make sure that the jar file path is correct. If you are using a multi-node NiFi cluster, every node has to have the jar file at the specified path.
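A quick, hypothetical way to confirm that from a shell — the hostnames and the jar path below are placeholders, not taken from your environment:

# Check the driver jar exists at the same path on every node
for h in nifi-node1 nifi-node2 nifi-node3; do
  ssh "$h" 'ls -l /path/to/drill-jdbc-all-1.13.0.jar'
done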
06-15-2018
01:27 AM
Hi @Tommy In the screenshot, I see you set the /var/lib/dlfs/lib directory as 'Database Driver Location(s)'. How about setting the path to the jar file itself instead?
06-13-2018
02:51 AM
Hi @Sam Rubin I took a look at the shell output and found -Xms and -Xmx are configured with quite small values:

++ /bin/java -cp '/apps/nifi/nifi-1.5.0/conf:/apps/nifi/nifi-1.5.0/lib/bootstrap/*' -Xms12m -Xmx24m -Dorg.apache.nifi.bootstrap.config.log.dir=/apps/nifi/nifi-1.5.0/logs -Dorg.apache.nifi.bootstrap.config.pid.dir=/apps/nifi/nifi-1.5.0/run -Dorg.apache.nifi.bootstrap.config.file=/apps/nifi/nifi-1.5.0/conf/bootstrap.conf org.apache.nifi.bootstrap.RunNiFi start

It may not be the direct cause of the start/shutdown loop, but NiFi wouldn't run with a 24m heap size. Is this configuration intentional? Please check bootstrap.conf and increase -Xms and -Xmx to at least 512m.
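For reference, the heap flags come from the java.arg entries in bootstrap.conf; the change would look like this (the arg numbering follows NiFi's default bootstrap.conf):

# in /apps/nifi/nifi-1.5.0/conf/bootstrap.conf
java.arg.2=-Xms512m
java.arg.3=-Xmx512m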
06-09-2018
07:18 AM
Thanks @Sam Rubin for the log. I couldn't find anything unusual in the logs. Looking at the NiFi source code, the "Failed to send shutdown command to port ... Will kill ..." message is only logged by calling the RunNiFi.stop() method, and that method is only called from RunNiFi.main(), which is called by nifi.sh. Would you run nifi.sh with the debug option? Such as:

sh -bx ./bin/nifi.sh start

Please share the standard output of the above command. Then we can look at how nifi.sh works in detail. Just in case, did you modify nifi.sh?
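If it's easier to share, you can capture that trace output to a file (a sketch, using the same flags as above):

# Capture the shell trace while starting NiFi
sh -bx ./bin/nifi.sh start > nifi-start-trace.log 2>&1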
06-08-2018
12:29 AM
Hi @Sam Rubin, I see that WARN message regularly when NiFi restarts while the target remote NiFi process group is not running, but I haven't encountered a start/shutdown loop because of it. The [RemoteProcessGroup47a43fe6-2c9b-39aa-4fae-5851f048d58bThread-1] that logged the WARN message is a background thread that periodically (every 60 secs) checks the target RPG status, and it does not affect the main thread. Is it possible for you to share the entire nifi-app.log and nifi-bootstrap.log? I suspect there are other issues preventing NiFi from starting up correctly.
06-05-2018
01:00 AM
1 Kudo
Hi @Óscar Andreu, thank you for reporting the issue. I was able to reproduce it. Looking at the code, there is a bug in how JoltTransformJSON validates the spec when it contains EL. The processor tries to check whether the Jolt spec contains any custom transformation class in order to fail fast, but it fails to get a Jolt spec because EL cannot return a valid spec at the validation phase. I submitted an Apache JIRA for this and will try to fix it: https://issues.apache.org/jira/browse/NIFI-5268 Thanks!
04-20-2018
01:35 AM
Hello @Praveen Patel I wonder if there is an 'auto-terminated' relationship somewhere. NiFi does not drop FlowFiles unless it is told to do so. Would you be able to share your flow as a flow template XML file so that I can replicate and investigate the issue?
04-10-2018
08:23 AM
Hi @Benjamin Bouret Thank you very much for reporting the issue. It was a bug in the HTTP S2S transport protocol: it cannot send more than 2GB of data at once. I filed an Apache NiFi JIRA and a patch for it: https://issues.apache.org/jira/browse/NIFI-5065 As a workaround, please use the RAW S2S transport protocol instead; it can send large files without issue.
04-10-2018
07:25 AM
UPDATES: Excuse me, the previous diagnosis was wrong. I was trying to reproduce the issue by tweaking timeout settings; however, it turned out the issue is not caused by the timeout setting. Instead, there is some issue with how the HTTP S2S transport transfers data. I got the following exception when I tried to send an 8GB file with HTTP S2S:

2018-04-10 16:05:45,006 ERROR [I/O dispatcher 25] o.a.n.r.util.SiteToSiteRestApiClient Failed to send data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files due to java.lang.RuntimeException: Sending data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files has reached to its end, but produced : read : wrote byte sizes (659704502 : 659704502 : 9249639094) were not equal. Something went wrong.
java.lang.RuntimeException: Sending data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files has reached to its end, but produced : read : wrote byte sizes (659704502 : 659704502 : 9249639094) were not equal. Something went wrong.
at org.apache.nifi.remote.util.SiteToSiteRestApiClient$4.produceContent(SiteToSiteRestApiClient.java:848)
at org.apache.http.impl.nio.client.MainClientExec.produceContent(MainClientExec.java:262)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.produceContent(DefaultClientExchangeHandlerImpl.java:140)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.outputReady(HttpAsyncRequestExecutor.java:241)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.produceOutput(DefaultNHttpClientConnection.java:290)
at org.apache.http.impl.nio.client.InternalIODispatch.onOutputReady(InternalIODispatch.java:86)
at org.apache.http.impl.nio.client.InternalIODispatch.onOutputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:145)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:188)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:341)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:745)
2018-04-10 16:06:25,009 ERROR [Timer-Driven Process Thread-3] o.a.nifi.remote.StandardRemoteGroupPort RemoteGroupPort[name=input,targets=http://localhost:8080/nifi] failed to communicate with remote NiFi instance due to java.io.IOException: Failed to confirm transaction with Peer[url=http://HW13076.local:8080/nifi-api] due to java.io.IOException: Awaiting transferDataLatch has been timeout.
2018-04-10 16:06:25,009 ERROR [Timer-Driven Process Thread-3] o.a.nifi.remote.StandardRemoteGroupPort
java.io.IOException: Failed to confirm transaction with Peer[url=http://HW13076.local:8080/nifi-api] due to java.io.IOException: Awaiting transferDataLatch has been timeout.
at org.apache.nifi.remote.AbstractTransaction.confirm(AbstractTransaction.java:264)
at org.apache.nifi.remote.StandardRemoteGroupPort.transferFlowFiles(StandardRemoteGroupPort.java:369)
at org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:285)
at org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Awaiting transferDataLatch has been timeout.
at org.apache.nifi.remote.util.SiteToSiteRestApiClient.finishTransferFlowFiles(SiteToSiteRestApiClient.java:938)
at org.apache.nifi.remote.protocol.http.HttpClientTransaction.readTransactionResponse(HttpClientTransaction.java:93)
at org.apache.nifi.remote.AbstractTransaction.confirm(AbstractTransaction.java:239)
... 12 common frames omitted

I will continue investigating the cause. I tested sending the same file with RAW S2S and it worked just fine. Please use the RAW transport protocol if possible.
04-10-2018
03:10 AM
Hi @Benjamin Bouret Thanks for reporting this. The NiFi Site-to-Site client implements different kinds of timeout and expiration settings, such as cache expiration, idle connection expiration, penalization period, batch duration, and timeout. The error you shared can occur if a S2S client waits longer than the 'idle connection expiration'. The problem is that 'idle connection expiration' is not configurable by a NiFi user at the moment. So, if transferring data takes more than the default 30 seconds, it will fail with the reported message, even if a longer 'Communication Timeout' is set in the Remote Process Group configuration. From the error message you shared, I assume you are using the HTTP transport protocol for S2S. I wondered if using RAW could be a workaround, but looking at the NiFi code, it may not be, because RAW also uses the 'idle connection expiration' to shut down existing sockets. The Split/Merge pattern will not work because, as you found, S2S clients distribute FlowFiles among the nodes in the target cluster. I think a possible workaround is using other ListenXXXX processors (e.g. ListenHTTP or ListenTCP) at the target NiFi cluster, then sending data using the corresponding processors such as PostHttp or PutTCP, etc. This way, you can control how to distribute the segmented FlowFiles to target nodes. You need to manually pick a target hostname for load balancing; it can be done with NiFi Expression Language and a certain set of processors. Please refer to this template: https://gist.github.com/ijokarumawak/077d7fdca57b9c8ff386f28c5198efd1 I will raise an Apache NiFi JIRA so that 'idle connection expiration' can be set based on the 'Communication Timeout' value. In the meantime, I hope the above workaround works for you.
01-17-2018
06:21 AM
@Sugi Narayana I encountered the same issue with HDP 2.6.4 (Kerberized Kafka with SimpleAclAuthorizer) and addressed it by referring to this thread. I used the following commands to give a user the required privileges for producing and consuming:

# Added to publish
./bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 --topic topic-name --producer --add --allow-principal User:UserName
# Added to consume
./bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 --topic topic-name --consumer --group group-name --add --allow-principal User:UserName

Hope this helps.
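By the way, to double-check that the ACLs were applied, you can list them for the topic (same placeholder host and topic names as above):

# List current ACLs on the topic
./bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 --topic topic-name --list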
01-17-2018
01:09 AM
Hi @Luis Size Would you share how you built NiFi? I used the following steps, and ReportLineageToAtlas works without a NoClassDefFoundError:

# Download NiFi source code
wget http://ftp.riken.jp/net/apache/nifi/1.5.0/nifi-1.5.0-source-release.zip
# Unzip it and build with Atlas enabled
mvn clean install -T 2.0C -Pinclude-atlas -DskipTests

BTW, a directory should be set for 'Atlas Configuration Directory' instead of atlas-application.properties, although I doubt this is the cause of the NoClassDefFoundError.
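Also, if you want to confirm that the Atlas bundle actually made it into the build, something like this should find it (a sketch; the exact NAR file name is an assumption):

# Look for the Atlas NAR in the built assembly
find nifi-assembly/target -name '*atlas*.nar'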
07-18-2017
02:16 AM
2 Kudos
Hi @Gabriel Queiroz, If you'd like to use the ID FlowFile attribute in the DetectDuplicate processor's 'Cache Entry Identifier', you need to use NiFi Attribute Expression Language syntax. Currently you have configured it as '$ID', but you need '${ID}' (wrap it in curly brackets).
07-11-2017
01:02 PM
Hi @Pavan Challa If I understand your use case correctly, I think I have come up with a Groovy script to do the job. It loops through the dataFlow elements, tests whether filePattern matches, then resolves the path with Expression Language. Please check this Gist to see if it works for you: https://gist.github.com/ijokarumawak/a4ef40b49b45cecf3c43b56493683725 Note that I had to change filePattern to be a Regular Expression:

<!-- Before -->
<filePattern>salary_*.gz</filePattern>
<!-- After: added a dot before the star -->
<filePattern>salary_.*.gz</filePattern>

Hope this helps.
07-11-2017
12:25 PM
Great, thank you very much!
07-11-2017
12:09 PM
Excuse me @Pavan Challa, I should have looked at the related question more carefully. So, what you'd like to do is loop through the 'dataFlow' elements to find the one whose 'filePattern' matches the name of an incoming file? If so, that might be too much to do with XMLFileLookupService. I'd write a script with ExecuteScript that parses the XML file and does the matching.
07-11-2017
05:29 AM
1 Kudo
Hello @Eric Lloyd It seems you're using NiFi 1.3.0; if so, GrokReader might be helpful for extracting the whole stack trace. The GrokReader documentation actually has an example that reads Java stack traces, please check: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.3.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
07-11-2017
05:23 AM
Hello @Pavan Challa The SplitXml processor will probably be helpful. Specify depth '2' and you'll get FlowFiles each having only a single 'dataFlow' element as their content.
07-11-2017
05:09 AM
Unfortunately, I'm not aware of any existing client app that can deserialize what the NiFi state manager stores. But since the source code of the ZookeeperStateProvider.deserialize method is available and it's not that complicated, you can write a simple app that connects to ZooKeeper, gets the value from the ZNode, and deserializes it.
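For a quick look at the raw (still serialized) bytes such an app would need to parse, the ZooKeeper CLI works too; a sketch, where the '/nifi' root node and the component UUID are assumptions that depend on your state-management.xml and flow:

# Dump the serialized state stored for one component
./bin/zkCli.sh -server zk-host:2181 get /nifi/components/<processor-uuid>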
07-10-2017
11:47 PM
1 Kudo
Hi @Gabriel Queiroz In order to submit a JIRA, please go to the login page and sign up for an account. Then you'll see a red 'Create' button. https://issues.apache.org/jira/login.jsp
07-10-2017
02:10 AM
Hello @Rohit Ravishankar ListFile uses the last modified timestamp of files. It tracks the latest last modified timestamp in order to pick up files newly modified since its previous run. So, if a file is added with an older last modified timestamp than one ListFile has already picked, the file won't be picked up by ListFile's logic. There is an existing JIRA discussing this behavior [1]. The Min/Max filters are used to filter out files that are too young (min) or too old (max). Even if a file passes these conditions, it won't be picked if its last modified timestamp is older than the latest one already listed. If your use case requires processing input files in 'descending last modified timestamp' order, then I'd recommend the combination of GetFile (keepSourceFile = false) and PriorityAttributePrioritizer [2]. [1] https://issues.apache.org/jira/browse/NIFI-2383 [2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization I hope this helps.
07-10-2017
01:47 AM
3 Kudos
Hello @Gabriel Queiroz I'm surprised to find that there's no existing processor that removes a key from the distributed map cache. Would you submit a JIRA issue to request that functionality, if possible? In the meantime, when you encounter such a shortcoming, you can address it in most cases by writing a custom processor with ExecuteScript or InvokeScriptedProcessor. Those processors let you write a custom processor using your favorite scripting engine. I've written an example, using Groovy, that removes a key from the distributed map cache. I think it will work with your use case: https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772
07-07-2017
08:43 AM
1 Kudo
Hello @M R I was able to reproduce this behavior. The reason is that the timestamp-millis logicalType is not used as expected. When PutDatabaseRecord executes the SQL, 'UPDATED_DATE' is set as-is in its Long representation, so Oracle complains about it; Oracle expects a Date type. Debugging further, I found that Avro doesn't read the logicalType information if the type is defined as a String JSON node. Your schema text is defined as follows:

{
  "name": "UPDATED_DATE",
  "type": "long",
  "logicalType": "timestamp-millis"
}

This way, 'logicalType' exists on the Field object, not on the 'type'. Since the 'type' element is textual, the Avro parser doesn't decorate it. To correctly annotate the type with the logicalType, it has to be:

{
  "name": "UPDATED_DATE",
  "type": {
    "type": "long",
    "logicalType": "timestamp-millis"
  }
}

Now the 'type' element is a JSON object, and the Avro parser uses the 'logicalType' definition. Then it works as expected.
07-05-2017
07:30 AM
Hi @Paul Yang I am not aware of any specific reason for not fixing the bug. Unfortunately, the original effort to fix the issue had been left inactive until now. I've picked it up again and proposed a fix against the latest Apache NiFi codebase. Hopefully it can be merged soon. https://github.com/apache/nifi/pull/1976 Thanks again for reporting this issue!
07-05-2017
12:59 AM
1 Kudo
Hi, If the node cannot be recovered, then you need to remove it from the cluster in order to continue operations without it. You can do so from the menu (at the top right) -> Cluster.