Member since: 05-27-2016
Posts: 15
Kudos Received: 4
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 692 | 03-28-2017 05:55 PM
09-06-2017
08:04 PM
To resolve this issue, we ended up installing a Kafka service in the HDF cluster where SAM resides.
09-06-2017
07:59 PM
1 Kudo
This worked nicely. Thanks Yash!
08-29-2017
12:12 AM
1 Kudo
I have a text file I'm reading into a NiFi flow, which consists of key-value pairs that look like the following:

status:"400" body_bytes_sent:"174" referer:"google.com" user_agent:"safari" host:"8.8.4.4" query_string:"devices"
status:"400" body_bytes_sent:"172" referer:"yahoo.com" user_agent:"Chrome" host:"8.8.4.3" query_string:"books"

Currently the TailFile processor is successfully reading these files as they are created and appended to. However, I want to output them to Kafka as Avro files. Any idea which processor(s) I need to convert these text files into Avro format in my flow? What would the configuration look like for those processors?
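Not a NiFi configuration, but as a reference for the target transformation, here is a minimal Python sketch of the parse-and-serialize step the flow needs to perform. The schema (all string fields, inferred from the sample lines above), the record name, and the use of the fastavro library are assumptions for illustration, not a confirmed answer.

```python
import re
from io import BytesIO
from fastavro import writer, parse_schema  # assumed available: pip install fastavro

# Hypothetical schema inferred from the sample lines; all values kept as strings.
schema = parse_schema({
    "name": "LogRecord", "type": "record",
    "fields": [{"name": f, "type": "string"} for f in
               ["status", "body_bytes_sent", "referer",
                "user_agent", "host", "query_string"]],
})

def line_to_record(line):
    # Each line is a series of key:"value" pairs; pull them out with a regex.
    return dict(re.findall(r'(\w+):"([^"]*)"', line))

sample = ('status:"400" body_bytes_sent:"174" referer:"google.com" '
          'user_agent:"safari" host:"8.8.4.4" query_string:"devices"')

buf = BytesIO()
writer(buf, schema, [line_to_record(sample)])  # Avro container bytes, ready to publish
print(len(buf.getvalue()), "bytes of Avro")
```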
Labels:
- Apache NiFi
08-24-2017
07:15 PM
I have a SAM instance which connects to a remote Kafka instance. The Kafka instance was added manually to the Environment and Service Pool in SAM before being used in a SAM application. In my application I can drag it onto the canvas, but when I double-click on it to configure it, SAM throws an on-screen error: "An exception with message [com.hortonworks.streamline.streams.catalog.exception.ServiceConfigurationNotFoundException: Configuration [kafka-env] not found for service [KAFKA] in cluster with id [5]] was thrown while processing request." A more detailed error message from the logs:

ERROR [14:34:31.032] [dw-352 - GET /api/v1/catalog/streams/componentbundles/SOURCE/1/hints/namespaces/1] c.h.s.s.s.TopologyComponentBundleResource - Got exception: [RuntimeException] / message [com.hortonworks.streamline.streams.catalog.exception.ServiceConfigurationNotFoundException: Configuration [kafka-env] not found for service [KAFKA] in cluster with id [5]] / related resource location: [com.hortonworks.streamline.streams.service.TopologyComponentBundleResource.getFieldHints](TopologyComponentBundleResource.java:539)
java.lang.RuntimeException: com.hortonworks.streamline.streams.catalog.exception.ServiceConfigurationNotFoundException: Configuration [kafka-env] not found for service [KAFKA] in cluster with id [5]
at com.hortonworks.streamline.streams.cluster.bundle.AbstractKafkaBundleHintProvider.getSecurity(AbstractKafkaBundleHintProvider.java:37)
at com.hortonworks.streamline.streams.cluster.bundle.AbstractSecureBundleHintProvider.provide(AbstractSecureBundleHintProvider.java:32)
at com.hortonworks.streamline.streams.service.TopologyComponentBundleResource.getFieldHints(TopologyComponentBundleResource.java:539)
...
Labels:
- Apache Kafka
04-18-2017
02:26 PM
Does anybody have a workaround for this issue? Restarting the NiFi processor manually is not feasible for a flow that should run "lights out".
03-28-2017
05:55 PM
1 Kudo
Answering my own question. This was happening because of the load balancer between NiFi and Splunk. Hitting Splunk directly resolved the issue.
03-28-2017
03:21 PM
1 Kudo
Is it possible that complex Splunk queries are too much for the GetSplunk processor? These queries are a few hundred lines long, but run fine in the Splunk GUI. The error is below; any recommendations on how to troubleshoot?

10:11:06 EST ERROR f46c3d86-5571-146c-a8ef-071da5f520e6
denatb3wlwbl08.cloud.myco.org:8443
GetSplunk[id=f46c3d86-5571-146c-a8ef-071da5f520e6] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from GetSplunk[id=f46c3d86-5571-146c-a8ef-071da5f520e6]: java.net.SocketException: Connection reset
Labels:
- Apache NiFi
03-08-2017
04:52 PM
Having issues with the GenerateTableFetch and QueryDatabaseTable processors. They are not returning any data, and I see the error "Failed to retrieve observed maximum values from the State Manager" in the NiFi processor error tooltip. If I use the ExecuteSQL processor with the same DB connection pool against the same table, everything works fine, which makes me think it's not a connectivity issue to the database. I see the following error in my nifi-app.log:

2017-03-08 11:42:09,908 ERROR [Timer-Driven Process Thread-5] o.a.n.p.standard.GenerateTableFetch GenerateTableFetch[id=aec91163-015a-1000-ffff-fffffd91354d] Failed to retrieve observed maximum values from the State Manager. Will not perform query until this is accomplished.
java.io.IOException: Failed to obtain value from ZooKeeper for component with ID aec91163-015a-1000-ffff-fffffd91354d
at org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider.getState(ZooKeeperStateProvider.java:423) ~[nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.state.StandardStateManager.getState(StandardStateManager.java:63) ~[nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.processors.standard.GenerateTableFetch.onTrigger(GenerateTableFetch.java:137) ~[nifi-standard-processors-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.1.jar:1.1.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.net.UnknownHostException: >denatb3wlwbl07.cloud.my-company.org
at java.net.InetAddress.getAllByName0(InetAddress.java:1280) ~[na:1.8.0_121]
at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[na:1.8.0_121]
at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[na:1.8.0_121]
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider.getZooKeeper(ZooKeeperStateProvider.java:170) ~[nifi-framework-core-1.1.1.jar:1.1.1]
at org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider.getState(ZooKeeperStateProvider.java:403) ~[nifi-framework-core-1.1.1.jar:1.1.1]
... 13 common frames omitted
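Worth noting: the UnknownHostException in the trace shows a leading `>` on the hostname, which suggests a stray character in the ZooKeeper Connect String that NiFi's ZooKeeperStateProvider reads from conf/state-management.xml. A quick sanity-check sketch for such a connect string; the validation logic is mine, only the suspect hostname comes from the trace above:

```python
import re

def check_zk_connect_string(connect_string):
    """Flag entries in a ZooKeeper connect string that are not plain host[:port]."""
    host_port = re.compile(r"^[A-Za-z0-9.-]+(:\d+)?$")
    problems = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not host_port.match(entry):
            problems.append(entry)
    return problems

# The hostname from the stack trace above, with its suspicious leading '>'.
print(check_zk_connect_string(">denatb3wlwbl07.cloud.my-company.org:2181"))
# -> ['>denatb3wlwbl07.cloud.my-company.org:2181']
```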
Labels:
- Apache NiFi
09-28-2016
06:47 PM
I'm attempting to write a Parquet file to an S3 bucket, but I'm getting the error below:

py4j.protocol.Py4JJavaError: An error occurred while calling o36.parquet.
: java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManager.<init>(Lcom/amazonaws/services/s3/AmazonS3;Ljava/util/concurrent/ThreadPoolExecutor;)V
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:287)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:453)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:488)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)

The line of Python code that fails is:

df.write.parquet("s3a://myfolder/myotherfolder")

The same line of code works successfully if I write to HDFS instead of S3:

df.write.parquet("hdfs://myfolder/myotherfolder")

I'm using the spark-2.0.2-bin-hadoop2.7 and aws-java-sdk-1.11.38 binaries. Right now I'm running it interactively in PyCharm on my Mac.
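For what it's worth, a NoSuchMethodError on the TransferManager constructor usually points to a binary mismatch between hadoop-aws and the AWS SDK: hadoop-aws 2.7.x was compiled against aws-java-sdk 1.7.4, so the newer 1.11.38 jar exposes a different constructor signature. A hedged sketch of pinning a matching pair at session startup; the package coordinates are my assumption of a compatible combination, not verified against this exact setup (S3 credentials configuration is omitted):

```python
from pyspark.sql import SparkSession

# Sketch only: hadoop-aws 2.7.3 was built against aws-java-sdk 1.7.4, so the
# two are pinned together here instead of the newer aws-java-sdk-1.11.38 jar.
spark = (SparkSession.builder
         .appName("s3a-parquet-sketch")
         .config("spark.jars.packages",
                 "org.apache.hadoop:hadoop-aws:2.7.3,"
                 "com.amazonaws:aws-java-sdk:1.7.4")
         .getOrCreate())

df = spark.range(10)  # stand-in DataFrame
df.write.parquet("s3a://myfolder/myotherfolder")  # path from the post above
```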
Labels:
- Apache Spark
08-15-2016
02:12 PM
I have an NFS read-only share that contains thousands of unique files (all with unique names) which I want to process with NiFi. New files are constantly added to this share by another process. Is there any way to keep track of which files NiFi has already processed in a prior run, so I don't process them again? I cannot make any changes to files on this NFS share (cannot delete or rename them).
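One general pattern, independent of any particular NiFi processor, is to keep a persistent ledger of filenames already seen and skip them on each listing pass; since the filenames are unique, the name alone can serve as the key. A minimal Python sketch of that logic, with the ledger and mount paths as hypothetical placeholders:

```python
import os

LEDGER = "/var/lib/myflow/processed_files.txt"  # hypothetical ledger location
SHARE = "/mnt/nfs_share"                        # hypothetical NFS mount point

def unprocessed_files(share_dir, ledger_path):
    """Return files in share_dir whose names are not yet in the ledger."""
    seen = set()
    if os.path.exists(ledger_path):
        with open(ledger_path) as f:
            seen = {line.strip() for line in f}
    return [name for name in os.listdir(share_dir) if name not in seen]

def mark_processed(name, ledger_path):
    # Append-only record, so the share itself is never modified.
    with open(ledger_path, "a") as f:
        f.write(name + "\n")

for name in unprocessed_files(SHARE, LEDGER):
    # ... process the file read-only, then record it ...
    mark_processed(name, LEDGER)
```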
Labels:
- Apache NiFi
08-01-2016
09:20 PM
I'm building a NiFi flow with the NiFi GUI. As part of the flow I have a series of flat files I'm ingesting, which contain lines that I don't want in my data flow. These lines all start with the hash/pound symbol (#). Any ideas how to filter these lines out? I was thinking of a RouteOnContent processor, but I'm not sure how to make it filter out lines.
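For reference, the filtering itself is a one-line predicate. This Python sketch shows the line-level logic a content-routing processor would need to apply; the comment-prefix convention is taken from the question, and the sample data is illustrative:

```python
def strip_comment_lines(text):
    """Drop lines whose first non-whitespace character is '#'."""
    return "\n".join(line for line in text.splitlines()
                     if not line.lstrip().startswith("#"))

sample = "# header comment\nreal,data,row\n  # indented comment\nanother,row"
print(strip_comment_lines(sample))
# real,data,row
# another,row
```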
Labels:
- Apache NiFi