
NiFi ConvertJSONToAvro is not routing errors to failure relationship

Super Collaborator

Hi,

I am trying to take our daily HDFS logs from Ranger, convert them to Avro, and create a Hive table on top of them for reporting. I used InferAvroSchema to produce the Avro schema and was able to convert almost all of the old logs, except a few that fail with the messages below.

Sometimes with this error:

2017-08-08 10:25:27,983 WARN [Timer-Driven Process Thread-7] o.a.n.c.t.ContinuallyRunProcessorTask
java.lang.RuntimeException: Unexpected end-of-input in VALUE_STRING at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@2c191ce8; line: 2871, column: 2864]
    at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:196) ~[jackson-databind-2.6.1.jar:2.6.1]
    at org.kitesdk.shaded.com.google.common.collect.Iterators$8.next(Iterators.java:811) ~[na:na]
    at org.kitesdk.data.spi.filesystem.JSONFileReader.next(JSONFileReader.java:121) ~[na:na]
    at org.apache.nifi.processors.kite.ConvertJSONToAvro$1.process(ConvertJSONToAvro.java:148) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2578) ~[nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.processors.kite.ConvertJSONToAvro.onTrigger(ConvertJSONToAvro.java:139) ~[na:na]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) ~[nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_112]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112]
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input in VALUE_STRING at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@2c191ce8; line: 2871, column: 2864]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:470) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:466) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserBase.loadMoreGuaranteed(ParserBase.java:459) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2389) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:285) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:233) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:69) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:192) ~[jackson-databind-2.6.1.jar:2.6.1]
    ... 17 common frames omitted

And sometimes with this error:

2017-08-08 10:01:19,391 ERROR [Timer-Driven Process Thread-2] o.a.n.processors.kite.ConvertJSONToAvro ConvertJSONToAvro[id=0995e03c-40f5-4156-a065-4cda05b4efa1] ConvertJSONToAvro[id=0995e03c-40f5-4156-a065-4cda05b4efa1] failed to process due to java.lang.RuntimeException: Unexpected end-of-input in field name at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@ccb3a65; line: 440072, column: 2258]; rolling back session: java.lang.RuntimeException: Unexpected end-of-input in field name at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@ccb3a65; line: 440072, column: 2258]
2017-08-08 10:01:19,392 ERROR [Timer-Driven Process Thread-2] o.a.n.processors.kite.ConvertJSONToAvro
java.lang.RuntimeException: Unexpected end-of-input in field name at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@ccb3a65; line: 440072, column: 2258]
    at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:196) ~[jackson-databind-2.6.1.jar:2.6.1]
    at org.kitesdk.shaded.com.google.common.collect.Iterators$8.next(Iterators.java:811) ~[kite-data-core-1.0.0.jar:na]
    at org.kitesdk.data.spi.filesystem.JSONFileReader.next(JSONFileReader.java:121) ~[kite-data-core-1.0.0.jar:na]
    at org.apache.nifi.processors.kite.ConvertJSONToAvro$1.process(ConvertJSONToAvro.java:148) ~[nifi-kite-processors-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2578) ~[nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.processors.kite.ConvertJSONToAvro.onTrigger(ConvertJSONToAvro.java:139) ~[nifi-kite-processors-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.0.0-165.jar:1.1.0.2.1.0.0-165]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_112]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_112]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112]
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input in field name at [Source: org.apache.nifi.controller.repository.io.FlowFileAccessInputStream@ccb3a65; line: 440072, column: 2258]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:470) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.parseEscapedName(UTF8StreamJsonParser.java:1966) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.slowParseName(UTF8StreamJsonParser.java:1867) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(UTF8StreamJsonParser.java:1651) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextFieldName(UTF8StreamJsonParser.java:1007) ~[jackson-core-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:219) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:69) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277) ~[jackson-databind-2.6.1.jar:2.6.1]
    at com.fasterxml.jackson.databind.MappingIterator.next(MappingIterator.java:192) ~[jackson-databind-2.6.1.jar:2.6.1]
    ... 17 common frames omitted

So it looks like those files are not in the correct format. I was expecting NiFi to route them to the failure relationship so that we could examine them or handle them some other way. Instead, it keeps them in the queue, which makes it try to process the same files again and again. How can I solve this?

[Attached screenshot: 23504-converttoavro.jpg]

2 REPLIES

It looks like an unexpected exception is being thrown, and since the processor isn't catching it, the framework rolls back the session, which puts the flow file being processed back into the queue it was taken from. This isn't ideal for this case; it would be preferable to improve the processor so that it handles the error and routes the flow file to failure.
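
To illustrate the difference between the rollback described above and routing to failure, here is a toy model in plain Python. It is only a sketch and does not use NiFi's API: `convert`, `run_once`, the `routes` dictionary, and the `catch_errors` flag are all hypothetical names, and the input is assumed to be one JSON object per line.

```python
import json
from collections import deque


def convert(content):
    """Parse newline-delimited JSON; raises ValueError on malformed input."""
    return [json.loads(line) for line in content.splitlines() if line.strip()]


def run_once(queue, routes, catch_errors):
    """Process one item from the queue, mimicking a single processor trigger."""
    item = queue.popleft()
    try:
        routes["success"].append(convert(item))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if catch_errors:
            # Handled error: the item leaves the queue via "failure".
            routes["failure"].append(item)
        else:
            # Unhandled error, modeled as a rollback: the item goes back
            # to the front of the queue it came from and is retried forever.
            queue.appendleft(item)
```

With `catch_errors=False`, a truncated item stays at the head of the queue no matter how many times `run_once` is called, which matches the retry loop seen in the logs. With `catch_errors=True`, the same item ends up in `routes["failure"]` and the queue drains.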

In the meantime, you can stop the convert processor, right-click the queue, and list its contents; from the listing you should be able to download the contents of the flow files if you want to inspect or analyze them. You can also right-click and clear the queue to get the problem flow files out of there.
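
Once a problem flow file has been downloaded, a short script can point at the malformed records. This is a minimal sketch that assumes the file holds one JSON object per line (an assumption about the log format, not something confirmed in this thread); `find_bad_records` is a hypothetical helper name.

```python
import json


def find_bad_records(text):
    """Return (line_number, message) for every line that is not valid JSON."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            bad.append((number, f"{exc.msg} at column {exc.colno}"))
    return bad
```

For example, read the downloaded file into a string with `open(path, encoding="utf-8").read()` and pass it to `find_bad_records`. The reported line numbers can then be compared with the `line:` values in the Jackson error messages.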

If you are able to upgrade to NiFi 1.3.0, you may have better results with the ConvertRecord processor, which is not based on the Kite library.

Super Collaborator

Yes, that (right-click, then download or clear) is what I am doing currently. But we cannot have this manual dependency in a production environment, right? That's why I was checking to see if there is a better way.

We are planning to migrate to HDF 3.0 soon; hopefully it works there.