Created 10-09-2016 02:30 PM
I am using the ConvertJSONToAvro + PutHiveStreaming processors:
JSON:
{ "name": "张三", "num": "2", "score": "3.4", "newtime": "2016-03-01 10:10:10" }
Avro schema:
{ "name" : "newsInfo", "type" : "record", "fields" : [{"name" : "name", "type" : "string"}, {"name" : "num", "type" : "int"}, {"name" : "score", "type" : "double"}, {"name" : "newtime", "type" : "string"}] }
but I got this error:
2016-10-09 10:28:05,503 WARN [put-hive-streaming-0] org.apache.hive.hcatalog.data.JsonSerDe Error [org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]] parsing json text [{"name": "张三", "num": "2", "score": "3.4", "newtime": "2016-03-01 10:10:10"}].
2016-10-09 10:28:05,503 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=d50d1499-3137-1226-89c0-86dfeac7bf2c] Error writing record to Hive Streaming transaction
2016-10-09 10:28:05,505 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.hive.PutHiveStreaming
org.apache.hive.hcatalog.streaming.SerializationError: {metaStoreUri='thrift://hive1.wdp:9083', database='newsinfo', table='test1', partitionVals=[] } SerializationError
at org.apache.nifi.util.hive.HiveWriter.write(HiveWriter.java:119) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:480) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1880) ~[na:na]
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:394) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) ~[na:na]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) ~[na:na]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) ~[na:na]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) ~[na:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
Caused by: org.apache.hive.hcatalog.streaming.SerializationError: Unable to convert byte[] record into Object
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.encode(StrictJsonWriter.java:117) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.write(StrictJsonWriter.java:78) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.write(HiveEndPoint.java:632) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.nifi.util.hive.HiveWriter$1.call(HiveWriter.java:113) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.util.hive.HiveWriter$1.call(HiveWriter.java:110) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
... 3 common frames omitted
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:179) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.encode(StrictJsonWriter.java:115) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
... 8 common frames omitted
Caused by: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserBase._parseNumericValue(JsonParserBase.java:766) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserBase.getIntValue(JsonParserBase.java:622) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.apache.hive.hcatalog.data.JsonSerDe.extractCurrentField(JsonSerDe.java:279) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.data.JsonSerDe.populateRecord(JsonSerDe.java:218) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:174) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
... 9 common frames omitted
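The Jackson error above points at line 1, column 28 of the record, which lines up with the quoted value "2" for num (note getIntValue in the trace): JsonSerDe is reading a JSON string where the Hive table expects a numeric column. As an illustration (assuming the Hive columns num and score are int and double), a record that should parse would carry the numeric values without quotes:
{ "name": "张三", "num": 2, "score": 3.4, "newtime": "2016-03-01 10:10:10" }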
Created 01-27-2017 11:57 PM
Avro in HDF is 1.7.7, and timestamp support was only introduced in Avro 1.8.x. I would suggest treating the timestamp field as a string.
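In schema terms, that just means leaving the field declared as a plain string, e.g.:
{ "name" : "newtime", "type" : "string" }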
Created 10-18-2016 08:50 PM
What did you use to create the Avro schema? Not sure if you knew, but NiFi provides an InferAvroSchema processor, which may help define the types.
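For what it's worth, inference run against the sample record above would presumably type every field as string, since all of the values are quoted, e.g. (hypothetical output):
{ "name" : "newsInfo", "type" : "record", "fields" : [{"name" : "name", "type" : "string"}, {"name" : "num", "type" : "string"}, {"name" : "score", "type" : "string"}, {"name" : "newtime", "type" : "string"}] }
So to infer int and double for num and score, the input JSON would need those values unquoted.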
Created 01-27-2017 02:55 PM
@bhagan: Is timestamp a supported Avro datatype?
Created 01-27-2017 03:12 PM
Arun, yes, timestamp is a supported datatype. Here is a link to the doc:
Created 01-30-2017 04:33 PM
@bhagan: Sorry, I was asking whether the timestamp data type could be used in NiFi's Avro-related processors. Unfortunately it cannot with NiFi 1.1.1 (the version I am using), because Avro there is still 1.7 and not yet updated to 1.8. I believe this is what @Artem Ervits meant by his answer.
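For reference, once a NiFi build ships Avro 1.8.x, a timestamp could be declared as a logical type annotating a long, e.g. (Avro 1.8 syntax, not accepted by the Avro 1.7.7 bundled with NiFi 1.1.1):
{ "name" : "newtime", "type" : { "type" : "long", "logicalType" : "timestamp-millis" } }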
Created 02-01-2017 03:25 AM
@boyer if that answers your question, please accept it as best.