How to set the timestamp data type in an Avro schema
Labels: Apache Hive, Apache NiFi
Created 10-09-2016 02:30 PM
I use the ConvertJSONToAvro + PutHiveStreaming processors:

JSON:
{ "name": "张三", "num": "2", "score": "3.4", "newtime": "2016-03-01 10:10:10" }

Avro schema:
{ "name": "newsInfo", "type": "record", "fields": [{"name": "name", "type": "string"}, {"name": "num", "type": "int"}, {"name": "score", "type": "double"}, {"name": "newtime", "type": "string"}] }

but I got this error:
2016-10-09 10:28:05,503 WARN [put-hive-streaming-0] org.apache.hive.hcatalog.data.JsonSerDe Error [org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]] parsing json text [{"name": "张三", "num": "2", "score": "3.4", "newtime": "2016-03-01 10:10:10"}].
2016-10-09 10:28:05,503 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.hive.PutHiveStreaming PutHiveStreaming[id=d50d1499-3137-1226-89c0-86dfeac7bf2c] Error writing record to Hive Streaming transaction
2016-10-09 10:28:05,505 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.hive.PutHiveStreaming
org.apache.hive.hcatalog.streaming.SerializationError: {metaStoreUri='thrift://hive1.wdp:9083', database='newsinfo', table='test1', partitionVals=[] } SerializationError
at org.apache.nifi.util.hive.HiveWriter.write(HiveWriter.java:119) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$onTrigger$4(PutHiveStreaming.java:480) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1880) ~[na:na]
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:394) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.0.0.jar:1.0.0]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) ~[na:na]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) ~[na:na]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) ~[na:na]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) ~[na:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
Caused by: org.apache.hive.hcatalog.streaming.SerializationError: Unable to convert byte[] record into Object
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.encode(StrictJsonWriter.java:117) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.write(StrictJsonWriter.java:78) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.write(HiveEndPoint.java:632) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
at org.apache.nifi.util.hive.HiveWriter$1.call(HiveWriter.java:113) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at org.apache.nifi.util.hive.HiveWriter$1.call(HiveWriter.java:110) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_101]
... 3 common frames omitted
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:179) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.streaming.StrictJsonWriter.encode(StrictJsonWriter.java:115) ~[hive-hcatalog-streaming-1.2.1.jar:1.2.1]
... 8 common frames omitted
Caused by: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3c18563c; line: 1, column: 28]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserBase._parseNumericValue(JsonParserBase.java:766) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.codehaus.jackson.impl.JsonParserBase.getIntValue(JsonParserBase.java:622) ~[jackson-core-asl-1.9.13.jar:1.9.13]
at org.apache.hive.hcatalog.data.JsonSerDe.extractCurrentField(JsonSerDe.java:279) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.data.JsonSerDe.populateRecord(JsonSerDe.java:218) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:174) ~[hive-hcatalog-core-1.2.1.jar:1.2.1]
... 9 common frames omitted
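Reading the trace, the failure is not about the timestamp at all: "Current token (VALUE_STRING) not numeric" at column 28 points at "num": "2" — every value in the JSON is a quoted string, while the Hive table's JsonSerDe expects real JSON numbers for the numeric columns. A minimal sketch of a pre-processing step that coerces those strings to numbers before conversion (the field names come from the post; the coerce_types helper is illustrative, not a NiFi component):

```python
import json

# Target types per field, matching the Avro schema in the post:
# num -> int, score -> double; name and newtime stay strings.
FIELD_TYPES = {"num": int, "score": float}

def coerce_types(record):
    """Convert quoted numeric strings into real JSON numbers."""
    return {k: FIELD_TYPES[k](v) if k in FIELD_TYPES else v
            for k, v in record.items()}

raw = '{"name": "张三", "num": "2", "score": "3.4", "newtime": "2016-03-01 10:10:10"}'
fixed = coerce_types(json.loads(raw))
print(json.dumps(fixed, ensure_ascii=False))
```

In a NiFi flow, the equivalent fix could be done upstream (for example with an ExecuteScript processor, or by producing the JSON with unquoted numeric values in the first place).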
Created 01-27-2017 11:57 PM
Avro in HDF is 1.7.7, and timestamps were only introduced in Avro 1.8.x. I would suggest treating the timestamp field as a string.
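For reference, on Avro 1.8.x (not the 1.7.7 bundled here) the field could instead be declared with the timestamp-millis logical type, a long counting milliseconds since the epoch. A sketch of what that schema would look like:

```json
{
  "name": "newsInfo",
  "type": "record",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "num", "type": "int"},
    {"name": "score", "type": "double"},
    {"name": "newtime", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
```

On Avro 1.7.x, keeping newtime as "type": "string" (as in the original schema) is the practical option.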
Created 10-18-2016 08:50 PM
What did you use to create the Avro schema? Not sure if you knew, but NiFi provides an InferAvroSchema processor, which may help define the types.
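The idea behind schema inference can be sketched in a few lines: look at the JSON types of a sample record and map them to Avro primitives. This is an illustrative toy, not NiFi's actual InferAvroSchema logic; note it can only infer int/double if the sample's values are unquoted numbers:

```python
import json

def infer_avro_type(value):
    """Map a JSON value to a plausible Avro primitive type.
    (Illustrative only -- not NiFi's actual InferAvroSchema logic.)"""
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    return "string"

sample = json.loads('{"name": "张三", "num": 2, "score": 3.4, "newtime": "2016-03-01 10:10:10"}')
fields = [{"name": k, "type": infer_avro_type(v)} for k, v in sample.items()]
schema = {"name": "newsInfo", "type": "record", "fields": fields}
```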
Created 01-27-2017 02:55 PM
@bhagan: Is timestamp a supported Avro data type?
Created 01-27-2017 03:12 PM
Arun, yes, timestamp is a supported data type. Here is a link to the doc:
Created 01-30-2017 04:33 PM
@bhagan: Sorry, I was asking whether the timestamp data type could be used in NiFi's Avro-related processors. Unfortunately not with NiFi 1.1.1 (the version I am using), because Avro there is still at 1.7 and has not yet been updated to 1.8. I believe this is what @Artem Ervits meant in his answer.
Created 02-01-2017 03:25 AM
@boyer if that answers your question, please accept it as best.
