Member since: 08-14-2017
Posts: 8
Kudos Received: 0
Solutions: 0
10-23-2017
05:10 AM
Hi @Slim Given that this dataset is already loaded into Hive, and the Hive table will be updated occasionally*, what are my chances of using Druid to index this data and Superset to visualise it (without replicating the data in Druid)? Would you recommend this approach? *Will Druid automatically update its indexes when data is added to Hive?
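From what I've read so far (and I may be wrong), a Druid-backed Hive table does not pick up new Hive rows on its own, and new data would have to be pushed in explicitly, something like this sketch (table and column names are invented):

```sql
-- Hypothetical re-indexing step: push newly arrived Hive rows into the
-- Druid-backed table. This would need to be scheduled, e.g. after each
-- batch load into the source table.
INSERT INTO TABLE clickstream_druid
SELECT
  CAST(event_time AS timestamp) AS `__time`,
  user_id,
  event_type,
  page_url
FROM clickstream
WHERE event_date = '2017-10-23';
```

Is that the recommended pattern, or is there a way to have the indexes refresh automatically?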
10-13-2017
06:42 AM
Thanks a lot for the quick reply. Let me do a setup with a smaller data set and get back to you with some questions. 😄
10-12-2017
05:42 AM
Referring to this article https://hortonworks.com/blog/apache-hive-druid-part-1-3/ by @Carter Shanklin: we have a 12TB+ (growing ~8GB per day) dataset of clickstream data (user events) in Hive. The use case is to run OLAP queries across the dataset, for now mostly GROUP BY. How will this combination perform on a dataset of this size? Also, how production-ready is the combination?
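For reference, this is the kind of setup and query I have in mind, based on the linked article (a sketch only; column and table names are made up):

```sql
-- Druid-backed table created from the existing Hive clickstream table.
-- `__time` is the timestamp column Druid requires; the granularity
-- properties control segment layout and rollup.
CREATE TABLE clickstream_druid
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "DAY",
  "druid.query.granularity" = "HOUR"
)
AS SELECT
  CAST(event_time AS timestamp) AS `__time`,
  user_id,
  event_type,
  page_url
FROM clickstream;

-- A typical OLAP query we would run against it:
SELECT event_type, COUNT(*) AS events
FROM clickstream_druid
GROUP BY event_type;
```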
Labels: Apache Hive
09-21-2017
07:22 AM
@Sriharsha Chintalapani I am getting the same error; a cluster update didn't solve the issue. I'm trying to send plain CSV (roshan,22) to Streamline via NiFi (which converts the CSV to Avro): Kafka > NiFi > Kafka. From Storm I'm getting an error similar to the one above: com.hortonworks.registries.schemaregistry.serde.SerDesException: Unknown protocol id [114] received while deserializing the payload at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapsh My flow is as follows: one branch goes directly to a Kafka topic, and the other serializes the data and publishes it to Kafka. After running data through PublishKafkaRecord, the output simply removes the comma (roshan,22 turns into roshan22) and the above-mentioned error appears in Storm. I'm really new to this stack; any help would be appreciated. Avro schema:

{
  "type": "record",
  "name": "user",
  "fields": [
    { "name": "name", "type": "string", "default": null },
    { "name": "age", "type": "string", "default": null }
  ]
}
Attached configurations: UpdateAttribute processor, PublishKafkaRecord processor, CSVReader controller service, AvroRecordSetWriter controller service.
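One observation that might help others hitting this: as far as I can tell, the registry deserializer reads the first byte of the Kafka message as the protocol id, and 114 happens to be the ASCII code for 'r', the first character of my test record "roshan,22". If that reading is right, the consumer was handed the raw CSV text rather than registry-framed Avro. A quick sanity check (the helper function is mine, not part of any library):

```python
def looks_like_raw_text(reported_protocol_id: int, record: str) -> bool:
    """Return True when the 'unknown protocol id' from the registry
    deserializer equals the ASCII code of the record's first character,
    suggesting the payload is plain text, not registry-framed Avro."""
    return reported_protocol_id == ord(record[0])

print(looks_like_raw_text(114, "roshan,22"))  # -> True: 114 == ord('r')
```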
09-20-2017
05:56 AM
@mkalyanpur CSVReader 1.2.0.3.0.1.1-5 & AvroRecordSetWriter 1.2.0.3.0.1.1-5 are configured as follows, and my Avro schema in the registry is similar to this, with a bunch of additional string fields:

{
  "type": "record",
  "name": "tracking_sdk_event",
  "fields": [
    { "name": "timeStamp", "type": "long", "default": null },
    { "name": "isoTime", "type": "string", "default": null }
  ]
}

@Bryan Bende After changing the "Schema Write Strategy" to "Hortonworks Content Encoded Schema Reference", I'm getting an error with the timeStamp field. I have attached an image of it.
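An aside that may or may not be the cause of the timeStamp error: per the Avro specification, a field whose default is null must be declared as a union with "null" as the first branch; a bare "long" with "default": null is not a valid combination. A corrected field would look like this (sketched with the stdlib json module only, since I have not yet verified this against the registry):

```python
import json

# Per the Avro spec, a null default requires a ["null", <type>] union,
# with "null" listed first.
corrected_field = {
    "name": "timeStamp",
    "type": ["null", "long"],
    "default": None,  # serializes to JSON null
}

print(json.dumps(corrected_field))
```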
09-19-2017
06:22 AM
I'm trying to upgrade an existing visualization solution (Kafka > Flink > Druid > Superset) to work with HWX SAM & Schema Registry. Currently NiFi works as an HTTP proxy to collect events and push them to Kafka; I'm trying to convert the events (CSV) to Avro at this stage and push them to Kafka so that SAM can consume them. The output of SplitContent is something similar to "abc,def,ghi,jkl,,". I'm getting this error in the Storm UI: com.hortonworks.registries.schemaregistry.serde.SerDesException: Unknown protocol id [49] received while deserializing the payload at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapsho Is there something I should pay closer attention to when processing CSV? Any troubleshooting recommendations?
Labels: Apache Storm