Member since: 12-14-2015
Posts: 8
Kudos Received: 6
Solutions: 0
04-29-2016
05:22 PM
It's a good question. Assuming the source of entropy is good, the chance of a duplicate is essentially zero: randomUUID draws from 2^122 possible values (roughly 5.3 × 10^36), so collisions are negligible at any realistic data volume.

There are other ways too, and I assume there are ready-made solutions out there, but here is one old-fashioned MapReduce approach. Assuming you create all the UUIDs in one go and the data is stored in a delimited format, you can build a unique key from the long offset that TextInputFormat provides for each line. TextInputFormat emits each line of text together with a long offset (bytes from the start of the file, derived from the split offsets), so you can add that offset to a starting number (for example, a batch id that is steadily increased) and obtain a unique number that way.

There are definitely other ways to do it too, for example combining the MapReduce job id + task id + row-in-split id.
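The offset-plus-batch-id idea above can be sketched outside of an actual MapReduce job. This is only illustrative: the 40-bit split between batch id and offset, and the class and method names, are my assumptions, not part of any Hadoop API.

```java
import java.util.UUID;

public class OffsetIdSketch {
    // Pack a batch id into the high bits and the per-line byte offset
    // (what TextInputFormat hands a mapper as its LongWritable key) into
    // the low bits. Assumes offsets stay below 2^40 bytes per batch.
    static long makeId(long batchId, long byteOffset) {
        return (batchId << 40) | byteOffset;
    }

    public static void main(String[] args) {
        // Two lines of the same batch at different offsets get distinct ids.
        System.out.println(makeId(7, 0));
        System.out.println(makeId(7, 128));
        // For comparison: a random UUID drawn from 2^122 possible values.
        System.out.println(UUID.randomUUID());
    }
}
```

In a real job the batch id would come from the job configuration and the offset from the mapper's input key; the 40-bit split is only a placeholder you would size for your data.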
01-18-2016
05:26 PM
@Raghavendran Chellappa the release notes for HDP are the source of truth; Jira is external to Hortonworks. If the release notes say we support it, then that's the way to go. What is it that you find not working? We ship Kafka 0.8.2.0 as stable, not beta, and Kafka 0.9 also works. We usually backport critical features. Also, have you tried NiFi? It has all the latest Kafka support, including Kerberos.
01-07-2016
04:10 PM
Actually, many BI vendors, including Tableau, have announced a Spark connector over JDBC, which should be able to leverage data loaded into RDDs in memory. If you load data via Spark Streaming into an RDD, then either schematize it (rdd.registerTempTable) or convert it to a DataFrame (rdd.toDF), you should be able to query that data over a JDBC connection and display it in a dashboard. Here is info on the Tableau connector, including a video at the bottom of the page: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&so...
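A rough sketch of the JDBC path described above, from the client side. The host, port, and table name are assumptions for illustration; Spark's Thrift Server speaks the hive2 JDBC protocol, which is what BI tools connect through.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkJdbcSketch {
    // Build a Thrift Server JDBC URL (hive2 protocol).
    static String thriftUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port + "/default";
    }

    // A BI tool would issue queries like this against a table registered
    // in Spark via rdd.registerTempTable(...) or a DataFrame.
    static long countRows(Connection conn, String table) throws Exception {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT count(*) FROM " + table)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) {
        // 10015 is a port commonly used for the Thrift Server on HDP;
        // adjust for your cluster. Actually connecting also requires a
        // running Thrift Server and the Hive JDBC driver on the classpath:
        //   Connection conn = DriverManager.getConnection(thriftUrl("localhost", 10015));
        System.out.println(thriftUrl("localhost", 10015));
    }
}
```

The point is that once the streaming data is registered as a temp table or DataFrame, it looks like any other SQL source to a JDBC client such as Tableau.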