About Chandra

Chandra · ‎11-03-2017

Thanks very much. I see now what's going on. I tried both of your suggestions, and both seem to work well.

Chandra · ‎09-12-2017

Figured out that I had to use a Dataset to make sure checkpointing works.

Chandra · ‎08-03-2017

Thanks much for your response.

rhardaway · ‎08-02-2017

It doesn't seem like streaming data directly to HDFS will make it very easy to find/aggregate at the end of each window. What about creating a key/value store (with Redis, HBase, or Elasticsearch, for example) and using it to look up all the keys associated with each window?

lweichberger · ‎06-09-2017

I wrote about this in my Spark Structured Streaming blog here: https://www.linkedin.com/pulse/spark-21-structured-streaming-databricks-laurent-weichberger

See this sample:

    val query = inactive.writeStream
      .format("parquet")
      .option("path", "/com/infotrellis/spark")
      .option("checkpointLocation", "/com/infotrellis/check")
      .start()

    query.awaitTermination()

Chandra · ‎11-03-2017

Hi @Greg Keys, thanks for the post. Row filtering works when it is based on the values of columns that are not at the end of the row, but I am not sure how to filter rows based on the value of the last column. Can you please let me know? Thanks.

Chandra · ‎10-05-2016

Thank you very much for your insight.

aervits · ‎10-01-2016

Apache NiFi is more feature-rich and battle-tested, and it serves many purposes. Simply put, it has bidirectional flow, whereas Flume only moves data to HDFS. There's also a visual UI for real-time command and control, as opposed to Flume, which has only configuration property files to deal with. If you are in the beginning stages, do yourself a favor and go with NiFi.

Chandra · ‎09-02-2016

Also, what is the need to run Hive queries on Spark SQL when Hive on Tez can run much faster?

Chandra · ‎06-15-2016

Thanks much.

Member Since	‎09-03-2015 08:31 AM
Last Visited	‎09-27-2017 04:06 PM
Posts	50
Kudos received	8

Cloudera Community

Re: Failing Checkpoint Spark Streaming

Re: Spark Streaming Creating Small files in Hive

Re: Failing Checkpoint Spark Streaming

Re: Spark-sklearn integration

Re: Window Operations on Spark Streaming

Re: Error in Spark Streaming - Kafka integration S...

Re: NiFi ETL: Removing columns, filtering rows, ch...

Re: Stream Processor Selection

Re: Log Analysis - Data Ingestion

Re: Hive on Tez or Hive query using Spark SQL

Re: Name Node and Data Node Directories