Member since: 07-21-2014
Posts: 141
Kudos Received: 8
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2771 | 02-01-2017 04:49 PM |
|  | 2080 | 01-15-2015 01:57 PM |
|  | 2729 | 01-05-2015 12:59 PM |
01-15-2015 12:52 PM
I'm using the patch from FLUME-2578, but when I pass the patched jar via the --classpath option of flume-ng, the old jar is still picked up instead of the patched one, since the --classpath entries are appended after the stock jars. Is there a way to override the stock jar, or to prepend the patched jar to the Java classpath, when running the Flume agent? Thanks!
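Since the flume-ng script appends --classpath entries after its own lib jars, the stock classes win. A minimal workaround sketch, assuming a typical package install under /usr/lib/flume-ng (the path and jar names are illustrative, not confirmed): swap the stock jar for the patched build so only one copy is ever on the classpath.

```sh
# Workaround sketch (paths and jar names are assumptions for a typical
# package install): replace the stock jar with the patched build so the
# old classes can't shadow the patched ones; keep a backup of the original.
FLUME_LIB=/usr/lib/flume-ng/lib
mv "$FLUME_LIB/flume-ng-core-1.5.0-cdh5.3.0.jar" /tmp/flume-ng-core.jar.orig
cp flume-ng-core-1.5.0-patched.jar "$FLUME_LIB/"
```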
Labels:
- Apache Flume
- Apache Kafka
01-12-2015 10:34 PM
In Cloudera Manager, where can I find the versions of the services (Flume, Hive, Impala, etc.) installed in the cluster? Thanks!
01-05-2015 12:59 PM
Found the metrics under Cloudera Manager -> Flume -> Charts Library. Thanks!
01-05-2015 10:07 AM
I'm currently using Flume on CDH 5.3.0 with the Kite Dataset sink, storing data as Avro. I would like some insight into the data flowing into the cluster. How do I get metrics on data throughput, performance, etc.? Thanks!
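Aside from whatever the CM charts expose, Flume has built-in counter reporting; a sketch of enabling the HTTP JSON reporter at agent start (the agent name, config paths, and port below are placeholders):

```sh
# Sketch: start the agent with Flume's built-in HTTP JSON monitoring
# enabled; per-component counters (events received/drained, channel fill)
# are then served at http://<host>:34545/metrics. Agent name and config
# paths are placeholders for your own setup.
flume-ng agent \
  --name agent1 \
  --conf /etc/flume-ng/conf \
  --conf-file /etc/flume-ng/conf/flume.conf \
  -Dflume.monitoring.type=http \
  -Dflume.monitoring.port=34545
```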
Labels:
- Apache Flume
12-11-2014 04:10 PM
Does Impala 2.0 support nested data types such as MAP, or is that expected in future releases? Thanks!
Labels:
- Apache Impala
08-04-2014 12:16 PM
Thanks, Sean. I'm currently computing unique visitors per page by running a count distinct with Spark SQL. We also run non-Spark jobs on the cluster, so if we allocate the full 2GB I'm assuming we can't run any other jobs simultaneously. I'm also looking into how to set the storage levels in CM.
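On the storage-level question: as far as I know that's chosen per-RDD in application code rather than in CM. A minimal Scala sketch, assuming a spark-shell session where `sc` already exists (the HDFS path and record layout are hypothetical):

```scala
// Sketch only: the input path and tab-separated (page, visitor) layout
// are hypothetical placeholders.
import org.apache.spark.storage.StorageLevel

val visits = sc.textFile("hdfs:///data/page_visits")
// Spill serialized partitions to disk rather than recomputing (or OOM-ing)
// when the small executor heap fills up.
visits.persist(StorageLevel.MEMORY_AND_DISK_SER)
val uniqueVisitors = visits.map(_.split("\t")(1)).distinct().count()
```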
08-04-2014 10:38 AM
I have a 40-node CDH 5.1 cluster and am attempting to run a simple Spark app that processes about 10-15GB of raw data, but I keep running into this error: java.lang.OutOfMemoryError: GC overhead limit exceeded. Each node has 8 cores and 2GB of memory. I notice the heap size on the executors is set to 512MB, with the total set to 2GB. What does the heap size need to be set to for data of this size? Thanks for the input!
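For reference, a sketch of raising the executor heap at submit time (Spark 1.0 on YARN; the class name, jar, and sizes are illustrative, and with only 2GB of physical RAM per node the heap can't usefully go much past ~1GB):

```sh
# Sketch: request a larger executor heap when submitting to YARN.
# Class name, jar, and sizes are illustrative placeholders; 2GB of RAM
# per node caps how far --executor-memory can be raised.
spark-submit \
  --master yarn-client \
  --class com.example.VisitorStats \
  --executor-memory 1g \
  --num-executors 40 \
  visitor-stats-1.0.jar
```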
07-28-2014 10:43 AM
After removing the import, I was able to compile the package successfully.
07-28-2014 10:38 AM
Thanks, Sean. Now I get this:
error: object SQLContext is not a member of package org.apache.spark.sql
[INFO] Note: class SQLContext exists, but it has no companion object.
[INFO] import org.apache.spark.sql.SQLContext._
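For anyone hitting the same note: in Spark 1.0, SQLContext has no companion object, so its implicits have to be imported from an instance rather than from the class. A minimal sketch (the app and object names are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SqlApp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SqlApp"))
    // SQLContext has no companion object in Spark 1.0, so import the
    // implicits (e.g. createSchemaRDD) from an instance, instead of
    // `import org.apache.spark.sql.SQLContext._`.
    val sqlContext = new SQLContext(sc)
    import sqlContext._
    // ... define case-class RDDs, register them as tables, run queries ...
    sc.stop()
  }
}
```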
07-28-2014 10:28 AM
I'm creating a simple Spark SQL app based on this post by Sandy: http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
But 'mvn package' throws this error:
error: object sql is not a member of package org.apache.spark
Any idea if I need to include another dependency? Thanks!
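The usual cause is that spark-sql ships as a separate Maven artifact from spark-core. A hedged sketch of the extra pom.xml dependency (the coordinates assume CDH 5.1's Spark 1.0 build from the Cloudera repository and may need adjusting to your exact version):

```xml
<!-- Hedged sketch: spark-sql is a separate artifact from spark-core.
     The version shown assumes CDH 5.1's Spark 1.0 build and may need
     adjusting to the parcel version actually in use. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.0.0-cdh5.1.0</version>
</dependency>
```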