Member since: 05-05-2016
Posts: 18
Kudos Received: 16
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1542 | 07-02-2016 03:31 PM
 | 13709 | 07-02-2016 01:07 PM
09-21-2018
06:00 AM
When a NiFi flow file contains a batch of JSON records separated by newlines, EvaluateJsonPath emits only the first record in the flow file and drops all the following ones. Splitting the file into individual per-record flow files is very inefficient. Is there any way to parse the JSON records in batch?
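For reference, a minimal Jackson-based sketch (outside NiFi; the field names are hypothetical) of what parsing the whole batch in one pass means here, assuming the flow file content is newline-delimited JSON:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class NdjsonBatch {
    public static void main(String[] args) throws Exception {
        // Two records separated by a newline, as described above.
        String flowFileContent = "{\"id\":1,\"name\":\"a\"}\n{\"id\":2,\"name\":\"b\"}";

        ObjectMapper mapper = new ObjectMapper();
        for (String line : flowFileContent.split("\n")) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            JsonNode record = mapper.readTree(line); // parse one record per line
            System.out.println(record.get("id") + " -> " + record.get("name"));
        }
    }
}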
10-26-2017
06:34 AM
The HDP versions are not listed when setting up a cluster with Ambari 2.5.2.0 on Ubuntu 16.04. All prerequisites have been completed on the machine.
07-29-2016
07:04 AM
1 Kudo
http://hortonworks.com/hadoop-tutorial/how-to-visualize-website-clickstream-data/
07-18-2016
01:11 PM
You can refer to https://community.hortonworks.com/questions/9790/orgapachehadoopipcstandbyexception.html for this issue.
07-18-2016
01:08 PM
The binaries downloaded by Ambari will certainly install HDP, so you need to delete them and put your custom binaries in their place before running the cluster setup. Thanks, Puneet
07-18-2016
08:09 AM
Hi Praveen, here are a few points to help:
1. Try running your application without options like "--driver-memory 15g --num-executors 25 --total-executor-cores 60 --executor-memory 15g --driver-cores 2" and check the logs for the memory allocated to RDDs/DataFrames.
2. The driver doesn't need 15g of memory if you are not collecting data on the driver. Try setting it to 4g instead. I hope you are not using .collect() or similar operations that pull all the data to the driver.
3. The error calls for fine-tuning your configuration between executor memory and driver memory. The total number of executors (25) is quite high considering the memory allocated to each (15g). Reduce the number of executors and consider allocating less memory (4g to start with), as in the sketch below.
Thanks, Puneet
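A minimal sketch of point 3 in Java, assuming YARN; the values (10 executors, 4g each) are illustrative starting points only, not verified numbers:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class TunedApp {
    public static void main(String[] args) {
        // Executor settings can be set here, before the context starts.
        // Driver memory, however, must be given at submit time
        // (e.g. spark-submit --driver-memory 4g), since the driver JVM
        // is already running by the time this code executes.
        SparkConf conf = new SparkConf()
                .setAppName("tuned-app")
                .set("spark.executor.instances", "10") // down from 25
                .set("spark.executor.memory", "4g")    // down from 15g
                .set("spark.executor.cores", "2");

        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... application logic ...
        sc.stop();
    }
}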
07-18-2016
05:12 AM
Gaurav, while installing the Hadoop stack through Ambari, the install packages are fetched from the Hortonworks repo, even if you install the Apache Ambari distribution. Check the repo URLs at: https://cwiki.apache.org/confluence/display/AMBARI/Install+Ambari+2.2.2+from+Public+Repositories
However, if you want to experiment with the installation, here are a few things that could help:
1. Ambari downloads the binaries to be installed to the "/var/lib/ambari-server/resources/stacks/HDP/2.4/services/" directory.
2. You can copy the required service binaries to "/var/lib/ambari-server/resources/stacks/HDP/2.4/services/" on the Ambari server machine and try running the installation.
NOTE: this approach is not verified, but worth a try.
Thanks, Puneet
07-18-2016
04:57 AM
Could you share more details, like the command used to execute the job and the input size?
07-05-2016
08:41 AM
1 Kudo
Could you please share a snippet that reproduces the issue?
07-05-2016
04:05 AM
1 Kudo
1. For the NULL issue, you need to map the columns between HBase and Hive. See the example at the link below:
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-ColumnMapping
2. The default value for the row key is the '\002' (Ctrl-B) character.
07-04-2016
05:03 AM
1 Kudo
Yes. A projection before any sort of transformation/action helps with both computation time and storage.
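A minimal sketch, assuming Spark 2.x and a hypothetical "events" table, of projecting early:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ProjectEarly {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("project-early").getOrCreate();

        // Select only the needed columns before doing anything else,
        // so later transformations shuffle and cache less data.
        Dataset<Row> events = spark.table("events");
        Dataset<Row> slim = events.select("user_id", "event_time")
                                  .filter("event_time >= '2016-01-01'");

        slim.show();
        spark.stop();
    }
}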
07-02-2016
03:31 PM
2 Kudos
Yes. Reducing the size of a dataset before a JOIN definitely helps, rather than the other way around.
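A minimal sketch, assuming Spark 2.x with hypothetical "orders" and "customers" tables, of shrinking both sides before the join:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class FilterBeforeJoin {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("filter-before-join").getOrCreate();

        Dataset<Row> orders = spark.table("orders");
        Dataset<Row> customers = spark.table("customers");

        // Filter and project both sides first, so the join shuffles less data.
        Dataset<Row> recentOrders = orders.filter("order_date >= '2016-01-01'")
                                          .select("customer_id", "amount");
        Dataset<Row> activeCustomers = customers.filter("active = true")
                                                .select("customer_id", "name");

        Dataset<Row> joined = recentOrders.join(activeCustomers, "customer_id");
        joined.show();
        spark.stop();
    }
}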
07-02-2016
03:29 PM
1 Kudo
There are scenarios (though bad practice) where data insertion requires the columns to be in lexicographic order, e.g. when inserting data into a DB over a JDBC connection. Not sure if jestin ma is facing a similar issue.
07-02-2016
01:07 PM
2 Kudos
@Jestin: Why do you need to sort the columns of a DataFrame? Could you please elaborate? In Java there is no built-in function to reorder columns in place, though a select with the columns in the desired order can serve as a workaround (see the sketch below).
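A minimal sketch of that workaround, assuming Spark 2.x; the table name is hypothetical:

import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReorderColumns {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("reorder-columns").getOrCreate();
        Dataset<Row> df = spark.table("some_table"); // hypothetical table

        // Sort the column names lexicographically, then project in that order.
        String[] cols = df.columns();
        Arrays.sort(cols);
        Dataset<Row> reordered = df.select(cols[0],
                Arrays.copyOfRange(cols, 1, cols.length));

        reordered.show();
        spark.stop();
    }
}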
07-02-2016
12:48 PM
2 Kudos
Answer 2: You can use the HBaseStorageHandler for Hive-HBase integration. Please refer to: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

-- Hive table backed by the HBase table "xyz": the Hive "key" column maps to
-- the HBase row key and "value" maps to the cf1:val column.
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz", "hbase.mapred.output.outputtable" = "xyz");
05-14-2016
03:17 PM
1 Kudo
Yes, relying on the Spark logs is one solution, but it takes away the freedom to log custom messages. What I am expecting is something like SparkContext.getLogger().info("message") that is lazily evaluated when the action is finally called.
05-14-2016
03:15 PM
1 Kudo
Thanks for the input. Yes, that is a solution, but I don't want to call any action, as I mentioned. What I am expecting is something like SparkContext.getLogger().info("message") that is lazily evaluated when the action is finally called.
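As far as I know, no such lazy logger exists. One workaround, sketched below under the assumption of Spark 2.x (not the API I was asking for), is to put the log call inside a transformation such as mapPartitions, so it executes on the executors only when an action finally runs:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.log4j.Logger;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LazyLogDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("lazy-log-demo").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<Integer> data = sc.parallelize(Arrays.asList(1, 2, 3, 4));

        // The log call lives inside the transformation, so it runs lazily:
        // nothing is logged until an action evaluates the RDD.
        JavaRDD<Integer> doubled = data.mapPartitions(it -> {
            // Obtain the logger inside the lambda to avoid serializing it.
            Logger.getLogger("LazyLogDemo").info("processing a partition");
            List<Integer> out = new ArrayList<>();
            while (it.hasNext()) out.add(it.next() * 2);
            return out.iterator();
        });

        doubled.count(); // the action; the messages appear now, in the executor logs
        sc.stop();
    }
}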
05-13-2016
01:50 PM
3 Kudos
I am trying to capture logs for my application before and after a Spark transformation statement. Because evaluation is lazy, the logs get printed before the transformation is actually evaluated. Is there a way to capture logs without calling a Spark action in the log statements, avoiding unnecessary CPU consumption?