About AsimShaikh

AsimShaikh · ‎11-25-2022

Hi @bulbcat

- The Spark version shipped with CDH 6.3.2 is Spark 2.4.0 [1].
- spark.yarn.priority only takes effect from Spark 3.0 onwards [2].

[1] https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_63_packaging.html
[2] https://github.com/apache/spark/pull/27856

If you want to update the priority of an application at runtime, you can use the command below:

yarn application -updatePriority 10 -appId application_xxxx_xx

Thanks!
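The choice described above can be sketched in Python. This is only an illustration: the helper names supports_yarn_priority_conf and update_priority_command are hypothetical, and the version check assumes a plain "major.minor.patch" version string. The first helper reports whether spark.yarn.priority can be used at submit time, and the second builds the yarn CLI call for changing the priority of an application that is already running.

```python
from typing import List


def supports_yarn_priority_conf(spark_version: str) -> bool:
    """Return True if spark.yarn.priority is honoured (Spark 3.0 and later)."""
    major = int(spark_version.split(".")[0])
    return major >= 3


def update_priority_command(app_id: str, priority: int) -> List[str]:
    """Build the yarn CLI call that changes the priority of a running application."""
    return ["yarn", "application", "-updatePriority", str(priority), "-appId", app_id]


def describe_options(spark_version: str, app_id: str, priority: int) -> str:
    """Summarise which of the two approaches applies for the given Spark version."""
    if supports_yarn_priority_conf(spark_version):
        return "submit with --conf spark.yarn.priority={}".format(priority)
    return "run: " + " ".join(update_priority_command(app_id, priority))
```

With the Spark 2.4.0 that ships in CDH 6.3.2, describe_options("2.4.0", "application_xxxx_xx", 10) falls through to the yarn command, which could then be passed to subprocess.run on a host where the yarn client is configured.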

AsimShaikh · ‎11-21-2022

Hi @Tellyou6bu6 If you are installing the trial version of CDP, it will use the embedded PostgreSQL database for the installation. From the documentation: "Installing a Trial Cluster: In this procedure, Cloudera Manager automates the installation of the Oracle JDK, Cloudera Manager Server, embedded PostgreSQL database, Cloudera Manager Agent, Runtime, and managed service software on cluster hosts. Cloudera Manager also configures databases for the Cloudera Manager Server and Hive Metastore and optionally for Cloudera Management Service roles." You can check the link below for a detailed explanation. https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/installation/topics/cdpdc-trial-installation.html Regards, Asim

AsimShaikh · ‎11-18-2022

@shamly Can you share the full stack trace?

AsimShaikh · ‎11-18-2022

@drgenious How did you upgrade Impala? You can run strace -s 2000 impala-shell -i hostname -k --ca-cert.pem -q "show databases" > impala_log to verify whether it is referring to the correct Impala version. Thanks!

AsimShaikh · ‎11-17-2022

@pankshiv1809 Was your question answered? If so, make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking the thumbs up button.

AsimShaikh · ‎11-16-2022

@pankshiv1809 You can review the blogs below on tuning Spark applications. Based on your case, you need to tune executor and driver memory and cores, along with the other parameters mentioned in the blogs. https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-1/ https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/ Thanks!

AsimShaikh · ‎11-02-2022

Hi @Bro

1] Are you able to fetch the logs using the command below?

yarn logs -applicationId <app_id> -appOwner <user>

2] Are you able to see the application ID in JHS or under Cloudera > YARN > Applications?

Also check the property yarn.resourcemanager.max-completed-applications.

Thanks!

AsimShaikh · ‎11-01-2022

Do you have sample code that you can share?

AsimShaikh · ‎10-31-2022

You may need to explicitly stop the SparkContext sc by calling sc.stop(). Calling sc.stop() is good practice because it lets the Spark master know that your application has finished consuming resources. If you don't call sc.stop(), the event log information used by the history server will be incomplete, and your application will not show up in the history server's UI.
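One way to make sure sc.stop() always runs, even when the job fails part-way, is to wrap the context in a small helper. The sketch below is an assumption-laden illustration, not code from the original thread: stopping and run_job are hypothetical names, and the app name and the sample computation are placeholders.

```python
from contextlib import contextmanager


@contextmanager
def stopping(ctx):
    """Yield ctx and always call ctx.stop() afterwards, even if the body raises."""
    try:
        yield ctx
    finally:
        ctx.stop()


def run_job():
    """Hypothetical job; needs pyspark installed and a reachable cluster or local mode."""
    # Imported here so the helper above can be used and tested without pyspark.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("stop-example")  # placeholder app name
    with stopping(SparkContext(conf=conf)) as sc:
        # Placeholder work; replace with the real job logic.
        print(sc.parallelize(range(100)).sum())
```

Calling run_job() stops the context on both the success and the failure path, so the event log is closed out properly in either case. A plain try/finally around the job body achieves the same thing.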

AsimShaikh · ‎10-30-2022

Hello @r4ndompuff Are you able to fetch the logs for this application from the command line? yarn logs -applicationId <app_id> -appOwner <user> A very large number of stored applications can be expected to cause this issue. In general, a large /tmp/logs (yarn.nodemanager.remote-app-log-dir) HDFS directory causes YARN log aggregation to time out. As for killing the application, this is most likely a code-level issue: check whether the sc.close() method is called at the correct place. Thanks!

Last Visited	‎06-30-2023 07:16 AM

Member Since	‎04-04-2022 04:39 AM
Posts	79
Kudos received	5

Cloudera Community

Re: write is slow in hdfs using pyspark

Re: particular nodemanagers nodes memory reaches >...

Re: Queue Manager best practices

Re: Version mapping between CDH and Apache communi...

Re: how to set the yarn application priority in hi...

Re: DATABASE SETTING AFTER ASSIGNING ROLES

Re: spark exception when reading a parquet file

Re: unexpected keyword argument 'ssl_version'

Re: Spark Submit - Spark Parameter Setting

Re: history task records disappear after hadoop re...

Re: Spark application in incomplete section of spa...
