Member since
10-24-2015
171
Posts
379
Kudos Received
23
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2622 | 06-26-2018 11:35 PM
 | 4335 | 06-12-2018 09:19 PM
 | 2869 | 02-01-2018 08:55 PM
 | 1432 | 01-02-2018 09:02 PM
 | 6729 | 09-06-2017 06:29 PM
03-22-2017
05:33 AM
1 Kudo
@Mateusz Grabowski, ideally other jobs running in a separate queue (for example, streaming) should not affect your Zeppelin processes. You can set a maximum-capacity limit on the q_apr_general queue to make sure that at least 60% of the resources stay reserved for the default queue (set yarn.scheduler.capacity.root.q_apr_general.maximum-capacity=40). Reference for capacity scheduler configuration: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_performance_tuning/content/section_create_configure_yarn_capacity_scheduler_queues.html
Regarding Spark SQL execution time, there have been a few reports of slow execution with Spark SQL via Zeppelin. The Apache JIRA tracking this issue is ZEPPELIN-323: https://community.hortonworks.com/questions/33484/spark-sql-query-execution-is-very-very-slow-when-c.html
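As a sketch, the setting above would go into capacity-scheduler.xml (the q_apr_general queue name comes from your setup; 40 is the value quoted above, leaving 60% for the default queue):

```xml
<!-- Cap q_apr_general at 40% of the cluster so other queues keep at least 60% -->
<property>
  <name>yarn.scheduler.capacity.root.q_apr_general.maximum-capacity</name>
  <value>40</value>
</property>
```

After changing the config, refresh the queues (e.g. via yarn rmadmin -refreshQueues) for it to take effect.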
03-22-2017
12:55 AM
1 Kudo
@Sree Kupp,
1. "Both the Spark Thrift Servers keep failing suddenly out of the blue. I am not sure if it is some configuration issue (like not having enough heap size, so even though it starts up when I start it, eventually it fails)." -- A cluster can have the Spark 1 and Spark 2 Thrift Servers running together. Are the Spark 1 and Spark 2 Thrift Servers deployed on the same host? Can you please check the error message for the Spark Thrift Server failure?
2. "Can I have both the Sparks running simultaneously? Or will that cause any memory overload on the cluster?" -- Yes, you can have both Spark versions running simultaneously. Regarding memory overload: if you are using yarn-client or yarn-cluster mode to run the Spark applications, they will not overload the client machine's memory.
3. "In the ODBC Driver DSN setup, when I click on the 'Test' option, sometimes it fails even when the Thrift Server is up and running. The error is: '[Hortonworks][Hardy] (34) Error from server: connect() failed: errno = 10061.'" -- I found a few good links for handling this issue; it seems many people have hit something similar. I hope this helps.
http://kb.tableau.com/articles/issue/error-connect-failed-hadoop-hive
https://community.hortonworks.com/questions/33046/hortonworks-hive-odbc-driver-dsn-setup.html
https://community.hortonworks.com/questions/10192/facing-issue-with-odbc-connection.html
03-19-2017
08:25 PM
4 Kudos
@shiremath, I found a few blogs that may help: "Fault Injection and Elastic Partitioning", "Hadoop code injection", and "distributed fault injection".
03-19-2017
08:03 PM
5 Kudos
@Ward Bekker, first find the correct Spark configuration to occupy the full cluster. You will need to tune the number of executors, executor cores, executor memory, driver memory, etc. References:
https://community.hortonworks.com/questions/56240/spark-num-executors-setting.html
http://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-memory
After figuring out the correct configs, you can use one of the approaches below to set up the Zeppelin and Livy interpreters.
1) Set SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh to specify the number of executors, executor cores, memory, driver memory, etc. (this config applies to all Spark and Livy interpreters):
export SPARK_SUBMIT_OPTIONS="--num-executors X --executor-cores Y --executor-memory Z"
2) Set the configs in the Livy interpreter. Open the Livy interpreter page and add the configs below:
livy.spark.executor.instances X
livy.spark.executor.cores Y
livy.spark.executor.memory Z
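For illustration, a filled-in zeppelin-env.sh entry might look like the sketch below. The numbers are placeholders, not recommendations -- size them for your cluster using the references above:

```shell
# Placeholder sizing: 10 executors, 4 cores and 8g of memory each.
# Replace with values derived from your cluster's node count and RAM.
export SPARK_SUBMIT_OPTIONS="--num-executors 10 --executor-cores 4 --executor-memory 8g"
echo "$SPARK_SUBMIT_OPTIONS"
```

Restart the interpreter after editing zeppelin-env.sh so the new options are picked up.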
03-17-2017
12:30 AM
1 Kudo
I found one tutorial for Azure VMs. There they use "ssh root@127.0.0.1 -p 2222". Can you try that? https://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/
03-16-2017
11:59 PM
1 Kudo
yes, that sounds right.
03-16-2017
10:54 PM
1 Kudo
@Suzanne Dimant, also make sure you SSH into the Docker container, not the VM. Refer to:
https://community.hortonworks.com/questions/68334/-bash-ambari-admin-password-reset-command-not-foun.html
https://community.hortonworks.com/questions/58247/hdp-25-sandboxvm-commandsscripts-are-not-found.html
03-16-2017
10:51 PM
1 Kudo
In order for ambari-agent-password-reset to work, the agent should be running fine. Can you please check the Ambari agent logs? You can find them at /var/log/ambari-agent. Let's check whether they contain any errors or exceptions.
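A quick way to surface errors is to grep the agent log. The sample log below is fabricated purely to demonstrate the pattern; on the sandbox the real file lives under /var/log/ambari-agent:

```shell
# Fabricated sample log, only to illustrate the grep pattern below.
LOG=/tmp/ambari-agent-sample.log
cat > "$LOG" <<'EOF'
INFO 2017-03-16 22:40:01 main.py:74 - Agent started
ERROR 2017-03-16 22:40:05 security.py:93 - Connection to server failed
EOF

# Pull errors and exceptions out of the agent log.
grep -E 'ERROR|Exception' "$LOG"
```

On the real system, point the grep at /var/log/ambari-agent/ambari-agent.log instead.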
03-16-2017
09:50 PM
1 Kudo
@Suzanne Dimant, from the output of ps -ef | grep ambari, it seems the Ambari server (pid=4772) and agent (pid=7473) are running. A few issues regarding Ambari login have been noticed in the HDP 2.5 sandbox. Please follow the HCC thread below:
https://community.hortonworks.com/questions/57064/hdp25-on-virtualbox-and-ambari-login-url.html
03-16-2017
09:09 PM
6 Kudos
@Faisal R Ahamed, you should use spark-submit to run this application. When running it, specify --master yarn and --deploy-mode cluster. Setting this in the Spark conf is too late to switch to yarn-cluster mode:
spark-submit --class <classname> --master yarn --deploy-mode cluster <jars> <args>
https://www.mail-archive.com/user@spark.apache.org/msg57869.html