Member since: 10-09-2015
Posts: 76
Kudos Received: 33
Solutions: 11
My Accepted Solutions

Title | Views | Posted
---|---|---
 | 4924 | 03-09-2017 09:08 PM
 | 5261 | 02-23-2017 08:01 AM
 | 1696 | 02-21-2017 03:04 AM
 | 2049 | 02-16-2017 08:00 AM
 | 1080 | 01-26-2017 06:32 PM
12-12-2016
10:10 PM
1 Kudo
Your standby RM (rm1) must be the first RM in the configured list of RMs, so it is tried first and that results in those exceptions; the client then fails over to the active RM.
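If you want to verify the ordering, here is a minimal sketch, assuming yarn-site.xml lives at /etc/hadoop/conf/yarn-site.xml (adjust the path for your install), that prints the configured RM list:

```python
# Minimal sketch: print the configured RM ordering from yarn-site.xml.
import xml.etree.ElementTree as ET

YARN_SITE = "/etc/hadoop/conf/yarn-site.xml"  # path is an assumption

def get_property(name, path=YARN_SITE):
    """Return the value of a Hadoop configuration property, or None if absent."""
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# The RM listed first here is the one clients try first; a standby in the
# first slot produces the (recoverable) connection exceptions before failover.
print(get_property("yarn.resourcemanager.ha.rm-ids"))
```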
12-10-2016
11:21 PM
1 Kudo
There are three possibilities: 1) The Tez application did not start, in which case you will not find any YARN application for it. 2) The Tez application started but did not receive any DAG to run and timed out. 3) The Tez application started and crashed unexpectedly. In the last two cases you will find the exact reason in the YARN application master log for this job.
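If it helps, here is a rough sketch for pulling that application master log, assuming the yarn CLI is on the PATH (the application id shown is hypothetical; take the real one from the RM UI or `yarn application -list -appStates ALL`):

```python
# Rough sketch: fetch aggregated YARN logs for an application and scan them.
import subprocess

def fetch_am_log(app_id):
    """Return the aggregated YARN logs for an application as text."""
    result = subprocess.run(
        ["yarn", "logs", "-applicationId", app_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

log_text = fetch_am_log("application_1481234567890_0042")  # hypothetical id

# Scan for the usual suspects around DAG submission and AM shutdown.
for line in log_text.splitlines():
    if "DAG" in line or "Exception" in line or "shutdown" in line.lower():
        print(line)
```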
12-10-2016
01:45 AM
Yes. If the max capacity for a queue is 50%, then it will not be allocated more than 50% of resources even if the cluster is idle. Obviously that can waste free capacity, which is why max capacity is often set to 100%, and preemption then becomes important for handing resources back to other queues in a timely manner. Your configs look OK at first glance, but you should check here and here about the configs. You may have to play around with them before you get the desired response times for your preemption. If you do not see preemption happening even though it is properly configured, then you should open a support case (in case you have a support relationship).
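For reference, a minimal sketch of the properties that usually control how quickly capacity-scheduler preemption reacts; they normally go into yarn-site.xml (or are set through Ambari), and the values below are only illustrative starting points:

```python
# Sketch of the knobs that typically control capacity-scheduler preemption.
preemption_settings = {
    # Turn the preemption monitor on.
    "yarn.resourcemanager.scheduler.monitor.enable": "true",
    "yarn.resourcemanager.scheduler.monitor.policies":
        "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
        "ProportionalCapacityPreemptionPolicy",
    # How often the monitor looks for imbalance (ms).
    "yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval": "3000",
    # How long a container is asked to give back resources before it is killed (ms).
    "yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill": "15000",
    # Fraction of the cluster that may be preempted in one round.
    "yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round": "0.1",
}

for name, value in preemption_settings.items():
    print(f"{name} = {value}")
```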
12-09-2016
10:53 PM
There are multiple things that may be at play here, given your HDP version. To be clear: you have set up a YARN queue for Spark with 50% capacity, but Spark jobs can take up more than that (up to 100%), and since these are long-running executors the cluster is locked up until the job finishes. Is that correct? If yes, then let's see if the following helps. This might be verbose, to help other users (in case you already know about these things :)).

1) YARN schedulers, fair and capacity, will allow jobs to go up to max capacity if resources are available. Given that your Spark queue is configured with max=100%, this is allowed, which explains why Spark jobs can take over your cluster. The difference between fair and capacity is that for concurrent jobs asking for resources at the same time, capacity will prefer the first job while fair will share across all jobs. However, if a job has already taken over the cluster, neither scheduler can give other queues resources until the job itself returns them, assuming preemption is not enabled.

2) YARN schedulers, fair and capacity, support cross-queue preemption. So if queue 2 is over its capacity and queue 1 needs resources, then resources between max-capacity and capacity will be preempted from queue 2 and given to queue 1. Have you enabled preemption in the scheduler for your queues? That should trigger preemption of excess capacity from the Spark queue to other queues when needed. IIRC, this is how it should behave regardless of fair vs capacity scheduling when new small jobs come in after an existing job has taken over the cluster. Perhaps you could compare your previous fair settings with your new capacity settings to check whether preemption was enabled in the former but not in the latter.
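To make the capacity vs max-capacity point concrete, here is a sketch of the capacity-scheduler.xml properties being discussed, assuming a hypothetical queue layout with a "spark" queue next to "default"; the values are illustrative only:

```python
# Sketch of capacity-scheduler.xml queue settings (hypothetical queue names).
queue_settings = {
    # Guaranteed share of the cluster for each queue.
    "yarn.scheduler.capacity.root.spark.capacity": "50",
    "yarn.scheduler.capacity.root.default.capacity": "50",
    # How far each queue may grow when the cluster is otherwise idle.
    # max=100 lets Spark take the whole cluster; preemption (or a lower
    # maximum-capacity) is what hands resources back to other queues later.
    "yarn.scheduler.capacity.root.spark.maximum-capacity": "100",
    "yarn.scheduler.capacity.root.default.maximum-capacity": "100",
}

for name, value in queue_settings.items():
    print(f"{name} = {value}")
```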
12-09-2016
10:31 PM
No, I don't think Spark will uncache a different data set when a new one is cached. How are you going to load balance or fail over from one STS to another?
12-09-2016
02:50 AM
Don't forget to uncache the old data 🙂 Also, each STS has its own SparkContext, which will be lost if that STS is lost. So there is currently no way to keep the cache inside an STS available if that STS goes down. Having 2 identical STS instances with identical caches is possibly the only solution, assuming your cache creation code is consistent.
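A rough sketch of keeping the two caches in sync, using PyHive as one possible HiveServer2/Thrift client (host names, port, username and table name are hypothetical; a kerberized cluster would need extra connection arguments):

```python
# Rough sketch: drop the old cached copy and cache the new data on every STS.
from pyhive import hive

STS_HOSTS = ["sts-1.example.com", "sts-2.example.com"]  # hypothetical hosts
STS_PORT = 10015  # check the port your STS instances actually listen on

def recache(table):
    """Refresh the cached table on every STS instance so both stay identical."""
    for host in STS_HOSTS:
        conn = hive.connect(host=host, port=STS_PORT, username="spark")
        cursor = conn.cursor()
        cursor.execute("UNCACHE TABLE {t}".format(t=table))
        cursor.execute("CACHE TABLE {t}".format(t=table))
        conn.close()

recache("sales_snapshot")  # hypothetical table name
```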
12-09-2016
02:47 AM
To confirm: the issue was that the HBase conf was not available to Spark. You can also check the Spark HBase Connector we support at https://github.com/hortonworks-spark/shc. It has many features and also documents the configuration for Spark-HBase access, including the security aspects.
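For orientation, a rough PySpark read via SHC, adapted from the shc README; the exact catalog schema, data source name and package coordinates should be double-checked against the repo, and the table/column names here are hypothetical:

```python
# Rough sketch of reading an HBase table through SHC from PySpark.
import json
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("shc-read-sketch").getOrCreate()

# Catalog mapping HBase column families/qualifiers to DataFrame columns.
catalog = json.dumps({
    "table": {"namespace": "default", "name": "contacts"},  # hypothetical table
    "rowkey": "key",
    "columns": {
        "id":   {"cf": "rowkey", "col": "key",  "type": "string"},
        "name": {"cf": "info",   "col": "name", "type": "string"},
    },
})

# hbase-site.xml still needs to reach the Spark classpath (e.g. via --files),
# which was the root cause in this thread.
df = (spark.read
      .options(catalog=catalog)
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load())
df.show()
```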
12-08-2016
08:19 PM
1 Kudo
You need to set the new Python location via the env variable PYSPARK_PYTHON.
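The simplest route is exporting PYSPARK_PYTHON in spark-env.sh, but as a minimal in-code sketch (the interpreter path is hypothetical and must exist on every node):

```python
# Minimal sketch: point PySpark at an alternative Python interpreter.
import os
from pyspark.sql import SparkSession

PYTHON = "/opt/anaconda/bin/python"  # hypothetical path, present on all nodes

# Driver side: must be set before the SparkContext is created.
os.environ["PYSPARK_PYTHON"] = PYTHON

spark = (SparkSession.builder
         .appName("custom-python-sketch")
         # Propagate the same interpreter to executors and, on YARN, the AM.
         .config("spark.executorEnv.PYSPARK_PYTHON", PYTHON)
         .config("spark.yarn.appMasterEnv.PYSPARK_PYTHON", PYTHON)
         .getOrCreate())

print(spark.sparkContext.pythonVer)
```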
12-07-2016
10:31 PM
1 Kudo
Spark Thrift Server (STS) runs its SparkContext within the STS JVM daemon. That SparkContext is not available to external clients (with or without spark-submit). The only way to access that SparkContext is via a JDBC connection to STS. After your external processing has completed, you could submit a cache refresh operation to your STS.
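A small sketch of such a refresh submitted over the STS HiveServer2/Thrift interface, here via PyHive as one possible client (host, port, username and table name are hypothetical):

```python
# Sketch: refresh the STS cache after an external job has updated the data.
from pyhive import hive

conn = hive.connect(host="sts.example.com", port=10015, username="spark")
cursor = conn.cursor()

# Pick up the data written by the external job, then re-populate the cache.
cursor.execute("REFRESH TABLE events")
cursor.execute("UNCACHE TABLE events")
cursor.execute("CACHE TABLE events")
conn.close()
```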
10-14-2016
10:31 AM
Yes. It means encrypting all network transfers within the Spark job. There are no other avenues for wire encryption within Spark. Starting with Spark 2.0, enabling wire encryption also enables HTTPS on the history server UI for browsing historical job data.
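For reference, a sketch of the spark-defaults.conf style properties usually involved in wire encryption around the Spark 1.6/2.0 timeframe; confirm the exact set against the Spark security docs for your version, and note that the keystore path and password below are placeholders:

```python
# Sketch of typical Spark wire-encryption settings (values are placeholders).
wire_encryption = {
    # Shuffle/RPC traffic: SASL-based encryption requires authentication.
    "spark.authenticate": "true",
    "spark.authenticate.enableSaslEncryption": "true",
    "spark.network.sasl.serverAlwaysEncrypt": "true",
    # SSL for the web UIs, including the history server.
    "spark.ssl.enabled": "true",
    "spark.ssl.keyStore": "/etc/security/keystore.jks",  # placeholder path
    "spark.ssl.keyStorePassword": "changeit",            # placeholder secret
}

for name, value in wire_encryption.items():
    print(f"{name} = {value}")
```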