Member since: 05-02-2016
Posts: 74
Kudos Received: 41
Solutions: 14
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3780 | 07-11-2018 01:40 PM |
| | 7520 | 01-05-2017 02:43 PM |
| | 1694 | 12-20-2016 01:17 PM |
| | 1569 | 12-02-2016 07:19 PM |
| | 2383 | 10-06-2016 01:29 PM |
03-06-2020
06:06 PM
@sri_man
Since this thread was marked 'Solved' back in 2016, you would have a better chance of receiving a relevant response by posting a new question. A new question also gives you the opportunity to include details specific to your environment, which could help other members provide an answer tailored to your issue.
04-19-2017
11:52 AM
These checklists help ensure that you avoid unexpected problems when adding new hosts to an HDP cluster.
Preparation checklist (what to do before adding the new hosts via the Ambari Add Host wizard):
Have you followed the guidance in the Ambari installation documentation? It is recommended to follow the instructions for the HDP version that is installed on the cluster (see http://docs.hortonworks.com).
Do your file system partitioning scheme, storage options, memory, network, and power configurations match those used for the other nodes in the cluster? Either follow the pattern already established on the cluster or follow the recommendations on docs.hortonworks.com. Compare and verify that settings and configurations match, including THP, SELinux, SSH keys, user integration, LDAP/AD setup, SSSD integration, and drive layouts.
Have you verified that JAVA_HOME is set properly, with the same Java version across the cluster? Mismatched JDK versions across nodes may cause failures and should be remediated.
Do the new nodes have the same versions of Linux, yum, rpm, scp, Python, and curl as the existing cluster nodes?
Do all the /etc/hosts files match and reflect both the newly added and the existing nodes? Check DNS across the nodes to confirm that host names resolve properly.
Have you enabled NTP on all the newly added nodes and made sure they are in sync with the other nodes?
Have you confirmed that iptables is updated or turned off? If you have iptables or other firewall rules that restrict ports, make sure the required ports are open for the new hosts.
Have you confirmed that the user Ambari runs as can reach the new nodes via passwordless SSH from the Ambari server?
Have you verified that the Ambari agents on the new nodes match the ambari-agent version on the existing nodes?
If using SSL (for HDFS, YARN, etc.), have you installed certificates on the new nodes?
If the new nodes are being repurposed from another cluster, make sure you remove all previous installs; reformatting the nodes is a good option.
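Several of the checks above come down to comparing what is installed on a new node against an existing node. The following is a minimal Python sketch of that comparison, not part of Ambari or any HDP tooling. The function name `find_mismatches`, the host names, and the version strings are all hypothetical; in practice the facts would be collected from each host rather than typed in by hand.

```python
def find_mismatches(reference, candidates):
    """Compare each candidate host's facts against the reference host's facts.

    Returns {host: {fact: (expected, actual)}} containing only the differences.
    """
    mismatches = {}
    for host, facts in candidates.items():
        diffs = {
            key: (expected, facts.get(key))
            for key, expected in reference.items()
            if facts.get(key) != expected
        }
        if diffs:
            mismatches[host] = diffs
    return mismatches


# Hypothetical, hand-entered facts for illustration only.
existing_node = {"java": "1.8.0_112", "python": "2.7.5", "os": "CentOS 7.3"}
new_nodes = {
    "new-node-1.example.com": {"java": "1.8.0_112", "python": "2.7.5", "os": "CentOS 7.3"},
    "new-node-2.example.com": {"java": "1.7.0_79", "python": "2.7.5", "os": "CentOS 7.3"},
}

for host, diffs in find_mismatches(existing_node, new_nodes).items():
    for fact, (expected, actual) in diffs.items():
        print(f"{host}: {fact} is {actual}, expected {expected}")
```

With the sample data, only the second new node is reported, because its Java version differs from the existing node.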
Post node addition checklist:
When adding the new hosts to the cluster (see “Adding Hosts to a Cluster” in the Ambari User Guide), have you set the rack properly for the new nodes?
Do you have a strategy in place for rebalancing the cluster that accounts for cluster load? Consider rebalancing during quiet periods, or throttling the rebalancing bandwidth if the cluster load will be high while it runs.
Have you run smoke tests that start an Application Master on one of the new nodes?
01-23-2018
07:00 PM
@clukasik In the last line of the script, you have alertHost s `hostname`
I want to send the FQDN of the host, but if I use alertHost s `hostname -f`
the monitoring service I am sending the trap to never receives it. Why is this? Do you have any recommendations for how I can send the FQDN?
08-09-2016
07:35 AM
Hi Pierre, We would need to look at the code. Can you do a persist just before stage 63 and before stage 65, then check the Storage and Executors tabs in the Spark UI for data skew? If there is data skew, you will need to add a salt to your key. You could also create a DataFrame from the RDD with rdd.toDF() and apply a UDF to it. DataFrames manage memory more efficiently. Best, Amit
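To illustrate what adding a salt to a key means, here is a minimal sketch in plain Python rather than Spark, so it does not use any Spark API. The function name `salted_counts`, the sample keys, and the number of salts are hypothetical. The idea is to aggregate first on (key, salt), so a hot key is spread over several groups, and then to combine the partial results per original key.

```python
import random
from collections import Counter


def salted_counts(keys, num_salts, seed=0):
    """Count keys in two stages, using a random salt to split up hot keys."""
    rng = random.Random(seed)
    # Stage 1: aggregate on (key, salt); a hot key is spread over up to num_salts groups.
    partial = Counter((key, rng.randrange(num_salts)) for key in keys)
    # Stage 2: drop the salt and combine the partial counts per original key.
    totals = Counter()
    for (key, _salt), count in partial.items():
        totals[key] += count
    return partial, totals


# Hypothetical skewed data: one key dominates.
keys = ["hot"] * 1000 + ["cold"] * 10
partial, totals = salted_counts(keys, num_salts=4)

print("largest salted group:", max(partial.values()))
print("final totals:", dict(totals))
```

The final totals match an unsalted count, but no single group in the first stage has to hold all of the records for the hot key.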
02-13-2017
02:36 PM
Good point. For my Sandbox testing, I decided to just use the steps provided in http://stackoverflow.com/questions/40550011/zeppelin-how-to-restart-sparkcontext-in-zeppelin to stop the SparkContext when I need to do something outside of Zeppelin. Not ideal, but it works well enough for some multi-framework prototyping I'm doing.
07-25-2016
07:27 PM
I am running this on the HDP Sandbox VM. I changed zeppelin.server.addr to sandbox.hortonworks.com, which is the /etc/hosts entry that points to 127.0.0.1 on my machine, and this resolved the issue in Chrome.
05-04-2016
06:46 PM
I suppose the issue is with loading the data. Try creating an external table instead:
create EXTERNAL table tweets
....
row format serde 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/tmp/tweets_staging/';