Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3171 | 12-25-2018 10:42 PM |
|  | 14192 | 10-09-2018 03:52 AM |
|  | 4763 | 02-23-2018 11:46 PM |
|  | 2481 | 09-02-2017 01:49 AM |
|  | 2911 | 06-21-2017 12:06 AM |
04-15-2016
05:32 AM
1 Kudo
Hi Anandha, (1) The same guide applies to upgrades to 2.4. (2) Just set hive.server2.support.dynamic.service.discovery to true in Ambari. You can try without it; then, on the page where you decide between Rolling and Express upgrade, Ambari will tell you whether you can proceed or not.
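If it helps, this is roughly how the setting looks once applied (it lives in hive-site, which you reach through the Hive configs in Ambari; exact placement in the UI may differ by version):

```
hive.server2.support.dynamic.service.discovery=true
```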
04-15-2016
04:46 AM
I'm not sure what exactly you want to do. What do you mean by "related parameters"? If you want to save all cluster configuration properties before the upgrade, you can extract a blueprint:

curl -u admin:admin -H "X-Requested-By: ambari" http://<ambari-server-fqdn>:8080/api/v1/clusters/:clusterName?format=blueprint

and convert its JSON output to CSV and import it into Excel. You can do the same after the upgrade and compare the output; some people do that. However, you can also check all config property changes using the "History" feature of Ambari by comparing config versions before and after the upgrade for each service. As for the upgrade itself, just follow the Ambari Upgrade guide (and don't skip any step!).
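As a minimal sketch of the before/after comparison (the hostname, cluster name, and admin credentials below are placeholders; substitute your own):

```bash
# snapshot the cluster configuration as a blueprint before the upgrade
curl -u admin:admin -H "X-Requested-By: ambari" \
  "http://ambari.example.com:8080/api/v1/clusters/MyCluster?format=blueprint" \
  -o blueprint-before.json

# ... perform the upgrade ...

# snapshot again afterwards and compare the two files
curl -u admin:admin -H "X-Requested-By: ambari" \
  "http://ambari.example.com:8080/api/v1/clusters/MyCluster?format=blueprint" \
  -o blueprint-after.json
diff blueprint-before.json blueprint-after.json
```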
04-15-2016
03:46 AM
1 Kudo
This error means that something is wrong with your Derby database. Can you check whether it was created? The path to the Derby files is given by "Oozie Data Dir" (oozie_data_dir) in Ambari->Oozie->Oozie Server, and by OOZIE_DATA in oozie-env.sh. Check the permissions on that path and retry. You can also try to create the DB manually if ooziedb.bat is available, but it's better to go through Ambari.
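A quick sanity check on the Oozie host might look like this (the path below is the usual Ambari default, so substitute the actual oozie_data_dir value from your cluster; the oozie-db subdirectory name is an assumption based on the default Derby setup):

```bash
# confirm the data dir exists and is owned/writable by the oozie user
ls -ld /hadoop/oozie/data

# the Derby database is normally created as a subdirectory under it
ls -l /hadoop/oozie/data/oozie-db 2>/dev/null || echo "Derby DB not created yet"
```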
04-15-2016
02:36 AM
Okay, great, yes, the error was about "no channel configured". Regarding the path in HDFS, I edited my answer to include the full path, including the NameNode: hdfs://sandbox.hortonworks.com:8020/user/Revathy/Flume/%y-%m-%d/%H%M/%S. It's good to organize your folders in HDFS in some way; here I put it under your home directory in HDFS. How does one know that the Flume agent works? Well, if it keeps on running, there are no errors in the logs, and the data written to the sinks is as expected. You can find a lot of details here. You can also run Flume from Ambari, in which case Ambari will let you know whether the Flume process is healthy and running. However, one still has to inspect the sinks to be sure.
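For example, to spot-check the sink and the agent from the sandbox shell (the log path below is an assumption; the exact file name depends on how the agent was started):

```bash
# list what the HDFS sink has written so far
hdfs dfs -ls -R /user/Revathy/Flume

# watch the agent's log for errors while events flow
tail -f /var/log/flume/flume.log
```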
04-14-2016
03:31 PM
1 Kudo
It's not recommended to change banned.users and allowed.system.users, for security reasons. It's always a good idea to run Yarn jobs as an end user. It's like when you have real users on the cluster: you create their accounts and let them log in and run their apps. The yarn user is used to manage Yarn, for example by running "yarn rmadmin" and other such commands. If you nevertheless want to try, the only way is to edit the cfg.j2 file located at /var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2.
04-14-2016
02:45 PM
1 Kudo
Hi @Ran Postar You can reduce "Minimum user ID for submitting job" (min_user_id) in yarn-env in Ambari->Yarn from the default 1000 to a smaller value, for example 500. The value is referenced as min.user.id={{min_user_id}} in container-executor.cfg.j2, so it should work.
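For reference, the relevant part of that template looks roughly like this (an excerpt from memory, so the exact contents vary by Ambari/HDP version; the banned.users list shown is the usual default rather than something confirmed here):

```
yarn.nodemanager.linux-container-executor.group={{yarn_executor_container_group}}
banned.users=hdfs,yarn,mapred,bin
min.user.id={{min_user_id}}
```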
04-14-2016
09:34 AM
Check the following two lines in your sink block:

source_agent.sinks.avro_sink.hdfs.filetype = Datastream
source_agent.sinks.avro_sink.hdfs.a1.sinks.k2.hdfs.path = /Revathy/Flume/%y-%m-%d/%H%M/%S

In the first one the capitalization is wrong, and in the second one the property name on the left-hand side is incorrect. Change them to the following and retry:

source_agent.sinks.avro_sink.hdfs.fileType = DataStream
source_agent.sinks.avro_sink.hdfs.path = hdfs://sandbox.hortonworks.com:8020/user/Revathy/Flume/%y-%m-%d/%H%M/%S
04-14-2016
02:12 AM
1 Kudo
Hi Adi, the threshold means that the utilization of storage on each node after balancing will be (ACU +- threshold), where I use ACU to denote "average cluster utilization". Example:

(1) Before adding new nodes: Let's say you have 10 data nodes, each with a capacity of 20T, and your data size is 100T. In this case ACU=50%, and if all nodes are perfectly balanced, each stores 10T of data.

(2) After: Let's say you add 4 large nodes, each with a capacity of 50T, and you still have 100T of data. Your total capacity is now doubled to 400T, and therefore ACU=25%. However, your new nodes are empty.

Running the balancer with threshold th=10% will ensure that the utilization of all nodes is between ACU-th and ACU+th, in this case between 15% and 35%. We are starting with old nodes at 50% and new nodes at 0% utilization. The balancer will keep on moving data until the old nodes' utilization is <= 35% and the new nodes' utilization is >= 15%, which means old nodes keeping less than 20*0.35=7T and new nodes keeping more than 50*0.15=7.5T. As you can see, in this particular case the data-per-node amounts are not so far away from each other, but as you keep adding more data the differences will grow little by little. If you are interested in more details about the balancer, please refer to HADOOP-1652 and the Balancer design document.
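As a minimal sketch, this is how you would check per-node utilization and run the balancer with that threshold (assuming an HDFS client is configured on the node where you run it):

```bash
# show capacity and DFS Used% for every DataNode
hdfs dfsadmin -report

# rebalance so every node ends up within +/-10 percentage points of the cluster average
hdfs balancer -threshold 10
```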
04-13-2016
10:47 AM
Are you on CentOS/RHEL-7? I recently did two upgrades, one using Ambari-2.2.0 and one using 2.2.1.1 (both from HDP-2.2.x), and I had no issues. However, both were on RHEL-6.x.
04-13-2016
01:50 AM
Well, I thought my answer of Apr. 5 ... 🙂