About afernandez

afernandez · ‎05-03-2017

Problem: There's a known bug in Ambari 2.4 and 2.5 that causes "ambari-server upgrade" to fail if the agent RPM is not upgraded first. Example stack trace:

    Using python /usr/bin/python
    Setup ambari-server
    Traceback (most recent call last):
      File "/usr/sbin/ambari-server.py", line 33, in <module>
        from ambari_server.dbConfiguration import DATABASE_NAMES, LINUX_DBMS_KEYS_LIST
      File "/usr/lib/python2.6/site-packages/ambari_server/dbConfiguration.py", line 28, in <module>
        from ambari_server.serverConfiguration import decrypt_password_for_alias, get_ambari_properties, get_is_secure, \
      File "/usr/lib/python2.6/site-packages/ambari_server/serverConfiguration.py", line 36, in <module>
        from ambari_commons.os_utils import run_os_command, search_file, set_file_permissions, parse_log4j_file
    ImportError: cannot import name parse_log4j_file

Cause: This occurs because os_utils.py and other Python files inside /usr/lib/ambari-agent/lib/ambari_commons are upgraded by the agent's RPM and are used by the server's scripts to find which database to use.

Solution: Note: Always back up your Ambari database before the upgrade. If ambari-agent is also present on the Ambari Server host, run "yum upgrade ambari-agent" (or equivalent for your OS).
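A quick way to check for this condition before running the upgrade is to ask Python whether the module can supply the names the server scripts import. This is only a minimal sketch, not part of Ambari: the helper name missing_names is hypothetical, and it assumes an interpreter that ships importlib (Python 2.7 or later, whereas the trace above shows Python 2.6) with ambari_commons on its path.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not define.

    Hypothetical helper for illustration; it is not part of Ambari.
    If the module itself cannot be imported, every name is reported missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(module, name)]
```

On the Ambari Server host, a non-empty result from missing_names("ambari_commons.os_utils", ["parse_log4j_file"]) would suggest that the agent RPM still needs to be upgraded.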

afernandez · ‎01-31-2017

Whether you're creating an Ambari cluster from scratch, taking over an existing cluster, or growing your cluster over time, it is imperative to tune Ambari and MySQL to work at a large scale of 1000-3000 Ambari Agents.

Ambari Server Configs

First, increase the memory used by Ambari. For large clusters, 8 GB of memory should be sufficient. If you have more than 10 concurrent users, increase it to 16 GB.

Edit /var/lib/ambari-server/ambari-env.sh and change the -Xmx setting.

    export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms2048m -Xmx8192m'

Edit /etc/ambari-server/conf/ambari.properties with the following configs.

    # The size of the Jetty connection pool used for handling incoming Ambari Agent requests.
    # 10 hosts => 25
    # 50 hosts => 35
    # 100 hosts => 75
    # 500 hosts => 100
    agent.threadpool.size.max=100

    # Determines whether current alerts should be cached.
    # Enabling this can increase performance on large clusters, but can also result in lost alert data
    # if the cache is not flushed frequently.
    alerts.cache.enabled=true

    # The size of the alert cache.
    # Less than 50 hosts => 50000
    # More than 50 hosts => 100000
    alerts.cache.size=100000

    # The number of threads used to handle alerts received from the Ambari Agents.
    # The value should be increased as the size of the cluster increases.
    # Less than 50 hosts => 2
    # More than 50 hosts => 4
    alerts.execution.scheduler.maxThreads=4

After making these changes, restart Ambari Server.

Move an existing Ambari DB from a spinning disk to an SSD

A Solid State Drive is strongly recommended for the Ambari database, since it will be much faster. Check the throughput of the disk that holds Ambari's database (Postgres, MySQL, MariaDB, or Oracle). Ideally, it should be a Solid State Drive or support at least 200 IOPS, and be either on the same host as Ambari or only 1-2 hops away.

    Type  Details               IOPS     Throughput
    HDD   10,000 rpm SAS drive  175-210  100 MB/s
    SSD   solid-state           500+     500+ MB/s

1. Stop Ambari Server.

    ambari-server stop

2. Take a backup of the Ambari database.

    mysqldump -u root ambari > /tmp/ambari.sql

3. Stop MySQL server, copy its data, and change the directory.

    service mysqld stop
    cp -R -p /var/lib/mysql /mnt/disks/ssd/mysql
    cat /etc/my.cnf
    sed -i.bak -e 's/\/var\/lib\/mysql/\/mnt\/disks\/ssd\/mysql/g' /etc/my.cnf

4. Create a symlink for the sock file and start MySQL.

    ln -s /mnt/disks/ssd/mysql/mysql.sock /var/lib/mysql/mysql.sock
    service mysqld start

5. Ensure the Ambari DB is accessible.

    mysql -u root -p
    show databases;
    use ambari;
    show tables;
    select count(*) from hosts;

MySQL Optimizations

First and foremost, if you're on an older version of MySQL, consider upgrading to MySQL 5.6 or 5.7, which include many performance improvements.

Connect to the MySQL DB and inspect these variables. E.g.,

    SHOW VARIABLES LIKE 'name';

These suggested values assume that the Ambari database is the only one on the MySQL Server. If you have other databases on the same MySQL Server, increase the current settings by these values.

WARNING: Never stop MySQL server while Ambari Server is running.

    Variable                                          Suggested Value
    innodb_log_buffer_size                            512M
    innodb_buffer_pool_size                           16G
    innodb_file_io_threads (deprecated in MySQL 5.5)  16
    innodb_log_file_size                              5M
    innodb_thread_concurrency                         32
    join_buffer_size                                  512M
    key_buffer_size                                   16G
    max_connections                                   500
    max_allowed_packet                                1024M
    max_heap_table_size                               64M
    query_cache_limit                                 16M
    query_cache_size                                  512M
    read_rnd_buffer_size                              128M
    sort_buffer_size                                  128M
    table_open_cache                                  1024
    thread_cache_size                                 128
    thread_stack                                      256K

To change these values:

1. Stop MySQL: service mysqld stop
2. Edit the configs in /etc/my.cnf, under the "[mysqld]" section (note that the file may be in a different location).
3. Start MySQL: service mysqld start
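The sizing comments in the ambari.properties excerpt above can be read as a small lookup table. The sketch below is one possible reading, not an official Ambari tool: the function name suggest_ambari_properties is hypothetical, and it assumes that each tier applies up to and including its host count and that a cluster of exactly 50 hosts counts as the smaller size.

```python
# Tiers copied from the comments in the ambari.properties excerpt above.
AGENT_THREADPOOL_TIERS = [(10, 25), (50, 35), (100, 75), (500, 100)]


def suggest_ambari_properties(host_count):
    """Map a cluster size to the property values suggested in the article.

    Assumption: each tier applies up to and including its host count, and
    clusters larger than the last tier use the last tier's value.
    """
    threadpool = AGENT_THREADPOOL_TIERS[-1][1]
    for max_hosts, size in AGENT_THREADPOOL_TIERS:
        if host_count <= max_hosts:
            threadpool = size
            break
    large = host_count > 50  # assumption: exactly 50 hosts counts as "small"
    return {
        "agent.threadpool.size.max": threadpool,
        "alerts.cache.enabled": "true",
        "alerts.cache.size": 100000 if large else 50000,
        "alerts.execution.scheduler.maxThreads": 4 if large else 2,
    }
```

For a 1000-host cluster this returns the same values as the excerpt; treat the output as a starting point to review, not as settings to apply blindly.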

smagyari · ‎01-02-2017

Hi @wbu, right, the problem here is that the 'users.admin' property must be set on the 'activity-zeppelin-shiro' config type, not on 'hst-server-conf'. I've tried it and it seems to work fine. BR, Sandor

ponko73 · ‎05-09-2018

If that doesn't work, try running as the HDFS superuser:

    sudo -u hdfs hdfs dfsadmin -safemode leave
    sudo -u hdfs hdfs dfs -mkdir /user/admin
    sudo -u hdfs hdfs dfs -chown root:hdfs /user/admin

afernandez · ‎05-31-2016

That conf-select call tries to create the following symlink.

    /usr/hdp/current/zookeeper-server/conf -> /etc/zookeeper/2.4.0.0-169/0

1. Make sure ZooKeeper Server is actually installed and Ambari shows it as a component for that host.

    rpm -qa | grep zookeeper_.*server

2. Set its symlink.

    hdp-select set zookeeper-server 2.4.0.0-169
    # this will create symlink /usr/hdp/current/zookeeper-server -> /usr/hdp/2.4.0.0-169/zookeeper

3. Restart ZK Server on that host via the UI.
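After step 2, it can help to confirm where the symlink actually points before restarting the component. This is a minimal sketch using only the Python standard library; the helper name symlink_points_to is hypothetical, and the paths in the usage note are the ones quoted above.

```python
import os


def symlink_points_to(link_path, expected_target):
    """Return True if `link_path` is a symlink that resolves to `expected_target`."""
    if not os.path.islink(link_path):
        return False
    # realpath follows every link in the chain, so nested symlinks also compare equal.
    return os.path.realpath(link_path) == os.path.realpath(expected_target)
```

For example, symlink_points_to("/usr/hdp/current/zookeeper-server", "/usr/hdp/2.4.0.0-169/zookeeper") should return True once hdp-select has set the version.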

afernandez · ‎04-25-2016

When performing a Rolling or Express Upgrade, failures can naturally happen because large clusters are bound to have problematic hosts. Here are 10 easy tips to prevent, diagnose and fix errors.

Before upgrading the stack ...

1. Always upgrade Ambari to the most recent version, even if it's a dot release. Often, there are fixes and optimizations that make the stack upgrade smoother.

2. Ensure all services are up, service checks are passing, there are no critical alerts, etc. This helps ensure that the cluster is fully operational and helps to isolate any failures.

3. Pre-install the bits and make sure all hosts have enough disk space. You can check that the version is found on all hosts. E.g.,

    hdp-select versions | grep 2.5.0.0 | sort | tail -1

4. Do not ignore warnings. Starting in Ambari 2.2.2, there's a flag in the ambari.properties file that allows users to bypass PreCheck errors. Make sure it is either not present or set to false:

    stack.upgrade.bypass.prechecks=false

5. Take a backup of the Ambari database. E.g.,

    pg_dump -U ambari ambari > /tmp/ambari_bk.psql
    mysqldump -u ambari ambari > /tmp/ambari_bk.mysql

In the middle of the Upgrade ...

6. Rolling Upgrade will pause after 30% of the DataNodes have been upgraded. This allows the customer to run additional jobs and ensure that the partial upgrade is still healthy.

7. If a failure occurs, click on "Retry" and make sure that all other dependent services and masters are up. Often, a retry will work if the previous command failed due to a timeout, a network glitch, a host going down and coming back up, etc. Capture any logs from both the component that failed and the ambari-agent at /var/lib/ambari-agent/data/output-*.txt and /var/lib/ambari-agent/data/errors-*.txt

8. If the failure requires changing configs or restarting a component on a host, then click on the "Pause" button. This will temporarily suspend the Upgrade/Downgrade and allow the user to change configs and execute other commands, such as restarting services or running service checks. Once done, click on the "Resume" button.

CAUTION: Never add or move hosts, add or delete services, enable High Availability, or change topology while the upgrade is in progress.

If you cannot Finalize ...

9. Find the problematic hosts and components. In Ambari 2.0 - 2.2, you can run:

    SELECT repo_version_id, version, display_name FROM repo_version;

    -- The state for your version may be in UPGRADING, UPGRADED.
    -- UPGRADING: some component on a host is still not on the newer version
    -- UPGRADED: all components on all hosts are on the newer version
    SELECT version, state
    FROM cluster_version cv
    JOIN repo_version rv ON cv.repo_version_id = rv.repo_version_id
    ORDER BY version DESC;

    -- Find how many hosts are in each state
    SELECT version, state, COUNT(*)
    FROM host_version hv
    JOIN repo_version rv ON hv.repo_version_id = rv.repo_version_id
    GROUP BY version, state
    ORDER BY version DESC, state;

    -- Find components on hosts still not on the newer version
    SELECT service_name, component_name, version, host_name
    FROM hostcomponentstate hcs
    JOIN hosts h ON hcs.host_id = h.host_id
    WHERE service_name NOT IN ('AMBARI_METRICS', 'KERBEROS')
      AND component_name NOT IN ('ZKFC')
    ORDER BY version, service_name, component_name, host_name;

On these hosts, do the following:

    1. hdp-select set all <new_version>
    2. Restart any components still on the older version (you may have to click on the "Pause" button first).

Once all hosts are on the newer version, the Cluster Version status should transition to UPGRADED; this will allow you to Finalize the upgrade.

10. If you still run into problems, gather all of the logs and the results of the SQL queries, and email either Hortonworks Support or the mailing list of the component that failed.

Here's another useful query.

Postgres:

    SELECT u.upgrade_id, u.direction, u.from_version, u.to_version,
           hrc.request_id, hrc.task_id,
           substr(g.group_title, 0, 30), substr(i.item_text, 0, 30), hrc.status
    FROM upgrade_group g
    JOIN upgrade u ON g.upgrade_id = u.upgrade_id
    JOIN upgrade_item i ON i.upgrade_group_id = g.upgrade_group_id
    JOIN host_role_command hrc ON hrc.stage_id = i.stage_id AND hrc.request_id = u.request_id
    ORDER BY hrc.task_id;

MySQL:

    SELECT u.upgrade_id, u.direction, u.from_version, u.to_version,
           hrc.request_id, hrc.task_id,
           left(g.group_title, 30), left(i.item_text, 30), hrc.status
    FROM upgrade_group g
    JOIN upgrade u ON g.upgrade_id = u.upgrade_id
    JOIN upgrade_item i ON i.upgrade_group_id = g.upgrade_group_id
    JOIN host_role_command hrc ON hrc.stage_id = i.stage_id AND hrc.request_id = u.request_id
    ORDER BY hrc.task_id;

Have fun upgrading.
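Tip 4 can be checked mechanically before starting an upgrade. The sketch below is illustrative only: the function name prechecks_bypassed is hypothetical, and it assumes a simplified properties format of key=value lines with # comments, which is narrower than the full Java properties syntax.

```python
def prechecks_bypassed(properties_text):
    """Return True if stack.upgrade.bypass.prechecks is set to true.

    Parses simple key=value lines; blank lines and # comments are ignored.
    A missing key is treated as false, matching tip 4.
    """
    value = "false"
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "stack.upgrade.bypass.prechecks":
            value = val.strip().lower()  # the last occurrence wins
    return value == "true"
```

Feeding it the contents of the ambari.properties file and getting True back means PreCheck errors would be bypassed, which is the situation the tip warns against.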

pacosoplas · ‎04-12-2016

Hi: I need to know whether I can delete the extra row from the database. Also, is the skip_sc_failures column with value = 1 an error, or is it OK?

    ambari=> select * from upgrade;
     upgrade_id | cluster_id | request_id | from_version | to_version   | direction | downgrade_allowed | skip_failures | skip_sc_failures | upgrade_package        | upgrade_type
    ------------+------------+------------+--------------+--------------+-----------+-------------------+---------------+------------------+------------------------+--------------
              1 |          2 |        820 | 2.3.2.0-2950 | 2.4.0.0-169  | UPGRADE   |                 1 |             1 |                0 | nonrolling-upgrade-2.4 | NON_ROLLING
              2 |          2 |        821 | 2.3.2.0-2950 | 2.3.2.0-2950 | DOWNGRADE |                 1 |             0 |                0 | nonrolling-upgrade-2.4 | NON_ROLLING
              3 |          2 |        894 | 2.3.2.0-2950 | 2.4.0.0-169  | UPGRADE   |                 1 |             1 |                1 | nonrolling-upgrade-2.4 | NON_ROLLING

sundararajan_sr · ‎03-24-2016

This is resolved.

Possible Cause

The main problem was that Oozie could not find "/etc/tomcat/conf/ssl/server.xml". The Oozie server has its own app server, so it should not refer to or conflict with the Tomcat app server that we had deployed for our own purposes.

    setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
    setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
    setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat

It did, however, refer to /etc/tomcat. We had configuration settings for Catalina Base and Catalina_Home in .bashrc, /etc/profile and /etc/init.d/tomcat. The oozie-setup.sh script references Catalina_Base in many places, which may be why it was picking up the wrong path.

Solution: We walked through the shell files of Oozie and the other services that did not start, then commented out the references to Catalina_Home and Catalina_Base in /etc/profile and /etc/init.d/tomcat.

Impact: All Hadoop services have started.

Caution: Users who want to run a Tomcat app server on the same server as Hadoop could create a conflict if the Tomcat configuration is set in /etc/profile and /etc/init.d/tomcat. The app server may need to run on a different server from Oozie, or its settings should be limited to a specific user through .bashrc.
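To spot this kind of conflict early, one can check whether the Catalina variables are already set in the environment that starts the Hadoop services. This is a minimal sketch, not something from the Oozie scripts: the helper name catalina_overrides is hypothetical, and the variable names are the ones mentioned in the post.

```python
import os

# Variable names taken from the post above.
CATALINA_VARS = ("CATALINA_BASE", "CATALINA_HOME")


def catalina_overrides(env=None):
    """Return the Catalina variables that are set (non-empty) in `env`.

    `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    return {name: env[name] for name in CATALINA_VARS if env.get(name)}
```

A non-empty result in the shell of the user that launches Oozie suggests a global setting, such as one in /etc/profile, may override the defaults that the Oozie scripts would otherwise choose.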

afernandez · ‎03-31-2016

Ambari doesn't support that yet. We have a Jira for Ambari 3.0.0: https://issues.apache.org/jira/browse/AMBARI-14714. It will allow you to have multiple instances of the same service, potentially at different stack versions, e.g., Spark 1.6.1, 1.7.0, etc.

Matthew_Chang-K · ‎02-15-2017

How do you do this? I have no idea how to delete the entry in the ambari database

Last Visited	‎07-31-2017 09:08 PM

Member Since	‎09-29-2015 01:22 AM
Posts	63
Kudos received	107

Cloudera Community

Re: devops tools used in Ambari

Re: Ambari Blueprint doesn't configure the SmartSe...

Re: How to remove an old HDP version

Re: Is there a collaborative document for service ...

Re: File "/var/lib/ambari-agent/cache/stacks/HDP/2...

Ambari Upgrade to 2.4/2.5 fails if ambari-agent is...

Optimize Ambari Performance for Large Clusters

Re: Ambari Blueprint doesn't configure the SmartSe...

Re: Service 'userhome' check failed: File does no...

Re: resource_management.core.exceptions.Fail: Exec...

Ambari - Troubleshooting a Rolling or Express Upgr...

Re: "Upgrade" table missing from Ambari database

Re: Oozie conflicts with existing Tomcat installat...

Re: Does ambari allow multipule instances of zooke...

Re: ambari user management view missing