06-30-2017 04:38 AM · 5 Kudos
PROBLEM: After changing Ambari from the root user to a non-root user, the Ambari Hive View no longer works and fails with a "Usernames not matched" error.
ERROR:
Service 'userhome' check failed:
java.io.IOException: Usernames not matched: name=ambari-server-<cluster_name> != expected=<custom_user>
at sun.reflect.GeneratedConstructorAccessor243.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
ROOT CAUSE: Configuration issue in the Ambari Hive View definition; the username used by the view does not match the expected proxy user.
SOLUTION: Navigate to the Hive and YARN configs in the Ambari UI, change them as below, and restart the affected services.
A) Custom webhcat-site
webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*
B) Custom yarn-site
yarn.timeline-service.http-authentication.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
yarn.timeline-service.http-authentication.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*
yarn.timeline-service.http-authentication.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.users=*
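The same change can also be scripted instead of using the UI. A hedged example using Ambari's bundled configs.sh helper (path and exact syntax can vary by Ambari release, so verify against your install; admin/admin, ambari.example.com and MyCluster are placeholders):
$ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin set ambari.example.com MyCluster webhcat-site "webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups" "*"
$ /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin set ambari.example.com MyCluster webhcat-site "webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts" "*"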
06-30-2017 04:29 AM · 6 Kudos
PROBLEM: Ambari alerts fail with an "Operation not permitted: '/var/lib/ambari-agent/tmp/curl_krb_cache'" error.
Alert:
ERROR 2017-04-27 20:17:10,287 alert_ha_namenode_health.py:185 - [Alert] NameNode High Availability Health on taco1.example.com fails:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/alerts/alert_ha_namenode_health.py", line 171, in execute
kinit_timer_ms = kinit_timer_ms)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/curl_krb_request.py", line 106, in curl_krb_request
os.chmod(curl_krb_cache_path, 0777)
OSError: [Errno 1] Operation not permitted: '/var/lib/ambari-agent/tmp/curl_krb_cache'
ROOT CAUSE: Ambari is running as a non-root user while /var/lib/ambari-agent is owned by root.
SOLUTION: Either move the tmp directory aside or change the ownership of the directory to the non-root user.
$ mv /var/lib/ambari-agent/tmp /var/lib/ambari-agent/tmp.old
OR
$ chown <non-root-user>:<non-root-user> /var/lib/ambari-agent -R
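Whichever option is used, a quick check afterwards (before restarting the agent) is to confirm the resulting ownership; the exact output will depend on your non-root user:
$ ls -ld /var/lib/ambari-agent /var/lib/ambari-agent/tmp
$ ambari-agent restart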
06-30-2017 04:19 AM · 5 Kudos
PROBLEM: Unable to run an insert query from the Ambari Hive View with Kerberos enabled.
ERROR: In /var/log/ambari-server/hive-next-view/hive-view.log:
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.tez.TezTask
at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:348)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:251)
at org.apache.ambari.view.hive2.HiveJdbcConnectionDelegate.execute(HiveJdbcConnectionDelegate.java:49)
at org.apache.ambari.view.hive2.actor.StatementExecutor.runStatement(StatementExecutor.java:87)
at org.apache.ambari.view.hive2.actor.StatementExecutor.handleMessage(StatementExecutor.java:70)
at org.apache.ambari.view.hive2.actor.HiveActor.onReceive(HiveActor.java:38)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
Corresponding error in the RM UI under the application's Diagnostics:
Application application_1494540056219_0008 failed 2 times due to AM Container for
appattempt_1494540056219_0008_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page:
http://space3.example.com:8088/cluster/app/application_1494540056219_0008 Then click on links to logs of each attempt.
Diagnostics: Application application_1494540056219_0008 initialization failed (exitCode=255) with output: main :
command provided 0
main : run as user is admin
main : requested yarn user is admin
User admin not found
Failing this attempt.
Failing the application.
ROOT CAUSE: With Kerberos enabled and the Hive property "hive.server2.enable.doAs" set to true ("Run as end user instead of Hive user"), the end user must be present locally on every NodeManager host.
SOLUTION: Create the end user running the Hive queries locally on each NodeManager, or make the user available via AD/LDAP.
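For instance, with the "admin" user from the diagnostics above, a hedged check-and-create on each NodeManager host (useradd options such as groups and home directory should follow your own policy):
$ id admin || useradd admin
$ id admin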
06-30-2017 04:10 AM · 5 Kudos
PROBLEM: Installation fails with a different HDP version than desired when registering a node via Ambari. For example, while registering and installing HDP 2.4.3, a host fails during installation because it tries to install HDP 2.5.3 packages.
ERROR:
2017-05-13 20:37:05,278 - Will install packages for repository version 2.4.3.0-227
2017-05-13 20:37:05,278 - Repository['HDP-2.4.3.0-227'] {'append_to_file': False, 'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.3.0/', 'action': ['create'], 'components': ['HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-2.4.3.0-227', 'mirror_list': None}
2017-05-13 20:37:05,317 - File['/etc/yum.repos.d/HDP-2.4.3.0-227.repo'] {'content': '[HDP-2.4.3.0-227]\nname=HDP-2.4.3.0-227\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.3.0/\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-05-13 20:37:05,439 - Writing File['/etc/yum.repos.d/HDP-2.4.3.0-227.repo'] because contents don't match
2017-05-13 20:37:05,461 - Repository['HDP-UTILS-2.4.3.0-227'] {'append_to_file': True, 'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6', 'action': ['create'], 'components': ['HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-2.4.3.0-227', 'mirror_list': None}
2017-05-13 20:37:05,486 - File['/etc/yum.repos.d/HDP-2.4.3.0-227.repo'] {'content': '[HDP-2.4.3.0-227]\nname=HDP-2.4.3.0-227\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.3.0/\n\npath=/\nenabled=1\ngpgcheck=0\n[HDP-UTILS-2.4.3.0-227]\nname=HDP-UTILS-2.4.3.0-227\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-05-13 20:37:05,560 - Writing File['/etc/yum.repos.d/HDP-2.4.3.0-227.repo'] because contents don't match
2017-05-13 20:37:05,584 - call[('ambari-python-wrap', '/usr/bin/hdp-select', 'versions')] {}
2017-05-13 20:37:05,618 - call returned (0, '2.3.6.0-3796\n2.4.0.0-169\n2.5.3.19-2')
2017-05-13 20:37:05,619 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5, 'action': ['upgrade']}
2017-05-13 20:37:05,623 - Installing package hdp-select ('/usr/bin/yum -d 0 -e 0 -y install hdp-select')
2017-05-13 20:37:10,441 - checked_call['rpm -q --queryformat '%{version}-%{release}' hdp-select | sed -e 's/\.el[0-9]//g''] {'stderr': -1}
2017-05-13 20:37:10,485 - checked_call returned (0, '2.5.3.19-2', '')
2017-05-13 20:37:10,486 - Package['hive_2_5_3_19_2'] {'retry_on_repo_unavailability': False, 'retry_count': 5, 'action': ['upgrade']}
ROOT CAUSE: "hdp-select" is of higher version is installed than desired version.This could happen when hosts were used in any other cluster and re-used without proper cleanup.
[root@host3 ~]# rpm -q --queryformat '%{version}-%{release}' hdp-select | sed -e 's/\.el[0-9]//g'2.5.3.19-2
[root@host2 ~]# rpm -q --queryformat '%{version}-%{release}' hdp-select | sed -e 's/\.el[0-9]//g'2.4.3.0-227 SOLUTION: Downgrade "hdp-select" package to desired version according to cluster version.
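A hedged example of one way to do the downgrade on the affected host (host3 above), assuming the newer repo definition is a leftover from the previous cluster and the HDP 2.4.3 repo laid down by Ambari is still in place; adjust repo file names to your environment before running anything:
$ rm -f /etc/yum.repos.d/HDP-2.5*.repo
$ yum clean all
$ yum downgrade -y hdp-select
$ rpm -q --queryformat '%{version}-%{release}' hdp-select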
05-12-2017 01:59 AM · 2 Kudos
PROBLEM: The Ambari Workflow Manager View with Kerberos fails the Oozie service check and errors out with a proxy error in wfmanager-view.log.
ERROR:
11 May 2017 22:34:49,839 INFO [ambari-client-thread-26] [WORKFLOW_MANAGER 1.0.0 MyWFM] OozieDelegate:149 - Proxy request for url: [GET] http://space2.example.com:11000/oozie/v1/admin/configuration
11 May 2017 22:34:49,889 ERROR [ambari-client-thread-26] [WORKFLOW_MANAGER 1.0.0 MyWFM] OozieProxyImpersonator:456 - Error in GET proxy
java.lang.RuntimeException: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1455)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2979)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:489)
ROOT CAUSE: The following properties are missing from the custom oozie-site.xml:
oozie.service.ProxyUserService.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*
oozie.service.ProxyUserService.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
WORKAROUND / RESOLUTION: Add the properties below to Custom oozie-site via the Ambari UI and set their value to *:
oozie.service.ProxyUserService.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*
oozie.service.ProxyUserService.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
Restart the Oozie service and try accessing the Workflow Manager View from the standalone view server again. Also, for Kerberized Ambari and HDP, make sure the Ambari WORKFLOW_MANAGER view definition has WebHDFS Authorization set to "auth=KERBEROS;proxyuser=<AMBARI_SERVER_PRINCIPAL_USER>" instead of "auth=SIMPLE".
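As a quick sanity check after the restart, the same Oozie admin endpoint that the view proxies (taken from the log above) can be queried directly with a Kerberos ticket; the principal here is a placeholder for whichever user the view impersonates:
$ kinit <AMBARI_SERVER_PRINCIPAL_USER>
$ curl --negotiate -u : "http://space2.example.com:11000/oozie/v1/admin/configuration"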
03-31-2017 08:08 PM · 3 Kudos
PROBLEM: All the application_xxx_xx directories under ${yarn.nodemanager.log-dirs} are created with mode 0710. Can this be overridden to make them less restrictive?
For example:
[root@space2 ~]# su -l hdfs -c 'hadoop jar /usr/hdp/2.5.3.0-37/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.3.2.5.3.0-37.jar randomtextwriter -write -nrFiles 10 -filesize 500'
:
:
Running 30 maps.
Job started: Wed Dec 28 00:22:55 UTC 2016
[root@space2 log]# ls -l /hadoop/yarn/log
total 4
drwx--x---. 12 yarn hadoop 4096 Dec 28 00:24 application_1482346173510_0002
ROOT CAUSE: The permission on these per-application log directories is currently hardcoded to 710; see https://issues.apache.org/jira/browse/YARN-4579, which adds the yarn.nodemanager.default-container-executor.log-dirs.permissions property to make it configurable.
RESOLUTION: Fixed in the upcoming Hadoop 2.9, which includes YARN-4579.
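Once on a release that includes YARN-4579 (Hadoop 2.9 or later), the directory mode becomes configurable in yarn-site.xml via the property below; the value 750 here is only an illustration, not a recommendation:
yarn.nodemanager.default-container-executor.log-dirs.permissions=750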
03-31-2017 05:15 PM · 4 Kudos
PROBLEM: Accessing the Capacity Scheduler View gives a 500 Internal Server Error, and the Ambari server log shows "Table './ambari/alert_history' is marked as crashed and should be repaired".
ERROR:
ERROR [ambari-client-thread-35] ContainerResponse:537 - Mapped exception to response: 500 (Internal Server Error)
org.apache.ambari.view.capacityscheduler.utils.ServiceFormattedException
at org.apache.ambari.view.capacityscheduler.ConfigurationService.readClusterInfo(ConfigurationService.java:162)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
ERROR [ambari-client-thread-137] ReadHandler:102 - Caught a runtime exception executing a query
javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: java.sql.SQLException: Table './ambari/alert_history' is marked as crashed and should be repaired
Error Code: 145
ROOT CAUSE: Rows in the alert_history table reference an alert_definition_id that does not exist in the alert_definition table.
WORKAROUND / RESOLUTION:
1. Take a dump of the Ambari database.
2. Stop the Ambari server:
$ ambari-server stop
3. Log in to the Ambari database and run the queries below.
a) Run the following select queries:
> select * from alert_history where alert_definition_id not in (select definition_id from alert_definition);
> select * from alert_current where definition_id not in (select definition_id from alert_definition);
b) If either query in step 3a returns a non-empty result set, run the corresponding delete queries:
> delete from alert_history where alert_definition_id not in (select definition_id from alert_definition);
> delete from alert_current where definition_id not in (select definition_id from alert_definition);
4. Start the Ambari server:
$ ambari-server start
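For step 1, assuming a MySQL-backed Ambari database named "ambari" (as the table path './ambari/alert_history' in the error suggests), the dump can be taken with:
$ mysqldump ambari > /path/to/ambari_dump.sql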
03-28-2017 11:13 PM · 4 Kudos
NOTE: This is for information and testing purposes only. Please exercise caution prior to using it on business critical environments.
STEPS: To completely remove the HDP stack from the operating system, do the following:
1. Stop all the services in the Ambari UI.
2. Shut down ambari-server and ambari-agent on the Ambari server host using the following commands:
$ ambari-server stop
$ ambari-agent stop
3. Shut down ambari-agent on all other nodes using the following command:
$ ambari-agent stop
4. Run the following command to check the installed services so that their related configs and directories can be deleted:
$ hdp-select | grep -v 'None'
5. Uninstall each service found in the above step, following the HDP uninstall documentation or via the Ambari UI.
6. Remove all files and directories related to Hadoop and HDP using the following commands. This step deletes every file and directory named hadoop under the root partition; be extra cautious when running it and make sure nothing important lives in these directories.
find / -name hadoop -type d | xargs rm -rf; rm -rf /usr/hdp
find / -name hadoop-* -type d | xargs rm -rf
7. Based on the services that were installed, delete the old service users such as hdfs, yarn and mapred using the following commands:
$ userdel -r hdfs
$ userdel -r yarn
$ userdel -r mapred
8. Similarly, delete other users such as hive, hbase and oozie if those services were installed on the cluster.
9. If you want to reset the Ambari database to remove the deleted cluster information, take a backup of the Ambari database to your backup location, then reset the Ambari database with the following command on the Ambari server host:
$ ambari-server reset
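As a final check, it can be worth confirming that no HDP packages remain installed before deleting the users in step 7; a hedged example on RHEL/CentOS, where yum is the package manager used elsewhere in these notes:
$ yum list installed | grep -iE 'hdp|hadoop'
$ yum remove -y <any packages reported by the previous command>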
03-28-2017 06:50 PM · 4 Kudos
Note: This is for information and testing purposes only. Please exercise caution prior to using it on business critical environments.
Steps to migrate the ranger and ranger_audit databases to a new MySQL host without re-installing the Ranger service via the Ambari UI:
1. Stop all the services on the cluster via the Ambari UI.
2. Make sure Ranger Admin and Ranger Usersync are down.
3. Take a dump of the ranger and ranger_audit databases:
$ mysqldump ranger > /path/to/ranger_dump.sql
$ mysqldump ranger_audit > /path/to/ranger_audit_dump.sql
4. Log in to the new database host and connect to MySQL:
$ mysql -u root -p
5. Create the rangerdba user and grant it all privileges:
CREATE USER 'rangerdba'@'localhost' IDENTIFIED BY '<rangerdba_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'localhost';
CREATE USER 'rangerdba'@'%' IDENTIFIED BY '<rangerdba_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'<new_database_fqdn>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
6. Create the rangeradmin user and grant it all privileges:
CREATE USER 'rangeradmin'@'localhost' IDENTIFIED BY '<rangeradmin_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost';
CREATE USER 'rangeradmin'@'%' IDENTIFIED BY '<rangeradmin_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'<new_database_fqdn>';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
7. Create the rangerlogger user and grant it all privileges:
CREATE USER 'rangerlogger'@'localhost' IDENTIFIED BY '<rangerlogger_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerlogger'@'localhost';
CREATE USER 'rangerlogger'@'%' IDENTIFIED BY '<rangerlogger_password>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerlogger'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangerlogger'@'<new_database_fqdn>';
GRANT ALL PRIVILEGES ON *.* TO 'rangerlogger'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangerlogger'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
8. Create the ranger and ranger_audit databases on the new database host:
CREATE DATABASE ranger;
CREATE DATABASE ranger_audit;
9. Restore the backups from the old database host onto the new one:
$ mysql ranger < /path/to/ranger_dump.sql
$ mysql ranger_audit < /path/to/ranger_audit_dump.sql
10. Navigate to the Ranger configs in the Ambari UI.
11. Change the Ranger DB host to the new database hostname. This should update the "JDBC connect string for a Ranger database" and the "JDBC connect string for root user" with the new database hostname.
12. Click Test Connection. It should succeed.
13. Restart the Ranger service via Ambari, then all other services, and validate the Ranger Admin UI.
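As an additional check before the Test Connection in step 12, connectivity from the Ranger Admin host to the new database can be verified directly; a hedged example using the users created above (substitute your new database FQDN):
$ mysql -u rangeradmin -p -h <new_database_fqdn> ranger -e 'show tables;'
$ mysql -u rangerlogger -p -h <new_database_fqdn> ranger_audit -e 'show tables;'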
03-28-2017 04:49 PM · 6 Kudos
PROBLEM: The Hive View errors out with a NullPointerException from a standalone Ambari view server.
ERROR:
java.lang.NullPointerException
at org.apache.ambari.view.hive2.resources.jobs.atsJobs.ATSParser.getHiveQueryIdFromJson(ATSParser.java:113)
at org.apache.ambari.view.hive2.resources.jobs.atsJobs.ATSParser.getHiveQueryIdByOperationId(ATSParser.java:107)
at org.apache.ambari.view.hive2.resources.jobs.Aggregator.readATSJob(Aggregator.java:260)
at org.apache.ambari.view.hive2.resources.jobs.JobService.jsonObjectFromJob(JobService.java:161)
at org.apache.ambari.view.hive2.resources.jobs.JobService.getOne(JobService.java:145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
ROOT CAUSE: The standalone Ambari view server's host name is missing from the yarn.timeline-service.http-authentication.proxyuser.ambari-server-<CLUSTER_NAME>.* properties.
WORKAROUND / RESOLUTION: Update the properties below in the YARN configs and set their value to *:
yarn.timeline-service.http-authentication.proxyuser.ambari-server-<CLUSTER_NAME>.hosts=*
yarn.timeline-service.http-authentication.proxyuser.ambari-server-<CLUSTER_NAME>.users=*
yarn.timeline-service.http-authentication.proxyuser.ambari-server-<CLUSTER_NAME>.groups=*
Restart the YARN service and try accessing the Hive View from the standalone view server again.
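After the restart, a quick hedged check that the YARN Application Timeline Server answers Kerberos-authenticated requests from the standalone view host; the host, port and the HIVE_QUERY_ID entity type here are assumptions to adapt to your cluster:
$ kinit <user>
$ curl --negotiate -u : "http://<timeline-server-host>:8188/ws/v1/timeline/HIVE_QUERY_ID?limit=1"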