Member since
02-08-2016
793
Posts
669
Kudos Received
85
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3064 | 06-30-2017 05:30 PM |
|  | 3980 | 06-30-2017 02:57 PM |
|  | 3302 | 05-30-2017 07:00 AM |
|  | 3878 | 01-20-2017 10:18 AM |
|  | 8396 | 01-11-2017 02:11 PM |
01-03-2017
10:04 AM
1 Kudo
@Indrek Mäestu Can you try enabling developer tools in your browser and check which URL the request hangs on? Also, I do not see the attachment. Can you please verify once?
12-28-2016
07:28 PM
3 Kudos
SYMPTOM:
The Ambari upgrade command fails with the error: Error Code: 1005. Can't create table 'ambaridb.#sql-4168_34d2' (errno: 150)
ERROR: CREATE TABLE blueprint_setting (id BIGINT NOT NULL, blueprint_name VARCHAR(255) NOT NULL, setting_name VARCHAR(255) NOT NULL, setting_data LONGTEXT NOT NULL)
03 Oct 2016 09:30:06,326 INFO [main] DBAccessorImpl:824 - Executing query: ALTER TABLE blueprint_setting ADD CONSTRAINT PK_blueprint_setting PRIMARY KEY (id)
03 Oct 2016 09:30:06,388 INFO [main] DBAccessorImpl:824 - Executing query: ALTER TABLE blueprint_setting ADD CONSTRAINT UQ_blueprint_setting_name UNIQUE (blueprint_name, setting_name)
03 Oct 2016 09:30:06,489 INFO [main] DBAccessorImpl:824 - Executing query: ALTER TABLE blueprint_setting ADD CONSTRAINT FK_blueprint_setting_name FOREIGN KEY (blueprint_name) REFERENCES blueprint (blueprint_name)
03 Oct 2016 09:30:06,545 ERROR [main] DBAccessorImpl:830 - Error executing query: ALTER TABLE blueprint_setting ADD CONSTRAINT FK_blueprint_setting_name FOREIGN KEY (blueprint_name) REFERENCES blueprint (blueprint_name)
java.sql.SQLException: Can't create table 'ambaridb.#sql-4168_34e3' (errno: 150)
Running SHOW ENGINE INNODB STATUS; shows the details below:
------------------------
LATEST FOREIGN KEY ERROR
------------------------
161003 9:30:06 Error in foreign key constraint of table ambaridb/#sql-4168_34e3:
FOREIGN KEY (blueprint_name) REFERENCES blueprint (blueprint_name):
Cannot find an index in the referenced table where the
referenced columns appear as the first columns, or column types
in the table and the referenced table do not match for constraint.
Note that the internal storage type of ENUM and SET changed in
tables created with >= InnoDB-4.1.12, and such columns in old tables
cannot be referenced by such columns in new tables.
See http://dev.mysql.com/doc/refman/5.1/en/innodb-foreign-key-constraints.html
for correct foreign key definition.
ROOT CAUSE: The issue was related to a mismatch in the character sets of the tables in MySQL. When Ambari initially created the database, it set the tables to use UTF8. During the upgrade, new tables created via the UpgradeCatalog classes were set to use the LATIN1 character set. More specifically, the blueprint table was created with the UTF8 character set, while the blueprint_setting table was created with LATIN1. The mismatch could be seen using queries like:
SELECT character_set_name FROM information_schema.`COLUMNS`
WHERE table_schema = "ambari"
AND table_name = "blueprint";
SELECT character_set_name FROM information_schema.`COLUMNS`
WHERE table_schema = "ambari"
AND table_name = "blueprint_setting";
RESOLUTION: This was fixed by dropping the blueprint_setting table and manually re-creating it with CHARACTER SET utf8, using the syntax below:
CREATE TABLE blueprint_setting (
id BIGINT NOT NULL,
blueprint_name VARCHAR(100) NOT NULL,
setting_name VARCHAR(100) NOT NULL,
setting_data MEDIUMTEXT NOT NULL,
CONSTRAINT PK_blueprint_setting PRIMARY KEY (id),
CONSTRAINT UQ_blueprint_setting_name UNIQUE(blueprint_name,setting_name),
CONSTRAINT FK_blueprint_setting_name FOREIGN KEY (blueprint_name) REFERENCES blueprint(blueprint_name))
CHARACTER SET utf8;
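Alternatively, if the table already exists with the wrong character set, it may be possible to convert it in place instead of dropping and re-creating it. This is a sketch, not part of the original resolution; back up the database before making manual changes:

```sql
-- Confirm the character sets of the two tables involved in the constraint
SELECT table_name, character_set_name
FROM information_schema.`COLUMNS`
WHERE table_schema = 'ambari'
  AND table_name IN ('blueprint', 'blueprint_setting')
  AND column_name = 'blueprint_name';

-- Convert the mismatched table (and its string columns) to utf8
ALTER TABLE blueprint_setting CONVERT TO CHARACTER SET utf8;
```

Once both tables report the same character set, the foreign key on blueprint_name can be created without errno 150.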
12-28-2016
07:23 PM
3 Kudos
ISSUE: While upgrading HDP, the Ranger service failed. Below was the error -
Failed to apply patch 020-datamask-policy.sql with error "Not able to drop table 'x_datamast_type_def'" Foreign constraints fail with Error Code 1217
ROOT CAUSE: The issue was with foreign_key_checks.
RESOLUTION: Ran "SET foreign_key_checks = 0;" in the MySQL Ranger DB, dropped the table manually and re-created it, then ran "SET foreign_key_checks = 1;" again.
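A sketch of the sequence described above. The database name and the table DDL are assumptions (restore the actual definition from the 020-datamask-policy.sql patch script), and the database should be backed up first:

```sql
SET foreign_key_checks = 0;        -- temporarily disable FK enforcement
DROP TABLE x_datamast_type_def;    -- table name taken from the error message
-- Re-create the table here using the DDL from 020-datamask-policy.sql
SET foreign_key_checks = 1;        -- re-enable FK enforcement
```

Disabling foreign_key_checks only affects the current session, so enforcement for other connections is unaffected while the table is rebuilt.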
12-28-2016
07:19 PM
3 Kudos
Issue: While performing an HDP downgrade, the last "Finalize Downgrade" step completed successfully, but 'Downgrade in Progress' was still stuck at 99%. (Screenshot omitted.)
ROOT CAUSE: A few tasks were still in the PENDING state in the host_role_command table. Below is sample output -
SELECT task_id, status, event, host_id, role,
role_command, command_detail, custom_command_name FROM host_role_command
WHERE request_id = 858 AND status != 'COMPLETED' ORDER BY task_id DESC
8964, PENDING, 4, KAFKA_BROKER, CUSTOM_COMMAND, RESTART KAFKA/KAFKA_BROKER, RESTART
8897, PENDING, 4, KAFKA_BROKER, CUSTOM_COMMAND, STOP KAFKA/KAFKA_BROKER, STOP
RESOLUTION: We need to manually move the pending tasks to the COMPLETED state:
UPDATE host_role_command SET status = 'COMPLETED' WHERE request_id = 858 AND status = 'PENDING';
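After the update, a quick re-check should return no rows (a sketch; request_id 858 is specific to this example and will differ on other clusters):

```sql
-- Verify that no non-completed tasks remain for this request
SELECT task_id, status
FROM host_role_command
WHERE request_id = 858
  AND status != 'COMPLETED';
```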
12-28-2016
07:16 PM
3 Kudos
Issue:
When a user logged in, no service tabs/action buttons were displayed in the Ambari UI.
The Ambari UI displayed the notification below on the dashboard - "Move Master Wizard In Progress"
ROOT CAUSE: It seems a user with admin access to the Ambari UI had started an operation and left it open; that user was no longer online. This is related to the Hortonworks internal Jira - https://hortonworks.jira.com/browse/EAR-4843
RESOLUTION: To get past this problem, we ran the command below with the userName in the JSON set to admin; then, after logging in as admin, we were able to close the wizard, which solved the problem.
curl -u admin -i -H
'X-Requested-By: ambari' -X POST -d '{"wizard-data":"
{\"userName\":\"admin\",\"controllerName\":\"<Controller_Name>\"}
"}' http://<ambari-host>:8080/api/v1/persist
Below is the exact command I used -
curl -u admin -i -H
'X-Requested-By: ambari' -X POST -d '{"wizard-data":"
{\"userName\":\"admin\",\"controllerName\":\"moveMasterWizard\"}
"}' http://localhost:8080/api/v1/persist
12-28-2016
07:13 PM
2 Kudos
SYMPTOM:
Not able to log in to the Ambari UI; the UI was hung.
Checking the logs, we found the error below -
ERROR: +++++++
16 Sep 2016
13:41:21,377 WARN [qtp-client-10408] ObjectGraphWalker:209 - The configured
limit of 1,000 object references was reached while attempting to calculate the
size of the object graph. Severe performance degradation could occur if the
sizing operation continues. This can be avoided by setting the CacheManger or
Cache <sizeOfPolicy> elements maxDepthExceededBehavior to
"abort" or adding stop points with @IgnoreSizeOf annotations. If
performance degradation is NOT an issue at the configured limit, raise the
limit value using the CacheManager or Cache <sizeOfPolicy> elements
maxDepth attribute. For more information, see the Ehcache configuration
documentation.
+++++++
ROOT CAUSE: As per the above logs, we found that the issue is with the Ambari cache. This is related to JIRA - https://issues.apache.org/jira/browse/AMBARI-13517
RESOLUTION: There are two ways to resolve the issue -
==> Disable the cache by adding the property below to "/etc/ambari-server/conf/ambari.properties" and restart the Ambari server:
server.timeline.metrics.cache.disabled=true
$ ambari-server restart
==> Increase the Ambari server heap size.
12-28-2016
09:06 AM
@Ashnee Sharma There are no drawbacks apart from exposing the script to the public. Just make sure you do not specify the password as plain text in the script.
12-27-2016
08:06 PM
4 Kudos
SYMPTOM: Knox can get the LDAP user but can't find the related groups. Our LDAP is OpenLDAP (Red Hat). The membership attribute is defined in groups with "uniquemember".
ERROR:
2016-05-09 14:42:01,229 INFO hadoop.gateway (KnoxLdapRealm.java:getUserDn(556)) - Computed userDn: uid=a196011,ou=people,dc=hadoop,dc=apache,dc=org using dnTemplate for principal: a196011
2016-05-09 14:42:01,230 INFO hadoop.gateway (KnoxLdapRealm.java:doGetAuthenticationInfo(180)) - Could not login: org.apache.shiro.authc.UsernamePasswordToken - a196xxx, rememberMe=false (10.xxx.xx.64)
2016-05-09 14:42:01,230 DEBUG hadoop.gateway (KnoxLdapRealm.java:doGetAuthenticationInfo(181)) - Failed to Authenticate with LDAP server: {1}
org.apache.shiro.authc.AuthenticationException: LDAP naming error while attempting to authenticate user.
at org.apache.shiro.realm.ldap.JndiLdapRealm.doGetAuthenticationInfo(JndiLdapRealm.java:303)
The initial error above was due to LDAP misconfiguration. After correcting the LDAP configuration, the error below appeared -
"operation not supported in Standby mode"
2016-04-29 23:59:08,389 ERROR provider.BaseAuditHandler (BaseAuditHandler.java:logError(329)) - Error writing to log file.
java.lang.IllegalArgumentException: java.net.UnknownHostException: bigre7clu
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:221)
at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: bigre7clu
ROOT CAUSE: Found that the customer had NameNode HA, but Knox was not configured for NameNode HA.
RESOLUTION: Configured Knox with HA for WebHDFS, which resolved the issue:
<provider>
<role>ha</role>
<name>HaProvider</name>
<enabled>true</enabled>
<param>
<name>WEBHDFS</name>
<value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
</param>
</provider>
<service>
<role>WEBHDFS</role>
<url>http://{host1}:50070/webhdfs</url>
<url>http://{host2}:50070/webhdfs</url>
</service>
12-26-2016
06:20 PM
1 Kudo
@ALFRED CHAN The exams are still there. It seems you need to select the correct region in AWS to see the AMI. Can you let me know which region you see in the AWS UI? Please see the screenshot below - I am in the "N. Virginia" region and able to see the AMI.
12-26-2016
06:15 PM
2 Kudos
PROBLEM STATEMENT: After finishing the upgrade of all Hadoop components, I issued su - hdfs -c "hdfs dfsadmin -finalizeUpgrade" on the active NameNode. The NN UI showed all was good until I failed over and the standby became active; then I again got the message that I should finalize my upgrade.
ERROR: 2015-10-27 18:16:04,954 - Task. Type: EXECUTE, Script: scripts/namenode.py - Function: prepare_non_rolling_upgrade
2015-10-27 18:16:05,044 - Preparing the NameNodes for a NonRolling (aka Express) Upgrade.
2015-10-27 18:16:05,045 - checked_call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpKcaPrw 2>/tmp/tmpPjCQIr''] {'quiet': False}
2015-10-27 18:16:05,068 - checked_call returned (0, '')
2015-10-27 18:16:05,069 - checked_call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpPIEyeS 2>/tmp/tmpOOXCrc''] {'quiet': False}
2015-10-27 18:16:05,092 - checked_call returned (0, '')
2015-10-27 18:16:05,093 - NameNode High Availability is enabled and this is the Active NameNode.
2015-10-27 18:16:05,093 - Enter SafeMode if not already in it.
2015-10-27 18:16:05,094 - Checkpoint the current namespace.
2015-10-27 18:16:05,094 - Backup the NameNode name directory's CURRENT folder.
2015-10-27 18:16:05,100 - Execute[('cp', '-ar', '/hadoop/hdfs/namenode/current', '/tmp/upgrades/2.2/namenode_ida8c06540_date162715_1/')] {'sudo': True}
2015-10-27 18:16:05,113 - Attempt to Finalize if there are any in-progress upgrades. This will return 255 if no upgrades are in progress.
2015-10-27 18:16:05,113 - checked_call['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -rollingUpgrade finalize'] {'logoutput': True, 'user': 'hdfs'}
FINALIZE rolling upgrade ...
There is no rolling upgrade in progress or rolling upgrade has already been finalized.
2015-10-27 18:16:08,213 - checked_call returned (0, 'FINALIZE rolling upgrade ...\nThere is no rolling upgrade in progress or rolling upgrade has already been finalized.')
ROOT CAUSE: This is a bug in Ambari - https://hortonworks.jira.com/browse/BUG-46853 [Hortonworks internal bug URL, for reference]
RESOLUTION: The bug above is fixed in Ambari 2.2.0. Upgrading to Ambari 2.2.0 will resolve the issue.