Member since: 09-10-2016
Posts: 82
Kudos Received: 6
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 6446 | 08-28-2019 11:07 AM
 | 5953 | 12-21-2018 05:59 PM
 | 3089 | 12-10-2018 05:16 PM
 | 2580 | 12-10-2018 01:03 PM
 | 1690 | 12-07-2018 08:04 AM
11-21-2018
09:59 AM
Hello, we can see the alert below in the Ambari UI. Please let me know how to change/reset the passwords for the HDP service users.

Connection failed to http://xxxxxx.xxxxx.com:50070 (Execution of '/usr/bin/kinit -c /var/lib/ambari-agent/tmp/curl_krb_cache/web_alert_ambari-qa_cc_0968abaa5827638ae5a9a92642ebb363a83845d449ecfb223d4f91e3 -kt /etc/security/keytabs/spnego.service.keytab HTTP/xxxx> /dev/null' returned 1.
You are required to change your password immediately (password aged)
(current) UNIX password: su: Authentication token manipulation error
Changing password for ambari-qa.)

When I run su hdfs from root, it asks me to change the password:

$ su hdfs
You are required to change your password immediately (password aged)
Changing password for hdfs.
(current) UNIX password:

$ chage -l hdfs
Last password change : Oct 16, 2018
Password expires : Nov 20, 2018
Password inactive : Feb 18, 2019
Account expires : never
Minimum number of days between password change : 1
Maximum number of days between password change : 35
Number of days of warning before password expires : 6

Please help ASAP.
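For reference, since these are local Linux service accounts hitting an aging policy, one common way to clear this alert is to disable password aging for them as root. A minimal sketch, assuming local accounts; the user list is an example, adjust it to your cluster:

# Run as root: disable password aging for each HDP service account.
for u in ambari-qa hdfs yarn mapred; do
  chage -M -1 -m 0 "$u"   # -M -1: password never expires; -m 0: no minimum age
done
chage -l hdfs             # verify the new aging policy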
10-26-2018
04:34 PM
@ANSARI FAHEEM AHMED Thanks for your input. Yes, I enabled NameNode HA in our cluster; it completed successfully without any issues. 🙂
10-26-2018
08:14 AM
Hi, is there a document describing how to configure NameNode HA when Kerberos is enabled? Since Kerberos is already enabled in our cluster, are there any specific parameters or configuration changes required before enabling NameNode HA from Ambari? Please suggest. Thank you.
Labels:
- Apache Ambari
- Apache Hadoop
10-16-2018
11:31 AM
I'm trying to enable Kerberos against an existing AD and am getting the error message below:

WARN [ambari-client-thread-31] ADKerberosOperationHandler:470 - Failed to communicate with the Active Directory at ldaps://xldapxxx.xxx.com: simple bind failed: xldapxxx.xxx.com:636
javax.naming.CommunicationException: simple bind failed: vxldapxxx.xxx.com:636 [Root exception is javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty]
ERROR [ambari-client-thread-31] KerberosHelperImpl:2232 - Cannot validate credentials: org.apache.ambari.server.serveraction.kerberos.KerberosInvalidConfigurationException: Failed to connect to KDC - Failed to communicate with the Active Directory at ldaps://xldapxxx.xxx.com: simple bind failed: xldapxxx.xxx.com:636
Update the KDC settings in krb5-conf and kerberos-env configurations to correct this issue.
16 Oct 2018 10:53:52,542 ERROR [ambari-client-thread-31] BaseManagementHandler:67 - Bad request received: Failed to connect to KDC - Failed to communicate with the Active Directory at ldaps://xldapxxx.xxx.com: simple bind failed: xldapxxx.xxx.com:636
Update the KDC settings in krb5-conf and kerberos-env configurations to correct this issue
The LDAP server is reachable from the Hadoop KDC server:

$ telnet xldapxxx.xxx.com 636
Trying 1x.1xx.1xx.xx1... Connected to xldapxxx.xxx.com.
Escape character is '^]'.
In Ambari, I'm trying to connect to the existing AD using the parameters below:

KDC host: seswcxxxd011.xxx.com --> host where krb5-server is installed (KDC host)
Realm name: HADOOP.xxxx.xxx.COM
LDAP url: ldaps://xldapxxx.xxx.com
Container DN: OU=Users,OU=xxx,DC=xx,DC=com
Test connection: successful
Kadmin host: seswcxxxd011.xxx.com ---> host where krb5-server is installed (KDC host)
Admin principal: admin/admin@HADOOP.xxxx.xxx.COM
Admin password: ***********
I have created the Kerberos database (kdb5_util -r ) and the krbtgt principal with the master password on the krb5-server host:

$ kadmin.local
Authenticating as principal root/admin@HADOOP.xxxx.xxxx.COM with password.
kadmin.local: listprincs
K/M@HADOOP.xxxx.xxxx.COM --->
admin/admin@HADOOP.xxxx.xxxx.COM
ambari/admin@HADOOP.xxxx.xxxx.COM
kadmin/admin@HADOOP.xxxx.xxxx.COM
kadmin/changepw@HADOOP.xxxx.xxxx.COM
kadmin/seswcxxxd011.xxx.com@HADOOP.xxxx.xxxx.COM
kiprop/seswcxxxd011.xxx.com@HADOOP.xxxx.xxxx.COM
krbtgt/HADOOP.xxxx.xxxx.COM@xxx.COM ---> AD master password
root/admin@HADOOP.xxxx.xxxx.COM
Can anyone help me resolve the above issue? Thanks.
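For reference, the root exception ("the trustAnchors parameter must be non-empty") typically means the JVM that Ambari runs on cannot load a usable truststore, so the LDAPS bind cannot validate the AD certificate. A minimal sketch of one common fix, importing the AD certificate into the JDK truststore; the truststore path, alias, and password below are assumed defaults, not taken from the original post:

# Fetch the AD LDAPS certificate and import it into the JDK truststore used by Ambari.
openssl s_client -connect xldapxxx.xxx.com:636 </dev/null 2>/dev/null | openssl x509 -outform PEM > ad-ldaps.pem
keytool -importcert -alias ad-ldaps -file ad-ldaps.pem \
  -keystore $JAVA_HOME/jre/lib/security/cacerts -storepass changeit -noprompt
ambari-server restart   # restart so the Kerberos wizard picks up the trust change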
Labels:
- Apache Ambari
05-24-2018
12:54 PM
Hi, we have enabled NameNode HA by appending the parameters below to hdfs-site.xml and core-site.xml using config.py, but the active/standby NameNode state is not showing up in the Ambari UI. Please find the details below.

Parameters (hdfs-site.xml):

"ha.zookeeper.quorum":"master.wpm.com:2181,wpm111.wpm.com:2181,wpm333.wpm.com:2181"
"dfs.ha.automatic-failover.enabled":"true"
"dfs.ha.fencing.methods":"shell(/bin/true)"
"dfs.client.failover.proxy.provider.mycluster":"org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
"dfs.namenode.shared.edits.dir":"qjournal://master.wpm.com:8485;wpm111.wpm.com:8485;wpm333.wpm.com:8485/mycluster"
"dfs.namenode.http-address.mycluster.nn2":"wpm111.wpm.com:50070"
"dfs.namenode.http-address.mycluster.nn1":"master.wpm.com:50070"
"dfs.namenode.rpc-address.mycluster.nn2":"wpm111.wpm.com:8020"
"dfs.namenode.rpc-address.mycluster.nn1":"master.wpm.com:8020"
"dfs.namenode.https-address.mycluster.nn2": "wpm111.wpm.com:50470",
"dfs.namenode.https-address.mycluster.nn1": "master.wpm.com:50470"
"dfs.ha.namenodes.mycluster":"nn1,nn2"
"dfs.nameservices":"mycluster"
core-site.xml:

"fs.defaultFS" : "hdfs://mycluster/"

Service state:

[hdfs@WPM0 0]$ hdfs haadmin -getServiceState nn1
standby
[hdfs@WPM0 0]$ hdfs haadmin -getServiceState nn2
active
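For reference, a minimal sketch of pushing one of these properties with the configs.py script that ships with Ambari (what the post calls config.py); the flag set varies by Ambari version, and the hostname, credentials, and cluster name here are placeholders:

# Set a single hdfs-site property through the Ambari API (Ambari 2.x flag style).
/var/lib/ambari-server/resources/scripts/configs.py \
  -u admin -p admin -l ambari-server-hostname -t 8080 \
  -n MyCluster -a set -c hdfs-site \
  -k dfs.nameservices -v mycluster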
Please find the Ambari UI, NameNode UI, and custom hdfs-site.xml screenshots here. I can see a few warnings and errors in the ambari-server log:

24 May 2018 21:08:08,401 WARN [ambari-client-thread-71] Errors:173 - The following warnings have been detected with resource and/or provider classes:
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.TaskService.getComponents(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo), should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.TaskService.getTask(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String), should not consume any entity.
24 May 2018 21:08:27,654 WARN [ambari-client-thread-71] Errors:173 - The following warnings have been detected with resource and/or provider classes:
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.WidgetService.getServices(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo), should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.WidgetService.getService(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String), should not consume any entity.
24 May 2018 21:08:27,785 WARN [ambari-client-thread-71] Errors:173 - The following warnings have been detected with resource and/or provider classes:
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.WidgetLayoutService.getServices(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo), should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.WidgetLayoutService.getService(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String), should not consume any entity.
24 May 2018 21:08:29,972 INFO [ambari-client-thread-71] StackAdvisorRunner:47 - Script=/var/lib/ambari-server/resources/scripts/stack_advisor.py, actionDirectory=/var/run/ambari-server/stack-recommendations/3, command=recommend-configurations
24 May 2018 21:08:29,976 INFO [ambari-client-thread-71] StackAdvisorRunner:61 - Stack-advisor output=/var/run/ambari-server/stack-recommendations/3/stackadvisor.out, error=/var/run/ambari-server/stack-recommendations/3/stackadvisor.err

Could you please help with this? Thanks,
04-19-2018
09:57 AM
1 Kudo
I am running a few Spark jobs scheduled in an Oozie workflow; one of the jobs is failing with the error below:
[main] INFO org.apache.spark.deploy.yarn.Client -
client token: N/A
diagnostics: Application application_1523897345683_2170 failed 2 times due to AM Container for appattempt_1523897345683_2170_000004 exited with exitCode: 1
For more detailed output, check the application tracking page: http://<master_ip>:8088/cluster/app/application_1523897345683_2170 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e13_1523897345683_2170_04_000001
Exit code: 1
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://<master_ip>:8020/user/hdfs/.sparkStaging/application_1523897345683_2170/__spark_conf__.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1446)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:177)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:174)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.deploy.yarn.ApplicationMaster.<init>(ApplicationMaster.scala:174)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:767)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: queue_one
start time: 1524129738475
final status: FAILED
tracking URL: http://<master_ip>:8088/cluster/app/application_1523897345683_2170
user: hdfs
<<< Invocation of Spark command completed <<<
Hadoop Job IDs executed by Spark: job_1523897345683_2170
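For reference, one commonly reported cause of a missing __spark_conf__.zip under .sparkStaging is two concurrently launched jobs sharing the same staging directory, where one job's cleanup removes the other's files. A hedged sketch of isolating the staging directory per run, assuming Spark 2.x (where spark.yarn.stagingDir is available); the paths and script name are examples, not from the original post:

# Give each run its own YARN staging directory so another job's cleanup
# cannot delete this application's __spark_conf__.zip. Paths are examples.
spark-submit \
  --master yarn --deploy-mode cluster \
  --conf spark.yarn.stagingDir=hdfs:///user/hdfs/staging/run-$$ \
  my_job.py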
Could you please help with this?
Thank you.
Regards
Sampath
03-15-2018
09:10 AM
Thanks for your input.
02-10-2018
10:56 AM
I am running a two-node cluster where the master and worker have the configuration below.

Master: 8 cores, 16 GB RAM
Worker: 16 cores, 64 GB RAM

YARN configuration:

yarn.scheduler.minimum-allocation-mb: 1024
yarn.scheduler.maximum-allocation-mb: 22145
yarn.nodemanager.resource.cpu-vcores : 6
yarn.nodemanager.resource.memory-mb: 25145
Capacity Scheduler: yarn.scheduler.capacity.default.minimum-user-limit-percent=100
yarn.scheduler.capacity.maximum-am-resource-percent=0.5
yarn.scheduler.capacity.maximum-applications=100
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.acl_administer_jobs=*
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=100
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=1
yarn.scheduler.capacity.root.queues=default

We have 23 Spark jobs (scheduled in Oozie) running on YARN every hour, and some jobs are taking a long time to complete. I am not sure whether the YARN memory and vcore allocation is configured properly. Please suggest the recommended YARN memory, vcores, and scheduler configuration based on the available cores and RAM (see the sizing sketch below).
Thanks, Sampath
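For reference, a common rule-of-thumb sizing for the 16-core / 64 GB worker reserves roughly 20% of memory and a few cores for the OS and Hadoop daemons. The numbers below, in the same property format as the post, are illustrative assumptions rather than a definitive recommendation:

# NodeManager capacity on the 64 GB / 16-core worker, leaving headroom for OS and daemons
yarn.nodemanager.resource.memory-mb=51200
yarn.nodemanager.resource.cpu-vcores=12
# Container granularity: 2 GB minimum, up to a full node per container
yarn.scheduler.minimum-allocation-mb=2048
yarn.scheduler.maximum-allocation-mb=51200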
Labels:
- Apache Oozie
- Apache Spark
- Apache YARN
11-26-2017
11:48 AM
I have created the HDP & HDP-UTILS-1.1.0.21 internal repository mapping as below:

curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://ambari-server-hostname:8080/api/v1/stacks/HDP/versions/2.6/operating_systems/redhat7/repositories/HDP-2.6 -d @repo.json
payload:
{
  "Repositories" : {
    "base_url" : "http://ip-address/repo/HDP/centos7/2.6.3.0-235",
    "verify_base_url" : true
  }
}
curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.6/operating_systems/redhat7/repositories/HDP-UTILS-1.1.0.21 -d @hdputils-repo.json
payload:
{
  "Repositories" : {
    "base_url" : "http://ip-address/repo/HDP_UTILS",
    "verify_base_url" : true
  }
}
But during installation, the public repository is invoked by the components instead of the local repository I registered via the REST API:

2017-11-26 17:04:05,975 - File['/etc/yum.repos.d/ambari-hdp-1.repo'] {'content': '[HDP-2.6-repo-1]\nname=HDP-2.6-repo-1\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3\n\npath=/\nenabled=1\ngpgcheck=0\n[HDP-UTILS-1.1.0.21-repo-1]\nname=HDP-UTILS-1.1.0.21-repo-1\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7\n\npath=/\nenabled=1\ngpgcheck=0'}
Please find the attached screenshot: component-installation-fail.png. Stack version: HDP 2.6 & Ambari 2.6.0. Could you please help with this? Thank you.
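For reference, a quick way to confirm what Ambari has actually registered is a GET against the same endpoint used for the PUT above, with the same placeholder hostname and credentials:

# Read back the registered base_url for the HDP-2.6 repo (same endpoint as the PUT above).
curl -H "X-Requested-By: ambari" -u admin:admin \
  "http://ambari-server-hostname:8080/api/v1/stacks/HDP/versions/2.6/operating_systems/redhat7/repositories/HDP-2.6"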
11-10-2017
03:17 PM
@kgautam Thanks for your input. fuser /hadoop/yarn/local/registeredExecutors.ldb/LOCK did not help; it does not show any PID associated with the LOCK file.