About jagadeesan

jagadeesan · ‎01-01-2019

@Suraj Singh Actually, this particular fix was released in versions 3.1.0, 2.10.0, 2.9.1, and 3.0.1. The related JIRA is https://issues.apache.org/jira/browse/YARN-7873. If you read the complete comments there, you will get an idea of why YARN-6078 was reverted and not released in 3.0.0.

jagadeesan · ‎12-28-2018

@Slimani Ibrahim It's not recommended to format the NameNode more than once, except when the NameNode has lost its metadata. The likely cause here is dfs.namenode.name.dir, the property that tells the NameNode where to store its metadata on disk. In your case it points to /tmp, so every time you restart your system the /tmp directory is cleared and you have to format the NameNode again. Point dfs.namenode.name.dir to a persistent location (something like /hadoop/hdfs/namenode, and similarly /hadoop/hdfs/datanode for the DataNode property dfs.datanode.data.dir) that is not cleared when you restart your system; that will resolve the problem. I hope that the above answers your questions. Please accept the answer you found most useful.
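As a rough illustration of the check described above, the following Python sketch scans an hdfs-site.xml fragment and flags storage directories that live under /tmp. The XML fragment, the helper name volatile_dirs, and the choice to treat only /tmp as volatile are assumptions made for this example; adapt them to your own configuration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical hdfs-site.xml fragment; the /tmp value mirrors the problem case.
HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///tmp/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop/hdfs/datanode</value>
  </property>
</configuration>
"""

# Assumption: only /tmp is treated as a location that is cleared on restart.
VOLATILE_PREFIXES = ("/tmp",)


def volatile_dirs(xml_text):
    """Return {property name: [paths]} for *.dir properties under a volatile prefix."""
    flagged = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name", "").strip()
        if not name.endswith(".dir"):
            continue
        # A value may be a comma-separated list, optionally with a file:// scheme.
        for raw in prop.findtext("value", "").split(","):
            path = urlparse(raw.strip()).path
            if any(path == p or path.startswith(p + "/") for p in VOLATILE_PREFIXES):
                flagged.setdefault(name, []).append(path)
    return flagged


print(volatile_dirs(HDFS_SITE))
```

Any property reported by this check is a candidate for moving to a persistent location before the next restart.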

jagadeesan · ‎12-28-2018

@Artyom Timofeev As for the containers stuck in the localizing phase, it seems you have hit this reported YARN bug, which is resolved in version 3.0.0: https://issues.apache.org/jira/browse/YARN-6078

jagadeesan · ‎12-27-2018

@Michael Bronson After configuration changes, it's safe to restart the required services; the restart applies the new settings to the system. In our case, yarn.nodemanager.local-dirs will point to the new location /grid/sdb/hadoop/yarn/local instead of the old location /var/hadoop/yarn/local. In short, a restart will not cause any issue, either after deleting the old files or after changing the YARN configuration. I hope this answers your concerns.

jagadeesan · ‎12-27-2018

@Michael Bronson Sure

jagadeesan · ‎12-27-2018

@Michael Bronson Yes, true. Similarly, it's not a good idea to use /var for yarn.nodemanager.local-dirs, which holds container-local data. Typically, you can point these to all the data mount points (like /grid/sdb/hadoop/yarn/local). The same goes for the YARN logs, yarn.nodemanager.log-dirs (/grid/sdb/hadoop/yarn/log). This helps keep that I/O off your OS disk (where you typically have /var). You can take a look at http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/ I hope that the above answers your questions. Please accept the answer you found most useful.
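To make the "all the data mount points" suggestion concrete, here is a small Python sketch that builds the comma-separated values for both properties. /grid/sdb is taken from this thread; /grid/sdc and the helper name yarn_dirs are hypothetical and only illustrate how a second data disk would be added.

```python
def yarn_dirs(mounts, suffix):
    """Join one directory per data mount into a comma-separated property value."""
    return ",".join(f"{m.rstrip('/')}/hadoop/yarn/{suffix}" for m in mounts)


# /grid/sdb comes from the thread; /grid/sdc is a hypothetical second data disk.
mounts = ["/grid/sdb", "/grid/sdc"]

settings = {
    "yarn.nodemanager.local-dirs": yarn_dirs(mounts, "local"),
    "yarn.nodemanager.log-dirs": yarn_dirs(mounts, "log"),
}

for name, value in settings.items():
    print(f"{name}={value}")
```

Listing one directory per data disk spreads the localization and log I/O across the disks instead of concentrating it on the OS disk.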

jagadeesan · ‎12-27-2018

@Michael Bronson Yes, you can remove the old files. The article below will help you with the proper steps: https://community.hortonworks.com/articles/92339/how-to-clear-local-file-cache-and-user-cache-for-y.html Please accept the answer you found most useful.

jagadeesan · ‎12-27-2018

Problem Description: Ambari-infra-solr is running fine, but a "ps" command shows the passwords in cleartext, as below. According to security policy, this is considered a security breach. The issue occurs because the values of the properties infra_solr_trust_store_password and infra_solr_key_store_password are passed as cleartext passwords in the Java options.

$ ps -ef | grep -i 'ambari-infra'
1008 25938 1 21 07:25 ? 00:00:11 /usr/jdk64/jdk1.8.0_112/bin/java -server -Xms1024m -Xmx2048m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/ambari-infra-solr/solr_gc.log -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18886 -Dcom.sun.management.jmxremote.rmi.port=18886 -Djava.rmi.server.hostname=test2.example.com -DzkClientTimeout=60000 -DzkHost=test1.example.com:2181,test2.example.com:2181,test3.example.com:2181/infra-solr -Djetty.port=8886 -DSTOP.PORT=7886 -DSTOP.KEY=solrrocks -Dhost=test2.example.com -Duser.timezone=UTC -Djetty.home=/usr/lib/ambari-infra-solr/server -Dsolr.solr.home=/opt/ambari_infra_solr/data -Dsolr.install.dir=/usr/lib/ambari-infra-solr -Dlog4j.configuration=file:/etc/ambari-infra-solr/conf/log4j.properties -Dsolr.jetty.keystore=/etc/security/serverKeys/infra.solr.keyStore.jks -Dsolr.jetty.keystore.password=bigdata -Dsolr.jetty.truststore=/etc/security/serverKeys/infra.solr.trustStore.jks -Dsolr.jetty.truststore.password=bigdata -Dsolr.jetty.ssl.needClientAuth=false -Dsolr.jetty.ssl.wantClientAuth=false -Djavax.net.ssl.keyStore=/etc/security/serverKeys/infra.solr.keyStore.jks -Djavax.net.ssl.keyStorePassword=bigdata -Djavax.net.ssl.trustStore=/etc/security/serverKeys/infra.solr.trustStore.jks -Djavax.net.ssl.trustStorePassword=bigdata -Dsolr.jetty.https.port=8886 -Dsolr.authentication.httpclient.configurer=org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer -DauthenticationPlugin=org.apache.solr.security.KerberosPlugin -Djava.security.auth.login.config=/etc/ambari-infra-solr/conf/infra_solr_jaas.conf -Dsolr.kerberos.principal=HTTP/test2.example.com@EXAMPLE.COM -Dsolr.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab -Dsolr.kerberos.cookie.domain=test2.example.com -Dsolr.kerberos.name.rules=DEFAULT -XX:OnOutOfMemoryError=/usr/lib/ambari-infra-solr/bin/oom_solr.sh 8886 /var/log/ambari-infra-solr -jar start.jar --module=https

Article: This article helps you set an obfuscated or hashed password instead of showing cleartext passwords in the Java options. Using the jetty jar file bundled with Ambari, we can convert the password to either OBF or MD5 format and pass that value in infra-solr-env to hide the password from the ambari-infra-solr process.

Step-1: Generate the encoded password using the jetty jar file, where <password> is the password you used for the keystore/truststore.

java -cp /usr/lib/ambari-infra-solr/server/lib/jetty-util-9.2.13.v20150730.jar org.eclipse.jetty.util.security.Password <password>

For example:

java -cp /usr/lib/ambari-infra-solr/server/lib/jetty-util-9.2.13.v20150730.jar org.eclipse.jetty.util.security.Password bigdata
2018-12-27 07:51:13.605:INFO::main: Logging initialized @171ms
bigdata
OBF:1rpc1wtw1sp11sov1sop1wui1rpa
MD5:27819cfe72583a34d13a40bb74154c91

Step-2: In Ambari, update the properties below under the Ambari Infra Config tab, in the Advanced infra-solr-env section (you can use either the OBF or the MD5 value there).

Before:
SOLR_SSL_KEY_STORE_PASSWORD={{infra_solr_keystore_hashed_password}}
SOLR_SSL_TRUST_STORE_PASSWORD={{infra_solr_truststore_hashed_password}}

Now:
SOLR_SSL_KEY_STORE_PASSWORD=OBF:1rpc1wtw1sp11sov1sop1wui1rpa
SOLR_SSL_TRUST_STORE_PASSWORD=OBF:1rpc1wtw1sp11sov1sop1wui1rpa

Step-3: Restart the required services through Ambari, then verify by grepping for the ambari-infra-solr process.

$ ps -ef | grep -i 'ambari-infra'
1008 17641 17 08:03 ? 00:00:10 /usr/jdk64/jdk1.8.0_112/bin/java -server -Xms1024m -Xmx2048m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/ambari-infra-solr/solr_gc.log -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18886 -Dcom.sun.management.jmxremote.rmi.port=18886 -Djava.rmi.server.hostname=test2.example.com -DzkClientTimeout=60000 -DzkHost=test1.example.com:2181,test2.example.com:2181,test3.example.com:2181/infra-solr -Djetty.port=8886 -DSTOP.PORT=7886 -DSTOP.KEY=solrrocks -Dhost=test2.example.com -Duser.timezone=UTC -Djetty.home=/usr/lib/ambari-infra-solr/server -Dsolr.solr.home=/opt/ambari_infra_solr/data -Dsolr.install.dir=/usr/lib/ambari-infra-solr -Dlog4j.configuration=file:/etc/ambari-infra-solr/conf/log4j.properties -Dsolr.jetty.keystore=/etc/security/serverKeys/infra.solr.keyStore.jks -Dsolr.jetty.keystore.password=OBF:1rpc1wtw1sp11sov1sop1wui1rpa -Dsolr.jetty.truststore=/etc/security/serverKeys/infra.solr.trustStore.jks -Dsolr.jetty.truststore.password=OBF:1rpc1wtw1sp11sov1sop1wui1rpa -Dsolr.jetty.ssl.needClientAuth=false -Dsolr.jetty.ssl.wantClientAuth=false -Djavax.net.ssl.keyStore=/etc/security/serverKeys/infra.solr.keyStore.jks -Djavax.net.ssl.keyStorePassword=OBF:1rpc1wtw1sp11sov1sop1wui1rpa -Djavax.net.ssl.trustStore=/etc/security/serverKeys/infra.solr.trustStore.jks -Djavax.net.ssl.trustStorePassword=OBF:1rpc1wtw1sp11sov1sop1wui1rpa -Dsolr.jetty.https.port=8886 -Dsolr.authentication.httpclient.configurer=org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer -DauthenticationPlugin=org.apache.solr.security.KerberosPlugin -Djava.security.auth.login.config=/etc/ambari-infra-solr/conf/infra_solr_jaas.conf -Dsolr.kerberos.principal=HTTP/test2.example.com@EXAMPLE.COM -Dsolr.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab -Dsolr.kerberos.cookie.domain=test2.example.com -Dsolr.kerberos.name.rules=DEFAULT -XX:OnOutOfMemoryError=/usr/lib/ambari-infra-solr/bin/oom_solr.sh 8886 /var/log/ambari-infra-solr -jar start.jar --module=https

Ambari will automatically decode the password with the inbuilt jetty jar. For more details on jetty password handling, you can refer to the following link: https://wiki.eclipse.org/Jetty/Howto/Secure_Passwords
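It is worth understanding what the OBF value actually protects against. The Python sketch below is an unofficial re-implementation of the OBF encoding, checked only against the sample output in this article ("bigdata" and its OBF string); the function names are my own, it handles ASCII passwords only, and Jetty's own source remains the authority. It suggests that OBF is a reversible encoding rather than encryption: it hides the password from casual viewing in ps output, but anyone who has the OBF string can recover the original.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # base-36 alphabet


def to_base36(n):
    """Render a non-negative integer in lowercase base 36."""
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out or "0"


def obfuscate(password):
    """Encode each byte together with its mirror byte (ASCII-only simplification)."""
    b = password.encode("ascii")
    groups = []
    for i, b1 in enumerate(b):
        b2 = b[len(b) - 1 - i]  # byte at the mirrored position
        value = (127 + b1 + b2) * 256 + (127 + b1 - b2)
        groups.append(to_base36(value).rjust(4, "0"))
    return "OBF:" + "".join(groups)


def deobfuscate(obf):
    """Recover the original ASCII password from an OBF string."""
    body = obf[len("OBF:"):]
    out = bytearray()
    for i in range(0, len(body), 4):
        hi, lo = divmod(int(body[i:i + 4], 36), 256)
        out.append((hi + lo - 254) // 2)  # hi + lo - 254 equals 2 * original byte
    return out.decode("ascii")


print(obfuscate("bigdata"))
print(deobfuscate("OBF:1rpc1wtw1sp11sov1sop1wui1rpa"))
```

If this reading is right, file permissions on the keystore and on the Ambari configuration still matter after the change, because the OBF string alone is enough to get the password back.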

jagadeesan · ‎12-27-2018

@Dukool SHarma By default, MapReduce sorts the intermediate data (between the mapper and reducer phases) by key. If you want the data sorted by value as well, you need secondary sorting. For more information, you can refer to the links below: https://www.oreilly.com/library/view/data-algorithms/9781491906170/ch01.html https://www.quora.com/What-is-secondary-sort-in-Hadoop-and-how-does-it-work/answer/Sudarshan-Sreenivasan-1 Please accept the answer you found most useful.
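The idea behind secondary sorting can be sketched outside Hadoop in a few lines of Python. This is only a conceptual illustration with made-up records: in a real MapReduce job the same effect is usually achieved with a composite key, a custom partitioner, and a grouping comparator, none of which are shown here.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical mapper output: (natural key, value) pairs in arbitrary order.
records = [("2018", 31), ("2017", 12), ("2018", 5), ("2017", 40), ("2018", 17)]

# Step 1: sort on the composite (key, value) so values are ordered within each key.
shuffled = sorted(records, key=lambda kv: (kv[0], kv[1]))

# Step 2: group on the natural key only, as a grouping comparator would.
grouped = {
    key: [value for _, value in group]
    for key, group in groupby(shuffled, key=itemgetter(0))
}

print(grouped)  # {'2017': [12, 40], '2018': [5, 17, 31]}
```

Each "reducer" (here, each dictionary entry) receives its values already sorted, so it does not have to buffer and sort them itself.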

jagadeesan · ‎12-26-2018

For your question, the answer is no. As of now, we can't manage memory at the individual node level using a percentage. Currently YARN supports a percentage only for CPU, using the configuration below.

Property: yarn.nodemanager.resource.percentage-physical-cpu-limit
Default: 100
Description: Percentage of CPU that can be allocated for containers. This setting allows users to limit the amount of CPU that YARN containers use. Currently functional only on Linux using cgroups. The default is to use 100% of CPU.
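Since memory has no percentage setting, one possible workaround is to compute the absolute value yourself and put it in yarn.nodemanager.resource.memory-mb, which takes megabytes. The Python sketch below shows the arithmetic; the helper name and the 64 GiB / 80% figures are hypothetical, and how much headroom to leave for the OS and other services is a judgment call for your cluster.

```python
def memory_mb_for_percentage(total_ram_mb, percent):
    """Absolute value for yarn.nodemanager.resource.memory-mb from a target percentage."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return total_ram_mb * percent // 100


# Hypothetical node with 64 GiB of RAM, giving 80% of it to YARN containers.
total_ram_mb = 64 * 1024
print(memory_mb_for_percentage(total_ram_mb, 80))  # 52428
```

The value has to be recomputed per node type if your nodes have different amounts of RAM.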

Member Since	‎11-12-2018 10:00 AM
Last Visited	‎10-03-2025 07:58 AM
Posts	218
Kudos received	179
