Member since: 04-03-2019
Posts: 962
Kudos Received: 1743
Solutions: 146
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14988 | 03-08-2019 06:33 PM |
| | 6169 | 02-15-2019 08:47 PM |
| | 5098 | 09-26-2018 06:02 PM |
| | 12584 | 09-07-2018 10:33 PM |
| | 7443 | 04-25-2018 01:55 AM |
12-20-2016
03:58 PM
2 Kudos
@Vishal Prakash Shah
I'm not 100% sure, but I think this is expected, as the RM does not keep historical information about all applications. The main purpose of the YARN Application Timeline Server is to maintain historical information about all YARN jobs (yarn.timeline-service.ttl-ms is the retention parameter), which is why you see many more results with the Timeline API. The default value of yarn.timeline-service.ttl-ms is 2678400000 ms, i.e. 31 days. You can read more about the Timeline Server here: https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/TimelineServer.html

Hope this information helps!
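For a quick side-by-side check, you can query both REST endpoints; the RM only returns the applications it still tracks, while the Timeline/Application History service returns everything within the retention window. This is just a rough sketch with placeholder hostnames and default ports (8088 for the RM, 8188 for the Timeline Server); on a Kerberized cluster you would add --negotiate -u : to the curl calls.

# Applications the ResourceManager still knows about
curl -s "http://<rm-host>:8088/ws/v1/cluster/apps"

# Historical applications served via the Timeline / Application History REST API
curl -s "http://<timeline-host>:8188/ws/v1/applicationhistory/apps"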
12-20-2016
02:18 PM
3 Kudos
SYMPTOM

Running a Java action via an Oozie workflow fails with the error below:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, Could not find Yarn tags property (mapreduce.job.tags)
java.lang.RuntimeException: Could not find Yarn tags property (mapreduce.job.tags)
at org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.getChildYarnJobs(LauncherMainHadoopUtils.java:52)
at org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.killChildYarnJobs(LauncherMainHadoopUtils.java:87)
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:44)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:38)
at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:36)

ROOT CAUSE

A missing, or conflicting, YARN-related jar file in the Oozie sharelib.

RESOLUTION

Complete the following steps as the oozie user on the Oozie node:

1. Recreate the Oozie sharelib using the command below:

/usr/hdp/<hdp-version>/oozie/bin/oozie-setup.sh sharelib create -locallib /usr/hdp/<hdp-version>/oozie/oozie-sharelib.tar.gz -fs hdfs://<namenode-host>:8020

2. Update the Oozie sharelib using the command below:

oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate

3. Restart the Oozie service using Ambari and resubmit the workflow.

Note – If you have placed any custom jars in the Oozie sharelib, please make sure to copy them back after recreating it.
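As an optional sanity check (a rough sketch, using the same placeholder Oozie host as above), you can confirm that the server picked up the refreshed sharelib before resubmitting the workflow:

# List the sharelib libraries the Oozie server is currently using
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist

# Inspect the jars of one particular library, e.g. the oozie lib itself
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist oozie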
12-20-2016
02:02 PM
2 Kudos
SYMPTOM

Beeline fails with the error below:

$ beeline --verbose
Beeline version 0.14.0.2.2.6.0-2800 by Apache Hive
beeline> !connect jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM
scan complete in 8ms
Connecting to jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM
Enter username for jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM: kuldeepk
Enter password for jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hive/lib/hive-jdbc-0.14.0.2.2.6.0-2800-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/02/02 00:35:55 [main]: ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: No common protection layer between client and server
at com.sun.security.sasl.gsskerb.GssKrb5Client.doFinalHandshake(GssKrb5Client.java:252)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:187)
at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:507)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:264)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:190)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:138)
at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
at org.apache.hive.beeline.Commands.connect(Commands.java:1078)
at org.apache.hive.beeline.Commands.connect(Commands.java:999)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:45)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:936)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:801)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:762)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:476)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:459)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAc
... (stack trace truncated)

ROOT CAUSE

SSL had been enabled for HiveServer2 on this cluster and was later disabled, but the property below was not reverted:

hive.server2.thrift.sasl.qop=auth-conf

WORKAROUND

N/A

RESOLUTION

Revert the value of this property as below via Ambari and restart the required services:

hive.server2.thrift.sasl.qop=auth
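As a side note (not the fix applied here, just an alternative sketch): if you actually want to keep auth-conf on the server, the Hive JDBC driver can request the same QOP from the client side via the saslQop URL parameter, provided your Hive JDBC version supports it. Hostnames and the principal below are the ones from the example above.

# After reverting hive.server2.thrift.sasl.qop=auth on the server, the original connect string works:
beeline -u "jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM"

# Alternative: keep auth-conf on the server and match it on the client
beeline -u "jdbc:hive2://prodnode1.crazyadmins.com:10000/default;principal=hive/prodnode1.crazyadmins.com@CRAZYADMINS.COM;saslQop=auth-conf"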
12-19-2016
09:14 PM
1 Kudo
@dvillarreal I would say divide the nodes into 2 racks and update the rack for each DataNode from Ambari (Configure Rack Topology). I don't have much insight into NodeGroup topology; are you referring to node labels? Please correct me if I'm wrong.
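Once the racks are set in Ambari, a quick way to confirm the NameNode picked them up (just a sketch, run from any HDFS client):

# Print the rack assigned to each DataNode as seen by the NameNode
hdfs dfsadmin -printTopology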
12-18-2016
12:25 PM
4 Kudos
In this post, we will see how to configure node labels on YARN. Before we get to the configuration, let's understand what a node label in YARN is. Node labels allow us to divide the cluster into parts and use those parts individually as per our requirements. More specifically, we can create a group of node managers using node labels, for example a group of node managers with a large amount of RAM, and use them to process only critical production jobs. This is cool, isn't it? So let's see how to configure node labels on YARN.

Types of node labels:

Exclusive – Only the queues associated/mapped with the node label can access its resources.

Non-exclusive (sharable) – If the node label's resources are not in use, they can be shared with other applications running in the cluster.

Configuring node labels:

Step 1: Create the required directory structure on HDFS.

Note – You can run the commands below from any HDFS client.

sudo su hdfs
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn

Step 2: Make sure a user directory exists for the 'yarn' user on HDFS; if not, create it using the commands below.

Note – You can run the commands below from any HDFS client.

sudo su hdfs
hadoop fs -mkdir -p /user/yarn
hadoop fs -chown -R yarn:yarn /user/yarn
hadoop fs -chmod -R 700 /user/yarn

Step 3: Configure the properties below in yarn-site.xml via the Ambari UI. If you don't have Ambari, add them manually to /etc/hadoop/conf/yarn-site.xml and restart the required services.

yarn.node-labels.enabled=true
yarn.node-labels.fs-store.root-dir=hdfs://<namenode-host>:<namenode-rpc-port>/<complete-path_to_node_label_directory>

Note – Please restart the required services after the above configuration changes!

Step 4: Create node labels using the command below:

sudo -u yarn yarn rmadmin -addToClusterNodeLabels "<node-label1>(exclusive=<true|false>),<node-label2>(exclusive=<true|false>)"

For example, to add two node labels, x and y:

sudo -u yarn yarn rmadmin -addToClusterNodeLabels "x(exclusive=true),y(exclusive=false)"

You can verify that the node labels have been created by looking at the Resource Manager UI under the 'Node Labels' option in the left pane, or by running the command below on any YARN client:

yarn cluster --list-node-labels

Sample output:

[yarn@prodnode1 ~]$ yarn cluster --list-node-labels
16/12/14 15:45:56 INFO impl.TimelineClientImpl: Timeline service address: http://prodnode3.openstacklocal:8188/ws/v1/timeline/
16/12/14 15:45:56 INFO client.RMProxy: Connecting to ResourceManager at prodnode3.openstacklocal/172.26.74.211:8050
Node Labels: <x:exclusivity=true>,<y:exclusivity=false>

Step 5: Allocate node labels to the node managers using the command below:

sudo -u yarn yarn rmadmin -replaceLabelsOnNode "<node-manager1>:<port>=<node-label1> <node-manager2>:<port>=<node-label2>"

Example:

sudo -u yarn yarn rmadmin -replaceLabelsOnNode "prodnode1.openstacklocal=x prodnode2.openstacklocal=y"

Note – Don't worry about the port if you have only one node manager running per host.

Step 6: Map node labels to the queues. I have created two queues, 'a' and 'b', such that queue 'a' can access nodes with labels 'x' and 'y', whereas queue 'b' can only access nodes with label 'y'. By default, all queues can access nodes with the 'default' label. Below is my capacity scheduler configuration:

yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.queue-mappings-override.enable=false
yarn.scheduler.capacity.root.a.a1.accessible-node-labels=x,y
yarn.scheduler.capacity.root.a.a1.accessible-node-labels.x.capacity=30
yarn.scheduler.capacity.root.a.a1.accessible-node-labels.x.maximum-capacity=100
yarn.scheduler.capacity.root.a.a1.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.a.a1.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.a.a1.acl_administer_queue=*
yarn.scheduler.capacity.root.a.a1.acl_submit_applications=*
yarn.scheduler.capacity.root.a.a1.capacity=40
yarn.scheduler.capacity.root.a.a1.maximum-capacity=100
yarn.scheduler.capacity.root.a.a1.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.a.a1.ordering-policy=fifo
yarn.scheduler.capacity.root.a.a1.state=RUNNING
yarn.scheduler.capacity.root.a.a1.user-limit-factor=1
yarn.scheduler.capacity.root.a.a2.accessible-node-labels=x,y
yarn.scheduler.capacity.root.a.a2.accessible-node-labels.x.capacity=70
yarn.scheduler.capacity.root.a.a2.accessible-node-labels.x.maximum-capacity=100
yarn.scheduler.capacity.root.a.a2.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.a.a2.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.a.a2.acl_administer_queue=*
yarn.scheduler.capacity.root.a.a2.acl_submit_applications=*
yarn.scheduler.capacity.root.a.a2.capacity=60
yarn.scheduler.capacity.root.a.a2.maximum-capacity=60
yarn.scheduler.capacity.root.a.a2.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.a.a2.ordering-policy=fifo
yarn.scheduler.capacity.root.a.a2.state=RUNNING
yarn.scheduler.capacity.root.a.a2.user-limit-factor=1
yarn.scheduler.capacity.root.a.accessible-node-labels=x,y
yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels.x.maximum-capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.a.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.a.acl_administer_queue=*
yarn.scheduler.capacity.root.a.acl_submit_applications=*
yarn.scheduler.capacity.root.a.capacity=40
yarn.scheduler.capacity.root.a.maximum-capacity=40
yarn.scheduler.capacity.root.a.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.a.ordering-policy=fifo
yarn.scheduler.capacity.root.a.queues=a1,a2
yarn.scheduler.capacity.root.a.state=RUNNING
yarn.scheduler.capacity.root.a.user-limit-factor=1
yarn.scheduler.capacity.root.accessible-node-labels=x,y
yarn.scheduler.capacity.root.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.x.maximum-capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.y.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.b.accessible-node-labels=y
yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.b.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.b.acl_administer_queue=*
yarn.scheduler.capacity.root.b.acl_submit_applications=*
yarn.scheduler.capacity.root.b.b1.accessible-node-labels=y
yarn.scheduler.capacity.root.b.b1.accessible-node-labels.y.capacity=100
yarn.scheduler.capacity.root.b.b1.accessible-node-labels.y.maximum-capacity=100
yarn.scheduler.capacity.root.b.b1.acl_administer_queue=*
yarn.scheduler.capacity.root.b.b1.acl_submit_applications=*
yarn.scheduler.capacity.root.b.b1.capacity=100
yarn.scheduler.capacity.root.b.b1.maximum-capacity=100
yarn.scheduler.capacity.root.b.b1.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.b.b1.ordering-policy=fifo
yarn.scheduler.capacity.root.b.b1.state=RUNNING
yarn.scheduler.capacity.root.b.b1.user-limit-factor=1
yarn.scheduler.capacity.root.b.capacity=60
yarn.scheduler.capacity.root.b.maximum-capacity=100
yarn.scheduler.capacity.root.b.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.b.ordering-policy=fifo
yarn.scheduler.capacity.root.b.queues=b1
yarn.scheduler.capacity.root.b.state=RUNNING
yarn.scheduler.capacity.root.b.user-limit-factor=1
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.queues=a,b

Please visit http://crazyadmins.com/configure-node-labels-on-yarn/ for more details and FAQs.

Please comment if you need any further help on this. Happy Hadooping!! 🙂
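As a quick end-to-end check, you can verify the labels on a node manager and then submit a test job to one of the labeled queues. This is only a rough sketch: the example jar path is the usual HDP location and the mapreduce.job.node-label-expression property is only available on newer Hadoop releases, so treat both as assumptions for your environment.

# Show the labels assigned to a particular node manager (node ID is <host>:<port> from 'yarn node -list')
yarn node -status <node-manager-host>:<port>

# Submit a test job to queue a1; the node-label expression pins its containers to label x if your version supports it
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi \
  -Dmapreduce.job.queuename=a1 \
  -Dmapreduce.job.node-label-expression=x \
  10 100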
12-18-2016
12:13 PM
5 Kudos
@Varun R Can you please try removing the /var/kerberos/krb5kdc/principal* files and try again?

rm -rf /var/kerberos/krb5kdc/principal*
service krb5kdc restart
service kadmin restart

If there is any issue with the restart, make sure to kill the processes and start them again, e.g.:

kill -9 <pid-of-krb5kdc>
service krb5kdc start

Hope this helps!
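Once both services are back up, a quick sanity check (a rough sketch, assuming a stock MIT KDC layout, run as root on the KDC host) makes sure the KDC is actually serving principals again:

# Confirm the daemons are running
service krb5kdc status
service kadmin status

# Verify principals can be listed from the KDC database
kadmin.local -q "listprincs"

# Spot-check ticket acquisition with any existing principal
kinit <some-existing-principal>
klist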
12-14-2016
06:02 AM
@indrajeet gour Can you please post a separate question along with a detailed stack trace?
12-12-2016
10:29 AM
2 Kudos
@Jose Molero
Do you have ResourceManager HA configured? Looking at this error, it seems rm1 is in standby and rm2 works fine. Can you please check?
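To check quickly (a rough sketch; rm1/rm2 are the IDs from yarn.resourcemanager.ha.rm-ids and the host is a placeholder), you can ask each ResourceManager for its HA state:

# Ask each RM for its current HA state (one should report 'active', the other 'standby')
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2

# The cluster info REST endpoint also reports the haState field
curl -s "http://<rm-host>:8088/ws/v1/cluster/info"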
12-11-2016
04:31 PM
2 Kudos
@Singh Pratap It looks like you are referring to the wrong port for MySQL; the default port is 3306. Please find the corrected command below:

sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/information_schema --username hive --password hive --table tables --target-dir sqoopdata

If you want to debug further, you can enable debug logging by exporting HADOOP_ROOT_LOGGER to DEBUG, e.g.:

export HADOOP_ROOT_LOGGER=DEBUG,console

Then run your Sqoop command. Sample output:

[root@sandbox ~]# sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/information_schema --username hive --password hive --table tables --target-dir /tmp/
Warning: /usr/hdp/2.4.0.0-169/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/12/11 16:29:02 DEBUG util.Shell: setsid exited with exit code 0
16/12/11 16:29:02 DEBUG sqoop.SqoopOptions: Generated nonce dir: /tmp/sqoop-root/compile/1d5abdbc51cca65d714f390cebf546da
16/12/11 16:29:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/12/11 16:29:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/12/11 16:29:02 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/12/11 16:29:02 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/12/11 16:29:02 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/12/11 16:29:02 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
16/12/11 16:29:02 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/12/11 16:29:02 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
16/12/11 16:29:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/12/11 16:29:02 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@3976d4a4
16/12/11 16:29:02 INFO tool.CodeGenTool: Beginning code generation
16/12/11 16:29:02 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM `tables` AS t LIMIT 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/12/11 16:29:02 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
16/12/11 16:29:03 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648
16/12/11 16:29:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tables` AS t LIMIT 1
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column TABLE_CATALOG of type [12, 512, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column TABLE_SCHEMA of type [12, 64, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column TABLE_NAME of type [12, 64, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column TABLE_TYPE of type [12, 64, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column ENGINE of type [12, 64, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column VERSION of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column ROW_FORMAT of type [12, 10, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column TABLE_ROWS of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column AVG_ROW_LENGTH of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column DATA_LENGTH of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column MAX_DATA_LENGTH of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column INDEX_LENGTH of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column DATA_FREE of type [-5, 21, 0]
16/12/11 16:29:03 DEBUG manager.SqlManager: Found column AUTO_INCREMENT of type [-5, 21, 0]
..
Output truncated! Hope this helps! 🙂
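If you want to double-check connectivity before the import (just a sketch using the same connection details as above), sqoop list-tables is a lightweight test; -P prompts for the password instead of exposing it on the command line, as the warning in the output above points out.

# Verify that Sqoop can reach MySQL on port 3306 and see the tables
sqoop list-tables --connect jdbc:mysql://sandbox.hortonworks.com:3306/information_schema --username hive -P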
12-11-2016
10:39 AM
@Dmitry Otblesk - Please turn off maintenance mode for HDFS to allow it to start with other services after reboot.
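If you prefer the command line over the Ambari UI, something along these lines usually works; this is only a sketch, and the Ambari host, credentials and cluster name are placeholders:

# Turn maintenance mode OFF for the HDFS service via the Ambari REST API
curl -u admin:<password> -H "X-Requested-By: ambari" -X PUT \
  -d '{"RequestInfo":{"context":"Turn OFF maintenance mode for HDFS"},"Body":{"ServiceInfo":{"maintenance_state":"OFF"}}}' \
  "http://<ambari-host>:8080/api/v1/clusters/<cluster-name>/services/HDFS"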