Member since 04-03-2019

962 Posts
1743 Kudos Received
146 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 14934 | 03-08-2019 06:33 PM |
|  | 6142 | 02-15-2019 08:47 PM |
|  | 5085 | 09-26-2018 06:02 PM |
|  | 12546 | 09-07-2018 10:33 PM |
|  | 7392 | 04-25-2018 01:55 AM |
09-26-2018 06:02 PM

@regis piccand / @kirk chou I was able to resolve a similar issue (java.io.IOException: Cannot run program "bash": error=2, No such file or directory) for our customer. This happens when "/bin" and "/sbin" are missing from $PATH in the container launch environment. The $PATH variable is derived from the NodeManager's environment, and the NodeManager gets its environment from the ambari-agent's /var/lib/ambari-agent/ambari-env.sh. To fix this, add "/bin" and "/sbin" to PATH in /var/lib/ambari-agent/ambari-env.sh, then restart ambari-agent followed by a NodeManager restart. Happy Hadooping!
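The failure mode is reproducible outside YARN: if the lookup PATH handed to a process does not include /bin (or another directory containing bash), launching "bash" by name fails with the same error=2. A minimal sketch:

```shell
# With a PATH that lacks /bin, "bash" cannot be resolved --
# the same error=2 (No such file or directory) the container launcher hits.
env -i PATH=/nonexistent bash -c 'echo started' 2>/dev/null \
  || echo "bash not found: /bin missing from PATH"

# With /bin (and /usr/bin) back on PATH, the launch succeeds.
env -i PATH=/bin:/usr/bin bash -c 'echo started'
```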
09-07-2018 10:33 PM

We got it working by adding a tag to the centos image with the below command:

docker tag centos local/centos

Here is the modified distributed shell command to run:

yarn jar $DJAR -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=local/centos -shell_command "sleep 120" -jar $DJAR -num_containers 1

Note - For a multi-node cluster, you will have to run the docker tag command on every NodeManager host as the root user. Please also make sure that you have added the "local" registry as a trusted registry in the YARN configurations.

Hope this helps! Special thanks to @rmaruthiyodan
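For many NodeManagers, the per-node tagging step can be scripted. A minimal sketch, assuming hypothetical hostnames nm1..nm3 and root ssh access; by default it only prints the commands (clear DRY_RUN to actually execute):

```shell
# Hypothetical NodeManager hostnames -- replace with your own.
# By default this just prints each command; set DRY_RUN= to execute over ssh.
DRY_RUN="${DRY_RUN:-echo}"
for host in nm1 nm2 nm3; do
  $DRY_RUN ssh "root@${host}" "docker tag centos local/centos"
done
```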
06-14-2018 01:08 AM

@Krishna Srinivas Glad to know that this helped! 🙂
06-12-2018 12:15 AM

A Spark job fails with the below error when the byte code for any particular method grows beyond 64 KB. spark.sql.codegen.wholeStage is enabled by default for internal optimization in Spark2, which can cause this kind of issue in some corner cases. Below is the detailed stack trace for your reference:

org.codehaus.janino.JaninoRuntimeException: Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator" grows beyond 64 KB
at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:949)
at org.codehaus.janino.CodeContext.write(CodeContext.java:857)
at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:11072)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10744)
at org.codehaus.janino.UnitCompiler.load(UnitCompiler.java:10729)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3824)
at org.codehaus.janino.UnitCompiler.access$9100(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3796)
at org.codehaus.janino.UnitCompiler$12.visitLocalVariableAccess(UnitCompiler.java:3762)
at org.codehaus.janino.Java$LocalVariableAccess.accept(Java.java:3675)
at org.codehaus.janino.Java$Lvalue.accept(Java.java:3563)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3820)
[....] Output truncated
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

How to fix this?

This can be fixed by setting spark.sql.codegen.wholeStage=false in the custom spark2-defaults configuration via Ambari and restarting the required services, OR by adding --conf spark.sql.codegen.wholeStage=false to the spark-shell or spark-submit command.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!!
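For reference, the per-job form of the fix as commands; the application script name here is only a placeholder:

```shell
# Per-job override -- no cluster-wide config change needed.
spark-submit --conf spark.sql.codegen.wholeStage=false your_app.py
spark-shell  --conf spark.sql.codegen.wholeStage=false
```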
06-11-2018 11:58 PM

Due to a conflict in Jackson jar versions, an Oozie job with a spark2 action (spark action with the spark2 sharelib) may fail with the below error:

2018-06-05 16:53:04,567 [Thread-20] INFO  org.apache.spark.SparkContext  - Created broadcast 0 from showString at NativeMethodAccessorImpl.java:0
  Traceback (most recent call last):
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/stg_gl_account_classification_master.py", line 9, in <module>
      gacm.show()
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 318, in show
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
  py4j.protocol.Py4JJavaError: An error occurred while calling o35.showString.
  : java.lang.ExceptionInInitializerError
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
  at org.apache.spark.sql.Dataset$anonfun$org$apache$spark$sql$Dataset$execute$1$1.apply(Dataset.scala:2386)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
  at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$execute$1(Dataset.scala:2385)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collect(Dataset.scala:2392)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2128)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
  at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
  at py4j.Gateway.invoke(Gateway.java:280)
  at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
  at py4j.commands.CallCommand.execute(CallCommand.java:79)
  at py4j.GatewayConnection.run(GatewayConnection.java:214)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: com.fasterxml.jackson.databind.JsonMappingException: Jackson version is too old 2.4.4
  at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:56)
  at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
  at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:549)
  at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
  at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
  ... 27 more

Why this error?

By default, the 'oozie' directory in the Oozie sharelib contains Jackson jars at version 2.4.4, while the spark2 sharelib ships newer versions of the Jackson jars.

To fix this error, please follow the below steps.

Step 1: Move the older Jackson jars from the default oozie sharelib to another directory:

hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson* /user/oozie/share/lib/lib_<ts>/oozie.old

Step 2: Update the Oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

Please check this article for more details about the oozie spark2 action.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!!
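After the sharelib update, the result can be checked with Oozie's shareliblist subcommand (hostname and <ts> are placeholders, as above); the grep should return nothing once the jars have been moved:

```shell
# List the active 'oozie' sharelib and confirm no old jackson jars remain in it.
oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -shareliblist oozie
hadoop fs -ls /user/oozie/share/lib/lib_<ts>/oozie | grep jackson
```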
06-08-2018 12:09 AM

Please follow the below steps to run a spark2 action via Oozie on HDP clusters: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch_oozie-spark-action.html

Your Oozie job may fail with the below error because of jar conflicts between the 'oozie' sharelib and the 'spark2' sharelib.

Error:

2018-06-04 13:27:32,652 WARN SparkActionExecutor:523 - SERVER[XXXX] USER[XXXX] GROUP[-] TOKEN[] APP[XXXX] JOB[0000000-<XXXXX>-oozie-oozi-W] ACTION[0000000-<XXXXXX>-oozie-oozi-W@spark2] Launcher exception: Attempt to add (hdfs://XXXX/user/oozie/share/lib/lib_XXXXX/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://XXXXX/user/oozie/share/lib/lib_20170727191559/oozie/aws-java-sdk-kms-1.10.6.jar) multiple times to the distributed cache. 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:632) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13$anonfun$apply$8.apply(Client.scala:623) 
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:623) 
at org.apache.spark.deploy.yarn.Client$anonfun$prepareLocalResources$13.apply(Client.scala:622) 
at scala.collection.immutable.List.foreach(List.scala:381) 
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:622) 
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:895) 
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171) 
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1231) 
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1290) 
at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:750) 
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:311) 
at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:232) 
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58) 
at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:62) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:237) 
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)

Please run the below commands to fix this error.

Note - You may want to take a backup before running the rm commands.

hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/aws*
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/azure* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-aws* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/hadoop-azure* 
hadoop fs -rm /user/oozie/share/lib/lib_<ts>/spark2/ok*
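The backup mentioned in the note can be taken with a plain HDFS copy before running the rm commands above; the backup path here is only an example:

```shell
# Keep a copy of the spark2 sharelib directory before removing jars from it.
hadoop fs -cp /user/oozie/share/lib/lib_<ts>/spark2 /tmp/spark2_sharelib_backup
```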
hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson* /user/oozie/share/lib/lib_<ts>/oozie.old

Please run the below command to update the Oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!!
05-16-2018 05:20 PM

@Bhushan Kandalkar Yes, you will have to import the HiveServer certs into Hue's truststore. Personally, I have never tried this; however, this link can give you more background.
04-25-2018 04:57 PM

							 @Jeff Rosenberg - Done. Please accept my answer when you get a chance! Glad to know that your issue is resolved 🙂 
04-25-2018 01:55 AM

							 @Jeff Rosenberg - Did you try passing tez-site.xml to pig action? 
04-25-2018 01:43 AM

@Manikandan Jeyabal Please check the below links:

https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html

https://github.com/grisha/ruby-yarn