Member since 01-23-2017

114 Posts | 19 Kudos Received | 4 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2810 | 03-26-2018 04:53 AM |
| | 31309 | 12-01-2017 07:15 AM |
| | 1259 | 11-28-2016 11:30 AM |
| | 2187 | 10-25-2016 11:26 AM |
			
    
	
		
		
04-10-2018 07:09 AM
@ssathish Isn't that only for currently running jobs? Are we able to see the containers and their details for completed jobs as well? Here is a running job that shows Total Allocated Containers: running-containers.png. Here is a completed job that shows Total Allocated Containers: finished-job.png. But none of these Total Allocated Containers get carried over to the RM REST API; the XMLs below show only the allocated containers. Running job XML: running.xml. Finished job XML: finished-job.xml. And the NodeManager REST API, curl http://<Nodemanager address>:<port>/ws/v1/node/containers/<containerID>, gives container details only for running containers, not for completed ones. Is there a way to get what we see on the YARN Application UI (https://manag003:8090/cluster/appattempt/appattempt_1522212350151_40488_000001) for Total Allocated Containers through a REST API? Thanks, Venkat
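For completed applications, per-container details usually survive in the YARN Application Timeline / Application History Server rather than the RM. Below is a minimal Python sketch, assuming the AHS REST path /ws/v1/applicationhistory/.../containers, its default port 8188, and the containerId/allocatedMB/allocatedVCores field names; the host and the sample payload are hypothetical.

```python
def attempt_containers_url(ahs_host, app_id, attempt_id, port=8188):
    # Application History Server endpoint for one attempt's containers
    # (path and default port assumed from the Hadoop AHS REST docs).
    return (f"http://{ahs_host}:{port}/ws/v1/applicationhistory"
            f"/apps/{app_id}/appattempts/{attempt_id}/containers")

def summarize_containers(payload):
    # Flatten a parsed containers response into (id, MB, vcores) tuples.
    return [(c["containerId"], c.get("allocatedMB"), c.get("allocatedVCores"))
            for c in payload.get("container", [])]

# Stubbed response for illustration; a real call would fetch the URL as JSON.
sample = {"container": [{"containerId": "container_1522212350151_40488_01_000001",
                         "allocatedMB": 2048, "allocatedVCores": 1}]}
print(summarize_containers(sample))
```

If the Timeline Server is not running in the cluster, this history endpoint will not be available, so the container list would only cover what the NodeManager still holds.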
						
					
04-09-2018 04:39 PM
		
	
				
		
			
					
				
		
	
		
					
I'm following the YARN REST API, which shows:

allocatedMB (int): The sum of memory in MB allocated to the application's running containers
allocatedVCores (int): The sum of virtual cores allocated to the application's running containers

But these are aggregated metrics. I'm looking for the total number of containers and, for each container, how much memory and how many vcores are allocated. Is there a way this can be achieved? Thanks, Venkat
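Until a per-container endpoint is available, one workaround is to aggregate in the other direction: collect a list of container records yourself and sum memory and vcores per container, cross-checking the totals against allocatedMB/allocatedVCores. A small sketch, with the field names and the sample records assumed for illustration:

```python
def container_totals(containers):
    # Sum per-container allocations; each dict is one container record.
    total_mb = sum(c.get("allocatedMB", 0) for c in containers)
    total_vcores = sum(c.get("allocatedVCores", 0) for c in containers)
    return {"containers": len(containers), "totalMB": total_mb,
            "totalVCores": total_vcores}

# Hypothetical records for two containers of one application:
records = [{"allocatedMB": 4096, "allocatedVCores": 2},
           {"allocatedMB": 2048, "allocatedVCores": 1}]
print(container_totals(records))  # {'containers': 2, 'totalMB': 6144, 'totalVCores': 3}
```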
						
					
Labels: Apache YARN
    
	
		
		
04-04-2018 08:27 AM
@Saumil Mayani Thanks a lot for the details. That makes it much clearer.
						
					
04-03-2018 05:42 AM
							 
We have a cluster with the below CPU configuration:

    # lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                56
    On-line CPU(s) list:   0-55
    Thread(s) per core:    2
    Core(s) per socket:    14
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 79
    Stepping:              1
    CPU MHz:               2400.000
    BogoMIPS:              4794.00
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              35840K
    NUMA node0 CPU(s):     0-13,28-41
    NUMA node1 CPU(s):     14-27,42-55

We have 2 physical sockets with 14 cores each and hyper-threading, so 2 (sockets) * 14 (cores each) * 2 (hyper-threading) = 56. But the YARN configs from Ambari show 112 cores for the property yarn.nodemanager.resource.cpu-vcores (56 * 2). This is done in the Ambari stack advisor code. The question is: are we multiplying by 2 by default on the assumption that hyper-threading is not enabled, or are we assuming that a CPU is capable of holding multiple containers, leaving admins room to tune the environment for CPU- or I/O-bound workloads? Thanks, Venkat
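The arithmetic in the question can be written out directly; the extra factor of 2 on top of the 56 logical CPUs is the stack-advisor multiplier being asked about:

```python
sockets = 2            # Socket(s) from lscpu
cores_per_socket = 14  # Core(s) per socket
threads_per_core = 2   # Thread(s) per core (hyper-threading)

logical_cpus = sockets * cores_per_socket * threads_per_core
print(logical_cpus)  # 56, matching CPU(s) in lscpu

# Value Ambari's stack advisor proposes for
# yarn.nodemanager.resource.cpu-vcores (56 * 2):
advised_vcores = logical_cpus * 2
print(advised_vcores)  # 112
```

If hyper-threading is already counted in lscpu's CPU(s) figure, the extra doubling oversubscribes physical threads 2:1, which is only safe if containers are not all CPU-bound at once.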
						
					
Labels: Apache Hadoop, Apache YARN
    
	
		
		
03-26-2018 04:53 AM
1 Kudo
					
@toide Ambari 2.6.1.3 is no longer a valid version, per the communication sent out by Hortonworks; the reported bugs were fixed in 2.6.1.5 to avoid any potential issues. Thanks, Venkat
						
					
03-25-2018 07:29 AM
@sajid mohammed This issue is not related to ZEPPELIN-1263, as that one concerns user impersonation with the Zeppelin Spark interpreter. You can find more details under https://issues.apache.org/jira/browse/ZEPPELIN-3016 and the corresponding community article. Please note that Zeppelin gives the error:

ERROR [2017-10-20 12:28:46,619] ({pool-2-thread-5} RemoteScheduler.java[getStatus]:256) - Can't get status information org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)

even for the following scenarios:
1) The log directory does not have the right permissions
2) The user lacks folder/file-level permissions
3) Jar files are missing from the path
4) ENV variables are missing

These are some of the scenarios in which I have seen this error with the Zeppelin Spark interpreter. Thanks, Venkat
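The four scenarios listed above can be checked up front before restarting the interpreter. A minimal preflight sketch, assuming hypothetical paths, jar locations, and environment variable names:

```python
import os

def interpreter_preflight(log_dir, jars, required_env):
    # Collect problems matching the scenarios listed above.
    problems = []
    if not (os.path.isdir(log_dir) and os.access(log_dir, os.W_OK)):
        problems.append(f"log dir missing or not writable: {log_dir}")
    problems += [f"missing jar: {j}" for j in jars if not os.path.isfile(j)]
    problems += [f"missing env var: {v}" for v in required_env
                 if v not in os.environ]
    return problems

# Hypothetical example values for a Zeppelin Spark interpreter host:
print(interpreter_preflight("/var/log/zeppelin",
                            ["/usr/hdp/current/spark-client/lib/spark-assembly.jar"],
                            ["SPARK_HOME", "JAVA_HOME"]))
```

An empty list means none of the four listed scenarios applies, so the Connection refused error would have some other cause.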
						
					
01-24-2018 05:50 AM
Environment: We are using EMR, with Spark 2.1 and EMRFS.

Process: We are running a PySpark job to join 2 Hive tables and create another Hive table from the result using saveAsTable, storing it as ORC with partitions.

Issue:

    18/01/23 10:21:28 INFO OutputCommitCoordinator: Task was denied committing, stage: 84, partition: 901, attempt: 10364
    18/01/23 10:21:28 INFO TaskSetManager: Starting task 901.10365 in stage 84.0 (TID 212686, ip-172-31-46-97.ec2.internal, executor 10, partition 901, PROCESS_LOCAL, 6235 bytes)
    18/01/23 10:21:28 WARN TaskSetManager: Lost task 884.10406 in stage 84.0 (TID 212677, ip-172-31-46-97.ec2.internal, executor 85): TaskCommitDenied (Driver denied task commit) for job: 84, partition: 884, attemptNumber: 10406

This specific log message recurs throughout the Spark logs; by the time we killed the job we had seen it about ~170000 (160595) times, as shown in spark-task-commit-denied.jpg.

The Spark source shows this:

    /**
     * :: DeveloperApi ::
     * Task requested the driver to commit, but was denied.
     */
    @DeveloperApi
    case class TaskCommitDenied(
        jobID: Int,
        partitionID: Int,
        attemptNumber: Int) extends TaskFailedReason {
      override def toErrorString: String = s"TaskCommitDenied (Driver denied task commit)" +
        s" for job: $jobID, partition: $partitionID, attemptNumber: $attemptNumber"
      /**
       * If a task failed because its attempt to commit was denied, do not count this failure
       * towards failing the stage. This is intended to prevent spurious stage failures in cases
       * where many speculative tasks are launched and denied to commit.
       */
      override def countTowardsTaskFailures: Boolean = false
    }

Please note we have not enabled spark.speculation (it is false), and we do not see this property in the Spark job Environment at all. But while the job is running we can see the corresponding files being created on EMRFS under the table temp directories, like:

hdfs://ip-172-31-18-155.ec2.internal:8020/hive/location/hive.db/hivetable/_temporary/0/task_1513431588574_1185_3_01_000000/00000_0.orc

We can see about 2001 such folders (as we set spark.sql.shuffle.partitions = 2001).

Question(s):
1) What could cause the job to launch ~170000 tasks even though we have not enabled spark.speculation?
2) When it has finished writing the data to HDFS (EMRFS), why is each executor trying to launch new tasks?
3) Is there a way we can avoid this?

Thanks a lot for looking into this; any inputs related to this will help us a lot.

Venkat
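To quantify the flood, the TaskCommitDenied lines can be counted straight from the driver log. A small sketch; the regex matches the WARN format quoted above (the toErrorString of TaskCommitDenied), and the sample log line is the one from this post:

```python
import re

DENIED = re.compile(r"TaskCommitDenied \(Driver denied task commit\)"
                    r" for job: (\d+), partition: (\d+), attemptNumber: (\d+)")

def count_commit_denied(lines):
    # Return (occurrences, highest attemptNumber seen) for TaskCommitDenied.
    count, max_attempt = 0, -1
    for line in lines:
        m = DENIED.search(line)
        if m:
            count += 1
            max_attempt = max(max_attempt, int(m.group(3)))
    return count, max_attempt

log = ["18/01/23 10:21:28 WARN TaskSetManager: Lost task 884.10406 in stage 84.0 "
       "(TID 212677, ip-172-31-46-97.ec2.internal, executor 85): TaskCommitDenied "
       "(Driver denied task commit) for job: 84, partition: 884, attemptNumber: 10406"]
print(count_commit_denied(log))  # (1, 10406)
```

A very high maximum attemptNumber alongside spark.speculation=false is the anomaly the question is about, since these denials are normally expected only from speculative duplicates.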
						
					
Labels: Apache Hive, Apache Spark
    
	
		
		
12-22-2017 12:45 AM
@Karan Alang Can you please test the same command with one broker at a time, i.e. instead of giving all the brokers to --broker-list? From the error message it looks like only host1:9093 is having the issue:

[2017-12-21 19:48:49,846] WARN Fetching topic metadata with correlation id 11 for topics [Set(mmtest4)] from broker [BrokerEndPoint(0,<host1>,9093)] failed (kafka.client.ClientUtils$)
java.io.EOFException
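A quick way to isolate the failing endpoint is to expand the comma-separated broker list into one invocation per broker. A sketch; the topic and broker values are taken from this thread, and kafka-console-producer.sh is assumed to be the stock Kafka CLI script in use:

```python
def per_broker_commands(broker_list, topic):
    # One console-producer invocation per broker, to isolate a bad endpoint.
    return [f"kafka-console-producer.sh --broker-list {b} --topic {topic}"
            for b in broker_list.split(",")]

for cmd in per_broker_commands("host1:9093,host2:9093", "mmtest4"):
    print(cmd)
```

Running these one by one shows whether only host1:9093 fails (for example, a listener/security-protocol mismatch on that port) while the others work.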
						
					
12-11-2017 05:02 AM
@Manfred PAUL Yes, I was looking at only the current session. Can you please check whether you have all the keytabs generated properly for all the services?
						
					
12-11-2017 04:58 AM
@Abhijit Nayak As noted by @Jay Kumar SenSharma, the JIRA (https://issues.apache.org/jira/browse/AMBARI-19666) was a bug in Ambari 2.4.0, but your Ambari version is 2.5.0.3, in which it is fixed according to that JIRA, so please check the details @Jay Kumar SenSharma provided. It might also be a browser setting interrupting the file download partway through, so please try a different browser to see whether the behaviour persists.
						
					