Member since 01-12-2016

123 Posts
12 Kudos Received
1 Solution

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1939 | 12-12-2016 08:59 AM |

01-15-2019 10:08 AM

Hi all, any input on my clarifications? I faced this scenario one more time.

11-24-2018 11:37 AM

Which occurs first in the MapReduce flow: shuffling or sorting? To my knowledge, shuffling occurs first and then sorting. Correct me if I am wrong. Can anybody explain these two things?

The statement below is from the Definitive Guide:

MapReduce makes the guarantee that the input to every reducer is sorted by key. The process by which the system performs the sort—and transfers the map outputs to the reducers as inputs—is known as the shuffle.
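
The quoted passage can be illustrated with a small in-memory simulation. This is only a sketch, not Hadoop code: the function names (`partition`, `map_side`, `reduce_side`) and the sample records are made up for illustration, and a simple hash-style partitioner is assumed. It suggests that sorting is part of the shuffle rather than a separate step after it: each map task sorts its output per partition, and each reducer merges the already-sorted runs it fetches.

```python
import heapq
from collections import defaultdict

NUM_REDUCERS = 2  # hypothetical job setting


def partition(key, num_reducers=NUM_REDUCERS):
    # Stand-in for a hash partitioner; summing bytes keeps it deterministic.
    return sum(key.encode("utf-8")) % num_reducers


def map_side(records):
    """Bucket one map task's output by reducer and sort each bucket by key."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition(key)].append((key, value))
    return {p: sorted(pairs) for p, pairs in buckets.items()}


def reduce_side(map_outputs, reducer_id):
    """Fetch this reducer's sorted run from every map task and merge them."""
    runs = [out.get(reducer_id, []) for out in map_outputs]
    return list(heapq.merge(*runs))


# Two hypothetical map tasks, each with its own (key, value) output.
map_outputs = [
    map_side([("pig", 1), ("hive", 1), ("spark", 1)]),
    map_side([("hive", 1), ("hbase", 1), ("pig", 1)]),
]

for reducer_id in range(NUM_REDUCERS):
    print(reducer_id, reduce_side(map_outputs, reducer_id))
```

Each reducer ends up with input ordered by key even though no single global sort ever runs, which matches the guarantee in the quote.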
						
					

- Labels:
  - Apache Hadoop

11-15-2018 08:56 AM

@Aditya Sirna Do you mean that if we are familiar with Python, we can work on Spark, and that Python alone is sufficient in real-time projects? Do I need to learn Scala or Java for real-time projects?

11-15-2018 05:47 AM

Could anybody guide me on the learning path for Spark?
I am familiar with Hadoop, Hive, Pig, Sqoop, Oozie, Python, and HBase. I do not know much about Java.
Do I need to learn both Java and Scala to start with Spark?
I am completely confused about where to start with Spark.

- Labels:
  - Apache Spark

10-13-2018 05:13 AM

I have set the number of reducers to 2, but Hive still executes with 1. Can anybody help with this?

set hive.exec.reducers.max=2
Hive (default)> insert overwrite directory '/input123456'
              > select count(*) from partitioned_user;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201810122125_0003, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201810122125_0003
Kill Command = /home/naresh/Work1/hadoop-1.2.1/libexec/../bin/hadoop job  -kill job_201810122125_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-10-12 21:36:24,774 Stage-1 map = 0%,  reduce = 0%
2018-10-12 21:36:32,825 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.12 sec
2018-10-12 21:36:41,919 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 4.12 sec
2018-10-12 21:36:42,926 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.38 sec
MapReduce Total cumulative CPU time: 6 seconds 380 msec
Ended Job = job_201810122125_0003
Moving data to: /input123456
MapReduce Jobs Launched: 
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.38 sec   HDFS Read: 354134 HDFS Write: 5 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 380 msec
OK
_c0
Time taken: 37.199 seconds
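
One possible reading of the log above, sketched as a toy model rather than Hive's actual code: `hive.exec.reducers.max` is only an upper limit, so it cannot raise the count, and a `count(*)` with no `GROUP BY` yields a single global result, which would explain why the reducer count was fixed at compile time. The function `estimate_reducers` and its parameters are hypothetical, and the 1 GB value used for `hive.exec.reducers.bytes.per.reducer` is an assumption that depends on the Hive version.

```python
import math


def estimate_reducers(input_bytes, bytes_per_reducer, max_reducers,
                      global_aggregate=False, fixed_reducers=-1):
    """Hypothetical model of how the reducer count in the log could be chosen."""
    if global_aggregate:
        # e.g. SELECT count(*) with no GROUP BY: one final result, one reducer.
        return 1
    if fixed_reducers > 0:
        # Corresponds to: set mapred.reduce.tasks=<number>
        return fixed_reducers
    # Otherwise size-based estimate, capped by hive.exec.reducers.max.
    estimate = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, estimate))


# Values taken from the log: 354134 bytes read, hive.exec.reducers.max=2.
# Treating "HDFS Read" as the input size is an assumption.
print(estimate_reducers(354134, 1_000_000_000, 2, global_aggregate=True))
print(estimate_reducers(354134, 1_000_000_000, 2))
print(estimate_reducers(354134, 100_000, 2))
```

Under this model, only the last call reaches 2 reducers: the input must be large relative to the bytes-per-reducer setting, and the query must not be a global aggregate.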
   
						
					

- Labels:
  - Apache Hive

10-11-2018 02:24 AM

How can I get the list of functions available in a jar file? Say I have piggybank.jar, which contains Reverse, UnixToISO(), etc. Is there a command that lists the functions available in a jar file, rather than searching Google for them?
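
A jar is a zip archive, so one possible approach is to list its `.class` entries. Pig UDFs such as Reverse are Java classes, so class names are a reasonable proxy for the function list. The sketch below is an illustration under assumptions, not a Pig feature: the helper name `list_jar_classes` is made up, and the demo builds a tiny stand-in archive with hypothetical entry names instead of reading the real jar. The JDK's `jar tf <file>` prints the same kind of entry listing.

```python
import io
import zipfile


def list_jar_classes(jar):
    """Return fully qualified names of top-level classes in a jar (path or file object)."""
    with zipfile.ZipFile(jar) as archive:
        names = archive.namelist()
    classes = []
    for name in names:
        # Skip non-class entries and inner/anonymous classes (Outer$Inner.class).
        if name.endswith(".class") and "$" not in name:
            classes.append(name[: -len(".class")].replace("/", "."))
    return sorted(classes)


# Build a small in-memory archive so the example runs without a real jar.
# The entry names below are hypothetical placeholders.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    archive.writestr("com/example/udf/Reverse.class", b"")
    archive.writestr("com/example/udf/Reverse$1.class", b"")
    archive.writestr("com/example/udf/UnixToISO.class", b"")
demo.seek(0)

for class_name in list_jar_classes(demo):
    print(class_name)
```

For a real file, pass its path instead of the in-memory buffer, for example `list_jar_classes("piggybank.jar")` if the jar sits in the current directory.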
						
					

- Labels:
  - Apache Pig

03-07-2017 08:37 AM

Thanks for the comments. I will definitely do it, starting from this post.

03-03-2017 08:48 AM

Thanks for the input. What is the problem with my relation C? STRSPLIT generates a tuple as output; here it consists of two fields. (a1:chararray, a1of1:chararray) is also a tuple, since it is enclosed in parentheses and also consists of two fields.