Member since 08-03-2019

| Posts | Kudos Received | Solutions |
|---|---|---|
| 186 | 34 | 26 |

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2767 | 04-25-2018 08:37 PM |
| | 6683 | 04-01-2018 09:37 PM |
| | 2157 | 03-29-2018 05:15 PM |
| | 7807 | 03-27-2018 07:22 PM |
| | 2662 | 03-27-2018 06:14 PM |
			
    
	
		
		
03-17-2018 05:37 AM

@bernie zhang Can you try using raj_ops as both the username and the password on the Ambari login page?
			
    
	
		
		
03-15-2018 05:48 PM

This may be a silly question, but do you have any data at all in ngmss.company? Are you able to see it via Hive?
    
	
		
		
03-15-2018 05:47 PM

When you say you want to connect to the MySQL service, do you mean you want to log in to the MySQL shell? Or are you trying to connect to MySQL over HTTP?
    
	
		
		
03-14-2018 04:30 PM

1 Kudo

There are a couple of issues that I can see with your script.

The first is the statement that reads the data from the file:

```
my_data = LOAD 'customers.txt' using PigStorage() as (name:chararray, age:int, eye_color:chararray, height:int);
```

You used the PigStorage() method without any parameter. If you don't pass a parameter to this method, it treats TAB as the delimiter, but looking at your data file, you have a comma as the delimiter. So your LOAD statement should look like this:

```
my_data = LOAD 'customers.txt' using PigStorage(',') as (name:chararray, age:int, eye_color:chararray, height:int);
```

This is not actually the problem you are facing, though. In your last statement, where you create the final_data relation, you refer to your columns as:

```
SUM(brown_eyes) as num_brown_eyes, SUM(blue_eyes) as num_blue_eyes, SUM(green_eyes) as num_green_eyes
```

This is incorrect. A describe statement will show you the schema:

```
grunt> describe by_age;
by_age: {group: int,my_data: {(name: chararray,age: int,eye_color: chararray,height: int)}}
```

You can see that all the columns are grouped inside the my_data column, so references to these columns should be made as shown below:

```
SUM(my_data.brown_eyes) as num_brown_eyes, SUM(my_data.blue_eyes) as num_blue_eyes, SUM(my_data.green_eyes) as num_green_eyes
```

This is the same way you have already used my_data.height in your code. So your final GENERATE statement should look like this:

```
final_data = FOREACH by_age GENERATE group as age, COUNT(my_data) as num_people, AVG(my_data.height) as avg_height, SUM(my_data.brown_eyes) as num_brown_eyes, SUM(my_data.blue_eyes) as num_blue_eyes, SUM(my_data.green_eyes) as num_green_eyes;
```

All in all, your complete script should look like this:

```
my_data = LOAD 'customers.txt' using PigStorage(',') as (name:chararray, age:int, eye_color:chararray, height:int);
my_data = FOREACH my_data GENERATE name, age, height, (eye_color == 'brown' ? 1 : 0) AS brown_eyes, (eye_color == 'blue' ? 1 : 0) AS blue_eyes, (eye_color == 'green' ? 1 : 0) AS green_eyes;
by_age = GROUP my_data BY age;
final_data = FOREACH by_age GENERATE group as age, COUNT(my_data) as num_people, AVG(my_data.height) as avg_height, SUM(my_data.brown_eyes) as num_brown_eyes, SUM(my_data.blue_eyes) as num_blue_eyes, SUM(my_data.green_eyes) as num_green_eyes;
```

Now that you know what the issues were, you will be able to run your script and also prevent those "typos" in the future. Happy coding!

... View more
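For readers without a Pig environment handy, the grouping and aggregation that the script above performs can be sketched in plain Python. The customer rows below are made up for illustration, not taken from the original customers.txt:

```python
from collections import defaultdict

# Made-up rows in the (name, age, eye_color, height) schema of customers.txt.
rows = [
    ("ann", 30, "brown", 165),
    ("bob", 30, "blue", 180),
    ("cal", 25, "green", 175),
]

# Equivalent of: by_age = GROUP my_data BY age;
by_age = defaultdict(list)
for name, age, eye_color, height in rows:
    by_age[age].append((name, eye_color, height))

# Equivalent of the final FOREACH ... GENERATE with COUNT, AVG, and SUM.
final_data = {}
for age, group in by_age.items():
    final_data[age] = {
        "num_people": len(group),
        "avg_height": sum(h for _, _, h in group) / len(group),
        "num_brown_eyes": sum(1 for _, c, _ in group if c == "brown"),
        "num_blue_eyes": sum(1 for _, c, _ in group if c == "blue"),
        "num_green_eyes": sum(1 for _, c, _ in group if c == "green"),
    }

print(final_data)
```

The nesting of the grouped rows inside `by_age[age]` mirrors why Pig requires `my_data.brown_eyes` rather than `brown_eyes` after the GROUP.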
    
	
		
		
03-13-2018 05:57 PM

Hi Jasper, you can use the JoltTransformJson processor to get this done. Here is the complete Jolt specification:

```json
[
  {
    "operation": "shift",
    "spec": {
      "agent-submit-time": "agent_submit_time",
      "agent-end-time": "agent_end_time",
      "agent-name": "agent_name",
      "*": {
        "@": "&"
      }
    }
  }
]
```

Hope that helps.

... View more
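As a rough illustration of what that shift spec does, the three hyphenated keys are renamed to their underscore equivalents, while the `"*": {"@": "&"}` rule passes every other key through unchanged. A minimal Python sketch of that behavior, with a made-up sample record:

```python
# Key renames taken from the explicit matches in the Jolt shift spec above.
RENAMES = {
    "agent-submit-time": "agent_submit_time",
    "agent-end-time": "agent_end_time",
    "agent-name": "agent_name",
}

def shift(record):
    # Renamed keys get their new name; all other keys pass through as-is,
    # mirroring the "*": {"@": "&"} wildcard rule.
    return {RENAMES.get(k, k): v for k, v in record.items()}

# Made-up sample input; the original question's JSON is not shown here.
sample = {
    "agent-submit-time": "2018-03-13T10:00:00Z",
    "agent-end-time": "2018-03-13T10:05:00Z",
    "agent-name": "agent-1",
    "status": "FINISHED",
}
print(shift(sample))
```

This is only a flat approximation for intuition; the real JoltTransformJson processor applies the spec recursively to the flowfile JSON.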
    
	
		
		
03-13-2018 02:43 PM

I guess you need to drop these expressions one at a time, using multiple ReplaceText processors. You can put the following patterns in the "Search Value" text box, one per processor:

```
(?s)(\\\"\[)
(?s)(\\)
```

Hope that helps!

... View more
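To sanity-check what those two patterns match outside NiFi, here is a quick Python sketch. The sample input string and the empty replacement value are assumptions for illustration, and the `(?s)` DOTALL flag is omitted since it does not affect these particular patterns:

```python
import re

# Made-up sample containing escaped quotes and a \"[ sequence.
text = '{\\"[\\"a\\",\\"b\\"]'

# First ReplaceText: search value (?s)(\\\"\[) drops each \"[ sequence.
step1 = re.sub(r'\\"\[', '', text)

# Second ReplaceText: search value (?s)(\\) drops the remaining backslashes.
step2 = re.sub(r'\\', '', step1)

print(step2)
```

Running the processors in this order matters: the first pattern must see the backslashes that the second one removes.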
    
	
		
		
03-07-2018 10:18 PM

This is not straightforward to implement, though there are workarounds available. Have a look at this article on the community, which describes using a stored procedure for this. It may cost you some performance, but it will get the job done.
    
	
		
		
03-07-2018 02:32 PM

@Dmitro Vasilenko The only thing your log shows is GC information. Can you please share some more details about your query, along with the job logs from YARN?
    
	
		
		
03-07-2018 02:30 PM

You may want to look at this answer.
    
	
		
		
03-05-2018 02:46 PM

"I put a csv file into an HDFS location and do an ALTER TABLE to add that new location to the partition." Can you please explain this operation?
 
        













