Member since 07-31-2019

346 Posts
259 Kudos Received
62 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3897 | 08-22-2018 06:02 PM |
| | 2229 | 03-26-2018 11:48 AM |
| | 5265 | 03-15-2018 01:25 PM |
| | 5632 | 03-01-2018 08:13 PM |
| | 1871 | 02-20-2018 01:05 PM |
			
    
	
		
		
11-22-2016 07:33 PM (2 Kudos)
To add to what @Scott Shaw said, the biggest thing we'd be looking for initially is data skew, and there are a couple of things we can look at to determine this. The first is the input size. Here we can completely ignore the min and look at the 25th percentile, median, and 75th percentile. In your job they are fairly close together, and the max is never dramatically more than the median. If the max and 75th percentile were very large relative to the median, that would point to data skew. Another indicator of data skew is the task duration. Again, ignore the minimum; we're inevitably going to get a small partition for one reason or another. Focus on the 25th percentile, median, 75th percentile, and max. In a perfect world the separation between the four would be tiny. So values like 6s, 10s, 11s, and 17s may seem significantly different, but they're actually relatively close. The only cause for concern would be when the 75th percentile and max are quite a bit greater than the 25th percentile and median. When I say significant, I'm talking about most tasks taking ~30s and the max taking 10 minutes. That would be a clear indicator of data skew.
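As a rough illustration of the check described above, here is a minimal Python sketch that summarizes per-task durations and flags likely skew when the max sits far above the median. The function names, the sample durations, and the 5x threshold are all assumptions made up for this example; they are not part of any Spark API, so pick a threshold that fits your own workload.

```python
from statistics import quantiles


def summarize(values):
    """Return the min, 25th percentile, median, 75th percentile, and max."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "p25": q1,
        "median": median,
        "p75": q3,
        "max": max(values),
    }


def looks_skewed(values, ratio_threshold=5.0):
    """Flag skew when the max is far above the median.

    The minimum is deliberately ignored, and ratio_threshold is an
    assumed cutoff chosen for illustration only.
    """
    stats = summarize(values)
    return stats["max"] > ratio_threshold * stats["median"]


# Hypothetical task durations in seconds, loosely modeled on the numbers above.
balanced = [6, 10, 10, 11, 11, 12, 17]
skewed = [28, 30, 30, 31, 32, 33, 600]

print(summarize(balanced))
print(looks_skewed(balanced))
print(looks_skewed(skewed))
```

The same comparison can be applied to per-task input sizes instead of durations.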
						
					
			
    
	
		
		
11-09-2016 02:12 PM
							 Thanks @Andrew Grande! That worked! I feel like a noob 🙂 but appreciate all the help! 
						
					
			
    
	
		
		
10-15-2016 02:52 PM
Hey @Daniel Rolls, no problem at all. I'm glad it's working! I'm surprised Chrome isn't working though. I use Chrome by default and the views have worked fine so far. Thanks for the update!
						
					
			
    
	
		
		
05-11-2017 03:02 PM
Hi Scott,

Below is the error I get when I try to set up an ODBC data connection:

"UNABLE TO CONNECT"
Encountered an error while trying to connect to ODBC
Details: "ODBC: ERROR [HY000] [Hortonworks][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Internal credentials cache error).
ERROR [HY000] [Hortonworks][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Internal credentials cache error)."

I am able to successfully test the connection from the Hortonworks Hive ODBC Driver DSN Setup.

Thanks for any help :)
						
					
			
    
	
		
		
09-09-2016 01:06 PM
							  + @jfrazee @Matt Burgess 
						
					
			
    
	
		
		
08-26-2016 05:14 PM
							 That did the trick! Thanks @Constantin Stanca! 
						
					
			
    
	
		
		
08-01-2016 04:59 PM
							 Thanks, that answers all my questions.  I'd be all in HDInsight if MS would give me a free dev environment 🙂 
						
					
			
    
	
		
		
03-20-2017 08:41 PM (8 Kudos)
@Anurag Setia HDP for Windows only supports server operating systems, such as Windows Server 2012 R2. Here's the list: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2-Win/bk_HDP_Install_Win/content/ref-9bdea823-d29d-47f2-9434-86d5460b9aa9.1.html. You also need to install the required software packages prior to installation: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2-Win/bk_HDP_Install_Win/content/ref-dc3ee968-ae3c-4c41-bb26-75a165180fb5.1.html
						
					
			
    
	
		
		
06-13-2016 01:00 PM
							 Thanks for information... 
						
					
			
    
	
		
		
10-17-2017 02:36 PM
I tried using the external table method, but I run out of memory. My Mongo collection (table2) has 10 million records (0.755 GB), and reading from it works. After the insert task fails, a count on the native table (table1) returns 0 rows. My query looks like this: "INSERT INTO table1 SELECT * FROM table2". If I add "LIMIT 1000" it works, but I need to migrate the entire collection. I attached the output from beeline.
						
					