Member since 06-20-2016
- 488 Posts
- 433 Kudos Received
- 118 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3602 | 08-25-2017 03:09 PM |
|  | 2504 | 08-22-2017 06:52 PM |
|  | 4194 | 08-09-2017 01:10 PM |
|  | 8969 | 08-04-2017 02:34 PM |
|  | 8946 | 08-01-2017 11:35 AM |

09-13-2016 04:04 PM
@Timothy Spann very effective answer, but it covers much the same information as @Randy Gelhausen's, and he was first in.

09-13-2016 10:53 AM (1 Kudo)
This article shows how to use a list of URLs in an external file to iterate InvokeHttp: https://community.hortonworks.com/content/kbentry/48816/nifi-to-ingest-and-transform-rss-feeds-to-hdfs-usi.html
You can schedule GetFile to run once per day, week, etc. If the flow errors at the end when inserting into a database, you can configure it to ignore the failure.
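
A minimal sketch of that scheduling, assuming a CRON-driven strategy on the GetFile processor and an arbitrary 6 AM run time (neither value is from the original post):

```
# GetFile -> Configure -> Scheduling tab (sketch; the 06:00 run time is an assumed example)
Scheduling Strategy: CRON driven
Run Schedule: 0 0 6 * * ?    # Quartz cron expression: fire once per day at 06:00
```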
						
					

09-12-2016 10:01 PM (1 Kudo)
I am trying to create a Phoenix interpreter using %jdbc in Zeppelin on 2.5 and am not succeeding. Steps are:
1. Log into Zeppelin (sandbox 2.5).
2. Create the new interpreter.
3. Restart it (just to be paranoid).
4. Go to my notebook and bind the interpreter.
When I run with %jdbc(phoenix) I get "Prefix not found." When I run it with %jdbc.phoenix I get "jdbc.phoenix interpreter not found." What am I missing?
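
(For context, a %jdbc(phoenix) prefix generally maps to phoenix.-prefixed properties on the jdbc interpreter. A minimal sketch of such settings; the driver, URL, and user values below are illustrative assumptions for a sandbox, not taken from the post:)

```
# jdbc interpreter properties (sketch; host/port/user are assumed sandbox values)
phoenix.driver = org.apache.phoenix.jdbc.PhoenixDriver
phoenix.url    = jdbc:phoenix:localhost:2181:/hbase-unsecure
phoenix.user   = phoenixuser
phoenix.password =
```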
						
					

Labels:
- Apache Phoenix
- Apache Zeppelin
    
	
		
		

09-12-2016 06:46 PM (1 Kudo)
Agree with @mqureshi and @Constantin Stanca. I would add the theme that compression is a strategy, and usually not a universal yes or no, or this codec versus that. Important questions to ask about your data are: Will it be processed frequently, rarely, or never (cold storage)? How critical is performance when it is processed? Which leads to: which file format/compression codec, if any, for each dataset?
The following are good references for compression and file format strategies (it takes some thinking and evaluating):
- http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2
- http://comphadoop.weebly.com/
- http://www.dummies.com/programming/big-data/hadoop/hadoop-for-dummies/
After formulating a strategy, think about dividing your HDFS file paths into zones in accordance with that strategy.
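
As one hypothetical illustration of compression as a per-dataset strategy in Pig (the paths and codec choice below are assumptions, not from the post), output compression is a per-job switch, so each zone can get its own treatment:

```pig
-- Sketch: write one dataset's output Gzip-compressed (paths and codec are illustrative)
SET output.compression.enabled true;
SET output.compression.codec org.apache.hadoop.io.compress.GzipCodec;

raw = LOAD '/zones/raw/events' USING PigStorage('\t');
STORE raw INTO '/zones/archive/events' USING PigStorage('\t');
```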
						
					

09-12-2016 05:48 PM
@Saumitra Buragohain could you help out here? Could use your expertise 🙂

09-12-2016 03:37 PM (2 Kudos)
I have heard that full-dev-platform is being deprecated and that vagrant multinode should be used for a development environment instead: https://github.com/apache/incubator-metron/tree/master/metron-deployment/vagrant/multinode-vagrant
That is very resource intensive, though, so for dev the best option is: https://github.com/apache/incubator-metron/tree/master/metron-deployment/vagrant/quick-dev-platform
Also, Ansible should be installed at the latest version and then downgraded as described here: https://cwiki.apache.org/confluence/display/METRON/Downgrade+Ansible
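
A minimal sketch of that install-then-downgrade, assuming a pip-based installation; the pinned version below is an assumption, so check the wiki page above for the version Metron actually requires:

```
# Replace the current Ansible with an older pinned release (version is an assumed example)
sudo pip uninstall -y ansible
sudo pip install ansible==2.0.0.2
ansible --version    # confirm the downgrade took effect
```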
						
					

09-12-2016 12:43 PM (2 Kudos)
Please be sure to follow these instructions: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4-Win/bk_HDP_Install_Win/content/LZOCompression.html
You can do step 3 from the Ambari web UI. Also, note that steps 1-2 need to be done on each node in the cluster.
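
For orientation, the Ambari step usually amounts to registering the LZO codec classes in core-site.xml. A sketch with the commonly used values (these are assumptions, not copied from the linked doc):

```xml
<!-- core-site.xml sketch: register the LZO codecs (commonly used values, assumed here) -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```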
						
					

09-11-2016 02:28 PM (2 Kudos)
@Mohan V This is an issue of jar version incompatibility. You need to use the following newer versions of elephant-bird (not the older versions):
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar
I tested it with your code and sample and it works. You can get the jars at:
- http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-core-4.1.jar.zip
- http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-pig-4.1.jar.zip
- http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-hadoop-compat-4.1.jar.zip
Regarding DESCRIBE working while DUMP fails: DUMP actually runs the MapReduce job and DESCRIBE does not, so the jar mismatch only surfaces at DUMP.
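
For context, a minimal Pig sketch of how these jars typically come together (the load path and JsonLoader usage are illustrative assumptions, not @Mohan V's original script):

```pig
-- Sketch: register the 4.1 jars, then load JSON with elephant-bird (path is assumed)
REGISTER elephant-bird-core-4.1.jar;
REGISTER elephant-bird-pig-4.1.jar;
REGISTER elephant-bird-hadoop-compat-4.1.jar;

records = LOAD '/data/input.json'
          USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');

DESCRIBE records;  -- schema only; no MapReduce job runs, so mismatches stay hidden
DUMP records;      -- launches the job; incompatible jars fail here
```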
						
					

09-11-2016 12:48 PM
@Mohan V Very glad to see you solved it yourself by debugging -- it is the best way to learn and improve your skills 🙂

09-10-2016 12:31 PM (1 Kudo)
There is a lot going on here -- when writing a complex script like this, the following approach is useful for building and debugging (a sketch of the pattern follows the list):
- Run locally against a small subset of records (pig -x local -f <scriptOnLocalFileSystem>.pig). This makes each run of the script faster.
- Build the script statement by statement until you reach the failing one (run the first statement, add the second and run, and so on until it fails). When it fails, focus on that last statement and fix it. These steps are good for finding grammar issues (which it looks like you have, based on the error message).
- If you also want to make sure your data is being processed correctly, put a DUMP statement after each line during each iteration. That way you can inspect the results of each statement.
- If using inline statements like your grouped = statement, separate them out at first until the script works. This makes issues easier to isolate.
Let me know how that goes.
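
A hypothetical sketch of that incremental pattern (relation names, schema, and input path are made up for illustration):

```pig
-- Run with: pig -x local -f debug.pig  (small local sample; all names below are assumed)
raw = LOAD 'sample.tsv' USING PigStorage('\t') AS (user:chararray, score:int);
DUMP raw;                                  -- iteration 1: verify the load

filtered = FILTER raw BY score > 0;
DUMP filtered;                             -- iteration 2: verify the filter

-- an inline GROUP/FOREACH separated into two statements to isolate failures
grouped = GROUP filtered BY user;
counts  = FOREACH grouped GENERATE group AS user, COUNT(filtered) AS n;
DUMP counts;                               -- iteration 3: verify the aggregation
```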
						
					