1973 Posts
1225 Kudos Received
124 Solutions
            
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1823 | 04-03-2024 06:39 AM |
| | 2835 | 01-12-2024 08:19 AM |
| | 1570 | 12-07-2023 01:49 PM |
| | 2321 | 08-02-2023 07:30 AM |
| | 3204 | 03-29-2023 01:22 PM |
			
    
	
		
		
05-11-2016 07:39 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Make sure the Ambari Agent is installed and started. Make sure the firewall is off and passwordless SSH is set up and working. Restart the Ambari Server. Make sure SELinux is off. Make sure the required network ports are open.
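For the "network ports are open" part of the checklist, a quick reachability test from an agent host can help. The following is only a minimal sketch, assuming the Ambari Server still listens on its default ports (8080, 8440, 8441); the helper names `port_open` and `check_ports` and the host name in the usage note are made up for illustration.

```python
import socket

# Assumption: default Ambari Server ports. Adjust if your install changed them.
AMBARI_SERVER_PORTS = {
    8080: "web UI / REST API",
    8440: "agent handshake",
    8441: "agent registration and heartbeat",
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def check_ports(host, ports=AMBARI_SERVER_PORTS):
    """Return a {port: reachable} dict for every port in `ports`."""
    return {port: port_open(host, port) for port in ports}
```

Run from an agent host, `check_ports("ambari-server.example.com")` (a placeholder host name) returns a dict such as `{8080: True, 8440: False, 8441: False}`, which points at a firewall or a stopped server rather than at the agent itself.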
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-11-2016 01:40 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
See http://spark.apache.org/docs/latest/submitting-applications.html and use --deploy-mode cluster (use client instead for client mode):

export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-10-2016 03:45 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 I am wondering about a full open source solution for Master Data Management. 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-10-2016 02:39 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
We have 100 different spreadsheets in CSV format, each with 20 fields. The fields are more or less standard, but some people use First Name, some use Name or firstname, and some use a single name field. Some use M and F for gender; some use 0 and 1. We want to convert all of these CSV variants into one gold standard with standard field names, types, and ranges.
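One way to picture the target is a small normalization pass per file. The sketch below uses plain Python and the standard `csv` module rather than Spark, purely to illustrate the idea. The alias table, the canonical names, and the helper names are hypothetical. The post does not say which of 0 and 1 means which gender, so the gender mapping is passed in per source file instead of being guessed. Treating a bare Name column as the first name is also an assumption.

```python
import csv
import io

# Hypothetical alias table: normalized header variant -> gold-standard name.
# Assumption: a bare "name" column holds the first name.
FIELD_ALIASES = {
    "first name": "first_name",
    "firstname": "first_name",
    "name": "first_name",
}


def canonical_header(raw):
    """Map one raw header to its gold-standard field name."""
    key = " ".join(raw.strip().lower().replace("_", " ").split())
    return FIELD_ALIASES.get(key, key.replace(" ", "_"))


def normalize_rows(lines, gender_map):
    """Yield dicts with canonical field names and gender values.

    `lines` is any iterable of CSV lines (an open file works).
    `gender_map` translates this source's gender codes to the standard
    ones, e.g. {"M": "M", "F": "F"} or {"0": "M", "1": "F"}.
    Codes missing from the map become an empty string.
    """
    for row in csv.DictReader(lines):
        out = {
            canonical_header(k): (v or "").strip()
            for k, v in row.items()
            if k is not None  # skip overflow columns from malformed rows
        }
        if "gender" in out:
            out["gender"] = gender_map.get(out["gender"].upper(), "")
        yield out
```

For example, `list(normalize_rows(io.StringIO("firstname,gender\nBob,0\n"), {"0": "M", "1": "F"}))` yields one row with the keys `first_name` and `gender`. The same two steps (rename columns, then remap values) carry over to Spark DataFrames once the alias table is agreed on.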
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Spark
			
    
	
		
		
05-04-2016 05:06 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Excellent, let me know when that drops or is in alpha. I will test it.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-04-2016 02:49 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
 I ran the same flow myself and examined the Avro file in HDFS using the Avro CLI (avro-tools).
 Even though I didn't specify Snappy compression, it was there in the file.

[root@sandbox opt]# java -jar avro-tools-1.8.0.jar getmeta 23568764174290.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
avro.codec snappy
avro.schema {"type":"record","name":"people","doc":"Schema generated by Kite","fields":[{"name":"id","type":"long","doc":"Type inferred from '2'"},{"name":"first_name","type":"string","doc":"Type inferred from 'Gregory'"},{"name":"last_name","type":"string","doc":"Type inferred from 'Vasquez'"},{"name":"email","type":"string","doc":"Type inferred from 'gvasquez1@pcworld.com'"},{"name":"gender","type":"string","doc":"Type inferred from 'Male'"},{"name":"ip_address","type":"string","doc":"Type inferred from '32.8.254.252'"},{"name":"company_name","type":"string","doc":"Type inferred from 'Janyx'"},{"name":"domain_name","type":"string","doc":"Type inferred from 'free.fr'"},{"name":"file_name","type":"string","doc":"Type inferred from 'NonMauris.xls'"},{"name":"mac_address","type":"string","doc":"Type inferred from '03-FB-66-0F-20-A3'"},{"name":"user_agent","type":"string","doc":"Type inferred from '\"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7;'"},{"name":"lat","type":"string","doc":"Type inferred from ' like Gecko) Version/5.0.4 Safari/533.20.27\"'"},{"name":"long","type":"double","doc":"Type inferred from '26.98829'"}]}
 It's hard-coded in NiFi:
 https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/ConvertCSVToAvro.java
 ConvertCSVToAvro always applies Snappy compression to every Avro file it writes. There is no option to change it. From line 224 of that file:
 writer.setCodec(CodecFactory.snappyCodec());
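The same codec check can be done without avro-tools by reading the file header directly. This is a rough sketch based on the Avro object container file layout (the magic bytes `Obj\x01`, followed by a metadata map of string keys to byte values); the function names are my own, and it reads only the header, not the data blocks.

```python
import io

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        accum |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_metadata(stream):
    """Return the header metadata map as {str: bytes}."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a byte size follows, which we skip
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def avro_codec(stream):
    """Return the codec name; a missing avro.codec entry means "null"."""
    return read_avro_metadata(stream).get("avro.codec", b"null").decode("utf-8")


# Usage with a hand-built header holding one entry: avro.codec = snappy.
header = MAGIC + b"\x02" + b"\x14avro.codec" + b"\x0csnappy" + b"\x00" + bytes(16)
print(avro_codec(io.BytesIO(header)))  # prints "snappy"
```

Against a real file the call would be `avro_codec(open("23568764174290.avro", "rb"))`, which should report the same `snappy` value that `getmeta` printed for `avro.codec`.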
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-03-2016 03:07 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Is the data encrypted when it leaves the edge device? Use SSL for transport and land the data encrypted in HDP.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-03-2016 02:07 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		2 Kudos
		
	
				
		
	
		
					
Good setup for Scala + SBT + Spark:
https://hadoopist.wordpress.com/2016/02/03/how-to-setup-your-first-spark-project-in-intellij-ide/
And the Spark team has a good setup guide here:
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IDESetup
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-02-2016 08:08 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
After you download the sandbox and HDF, take a look at these resources:
https://dzone.com/articles/getting-started-with-apache-nifi-and-hdf
https://dzone.com/articles/anatomy-of-a-scala-spark-program
https://dzone.com/articles/hortonworks-top-15-links-of-april-2016
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-02-2016 06:30 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Awesome. I knew something awesome would come out of YASK. The next NiFi demo has to create this article in HCC and promote it on social media.
						
					
				
			
			
			
			
			
			
			
			
			
		 
        













