Member since: 06-20-2016

488 Posts | 433 Kudos Received | 118 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3601 | 08-25-2017 03:09 PM |
| | 2501 | 08-22-2017 06:52 PM |
| | 4192 | 08-09-2017 01:10 PM |
| | 8969 | 08-04-2017 02:34 PM |
| | 8946 | 08-01-2017 11:35 AM |

08-22-2017 06:52 PM

This REST call will get you all the host names of the nodes in the cluster:

http://your.ambari.server/api/v1/clusters/yourClusterName/hosts

See these links on the Ambari API:

https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/hosts.md
https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
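If you want to call the endpoint programmatically rather than from a browser, a minimal Java 11 sketch might look like the following. The server URL, cluster name, and admin:admin credentials are placeholders, not values from this thread.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class ListAmbariHosts {
    public static void main(String[] args) throws Exception {
        // Placeholder server and credentials -- replace with your Ambari host and login.
        String ambari = "http://your.ambari.server";
        String cluster = "yourClusterName";
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes());

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(ambari + "/api/v1/clusters/" + cluster + "/hosts"))
                .header("Authorization", "Basic " + auth)   // Ambari uses HTTP Basic auth
                .GET()
                .build();

        // The response is JSON; each item's Hosts/host_name field holds the node name.
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```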
						
					
08-22-2017 02:21 PM
2 Kudos

What happens when you try the URL in your browser? Which browser are you using? Did you try another browser? What happens when you use the Ambari login URL http://localhost:8080/#/login? Which version of the HDP sandbox?
						
					
08-21-2017 08:20 PM

Followed the instructions and it worked. Thanks @Sriharsha Chintalapani.
						
					
08-15-2017 07:27 PM

I have installed HDF 3.0.1. I am using a PublishKafkaRecord processor in NiFi to access a schema via HortonworksSchemaRegistry. I am getting the below error from PublishKafkaRecord:

Caused by: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "validationLevel" (class com.hortonworks.registries.schemaregistry.SchemaMetadata), not marked as ignorable (6 known properties: "compatibility", "type", "name", "description", "evolve", "schemaGroup"])
 at [Source: {"schemaMetadata":{"type":"avro","schemaGroup":"test","name":"simple","description":"simple","compatibility":"BACKWARD","validationLevel":"ALL","evolve":true},"id":3,"timestamp":1502815970781}; line: 1, column: 140] (through reference chain: com.hortonworks.registries.schemaregistry.SchemaMetadataInfo["schemaMetadata"]->com.hortonworks.registries.schemaregistry.SchemaMetadata["validationLevel"])

How do I get NiFi to ignore the validationLevel attribute for the schema and not throw this error?
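For background, the exception is a Jackson deserialization failure: the client-side SchemaMetadata class does not declare the validationLevel field that the server returns. Purely as an illustration of the mechanism, here is a minimal, self-contained Jackson sketch using a simplified stand-in class rather than the real registry code; the usual Jackson-level way to tolerate unknown fields is FAIL_ON_UNKNOWN_PROPERTIES.

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class IgnoreUnknownFieldsDemo {
    // Simplified stand-in for a client-side DTO that lags behind the server's JSON.
    public static class SchemaMetadataStub {
        public String name;
        public String schemaGroup;
    }

    public static void main(String[] args) throws Exception {
        String json = "{\"name\":\"simple\",\"schemaGroup\":\"test\",\"validationLevel\":\"ALL\"}";

        ObjectMapper mapper = new ObjectMapper()
                // Skip fields the target class does not declare instead of throwing
                // UnrecognizedPropertyException.
                .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

        SchemaMetadataStub meta = mapper.readValue(json, SchemaMetadataStub.class);
        System.out.println(meta.name + " / " + meta.schemaGroup);
    }
}
```

Whether the registry client used by NiFi exposes this setting is a separate question; the sketch only shows what the message means.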
						
					
Labels:
- Apache Kafka
- Apache NiFi
- Schema Registry
			
    
	
		
		
08-14-2017 11:27 PM
2 Kudos

@Bala Vignesh N V  You will need to do a little data engineering to prep your data for the Hive table ... basically, replacing the pipe | with a comma. You can do this easily in Pig by running this script:

a = LOAD '/sourcefilepath' AS (fullrecord:chararray);
b = FOREACH a GENERATE REPLACE(fullrecord, '\\|', ',');
STORE b INTO '/targetfilepath' USING PigStorage(',');

You could also do the pipe replacement in sed before loading to HDFS. Pig is advantageous, however, because it runs in MapReduce or Tez and is much faster (parallel processing), especially for large files. The fact that some values include the delimiter is a problem ... unless there is a clear pattern, you will have to write a program that finds each record with too many delimiters and then replace those values one by one (e.g. replace 'new york, usa' with 'new york usa'). If you used Pig, the b = step would have to be repeated for each such value containing the delimiter. If you are unfamiliar with Pig, this is a good tutorial showing how to implement the above: https://hortonworks.com/tutorial/how-to-process-data-with-apache-pig/
						
					
08-10-2017 09:49 PM
1 Kudo

Simple SAM flow: Kafka -> (Storm) filter -> Kafka. It fails at Storm, which reports:

com.hortonworks.registries.schemaregistry.serde.SerDesException: Unknown protocol id [123] received while deserializing the payload at com.hortonworks.registries.schemaregistry.serdes.avro.AvroSnapsh

Wondering what could cause this. (The schema seems properly configured.)
						
					
08-10-2017 01:57 PM
2 Kudos

I think this link will show you how to change INFO to WARN or ERROR, as well as adjust rotating log sizes:

https://community.hortonworks.com/content/supportkb/49455/atlas-default-logging-is-filling-up-file-system-ho.html
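The linked article covers making the change through Atlas's log4j configuration file. Purely as an illustration of what raising the level does, here is a minimal log4j 1.x sketch in Java; the "org.apache.atlas" logger name is an assumption for the example, not taken from the article.

```java
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class RaiseLogLevel {
    public static void main(String[] args) {
        // Hypothetical logger name, used only for this example.
        Logger atlasLogger = Logger.getLogger("org.apache.atlas");

        // Raise the threshold so INFO messages are dropped; only WARN and above are kept.
        atlasLogger.setLevel(Level.WARN);

        atlasLogger.info("This INFO message is now suppressed.");
        atlasLogger.warn("WARN and ERROR messages still appear.");
    }
}
```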
						
					
08-09-2017 04:10 PM

Streaming the data to Hadoop does consume a lot of CPU, even though the data is only passing through. Putting the client on the edge node isolates this and thus prevents CPU contention with jobs running on the cluster. The edge node is typically used for client implementations a) to isolate users from logging into master or worker nodes, and b) to isolate resource usage, as with the CPU used by Sqoop.
						
					
08-09-2017 01:10 PM
3 Kudos

All good questions, and fortunately the answer is very simple: all data passes through the edge node with no staging or landing there. Even better, the data passes directly to Hadoop, which runs a MapReduce job (all mappers, no reducers) to import the rows in parallel. Useful refs:

https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+MR+Execution+Engine
https://blogs.apache.org/sqoop/entry/apache_sqoop_overview
https://www.packtpub.com/mapt/book/big_data_and_business_intelligence/9781784396688/6/ch06lvl1sec59/sqoop-2-architecture
						
					
08-05-2017 05:44 PM

HDP 2.6 allows the {user} variable in Ranger policies, e.g. for row-level filtering.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_security/content/user_variable_ref.html
https://community.hortonworks.com/questions/102532/set-user-user-in-ranger-policy.html

Are there any other variables besides {user} available, perhaps for group?
						
					
Labels:
- Apache Hive
- Apache Ranger
 
         
					
				













