Member since: 08-08-2013

339 Posts
132 Kudos Received
27 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 16112 | 01-18-2018 08:38 AM |
| | 2014 | 05-11-2017 06:50 PM |
| | 10425 | 04-28-2017 11:00 AM |
| | 4142 | 04-12-2017 01:36 AM |
| | 3224 | 02-14-2017 05:11 AM |
02-28-2019 08:25 AM

Hi @Rodrigo Hjort, did you solve this problem, and if yes, how?
06-19-2018 08:49 PM

Hi @SATHIYANARAYANA KUMAR.N,

you can keep your pipeline if you need to and write the intermediate output (after each processing step) either with Spark back into HDFS, or with Hive into another table.

From what you are describing, splitting your huge files, loading them into an RDBMS, pulling them from there into AMQ and processing them there sounds like a huge (and unnecessary) overhead; that is way too expensive, complicated and error-prone.

Just upload your huge files to HDFS and, for example, create a directory structure that reflects your processing pipeline, like

/data/raw
/data/layer1
/data/layer2

...and put the output of each processing step into it accordingly.

HTH, Gerd
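For illustration, a minimal sketch of creating such a layer structure programmatically with the Hadoop FileSystem API; the class name is hypothetical, the directories are just the example ones above, and it assumes fs.defaultFS in core-site.xml points at the cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateLayerDirs {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath
        FileSystem fs = FileSystem.get(new Configuration());
        String[] layers = {"/data/raw", "/data/layer1", "/data/layer2"};
        for (String dir : layers) {
            // mkdirs() is a no-op if the directory already exists
            fs.mkdirs(new Path(dir));
        }
        fs.close();
    }
}
```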
						
					
06-05-2018 06:36 PM

Hi @SATHIYANARAYANA KUMAR.N,

some details are missing in your post, but as a general answer: if you want to do batch processing of some huge files, Kafka is the wrong tool to use. Kafka's strength is managing STREAMING data.

Based on your description I assume your use case is bringing huge files to HDFS and processing them afterwards. For that I wouldn't split the files at all; just upload each one as a whole (e.g. via WebHDFS). Then you can use tools like Hive/Tez, Spark, ... to process your data (whatever you mean by "process": clean/filter/aggregate/merge/... or, in the end, "analyze" in an SQL-like manner).

HTH, Gerd
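As an illustration of the upload step, a minimal sketch using the Hadoop FileSystem API over WebHDFS; the class name, host, port and file paths are placeholders (the NameNode HTTP port is typically 50070 on HDP 2.x):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsUpload {
    public static void main(String[] args) throws Exception {
        // Talk to HDFS through the NameNode's WebHDFS REST endpoint
        FileSystem fs = FileSystem.get(
            URI.create("webhdfs://namenode-host:50070"), new Configuration());
        // Upload the file as a whole; HDFS takes care of splitting it into blocks
        fs.copyFromLocalFile(
            new Path("file:///data/local/huge-file.csv"),
            new Path("/data/raw/huge-file.csv"));
        fs.close();
    }
}
```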
						
					
05-10-2018 06:24 PM

Hi @Mudit Kumar,

to add your users you need to create principals for them in the Kerberos database. For example, connect to the node where the MIT KDC is running and run

sudo kadmin.local -q "addprinc <username>"   # replace <username> by your real usernames

so that you are able to grab a valid Kerberos ticket for those 5 users. You can verify this by executing

kinit <username>

which should ask for the corresponding password of that user (!! the password you provided at creation time of the principal above !!), followed by

klist

After grabbing a Kerberos ticket you can start executing commands against the cluster, like "hdfs dfs -ls".

If you have enabled authorization as well, you have to add those new users to the ACLs appropriately.
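If those users also need to authenticate from code rather than via kinit, a minimal sketch with the Hadoop UserGroupInformation API; the class name, principal, realm and keytab path are assumptions and presume keytabs have been exported for the users:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The cluster must be configured for Kerberos authentication
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Log in from a keytab instead of relying on a ticket cache from kinit
        UserGroupInformation.loginUserFromKeytab(
            "user1@EXAMPLE.COM", "/etc/security/keytabs/user1.keytab");
        // Same effect as "hdfs dfs -ls /" on the command line
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}
```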
						
					
01-18-2018 08:38 AM

1 Kudo

Hi,

that indicates your OS user "root" is not the superuser of HDFS (root is just the superuser of the operating system). Try the same as user "hdfs" (which is the HDFS superuser by default); as root do:

su - hdfs
hdfs dfsadmin -report

Basically, the HDFS superuser is the user under whose account the NameNode is started.

Alternatively you can add the OS user "root" to the group that is set as the HDFS supergroup: check the property dfs.permissions.supergroup (which points to an OS group) and add "root" to that group.

HTH, Gerd
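As a side note, if only the basic capacity figures are needed (rather than the full per-DataNode report), a sketch like the following reads them through the FileSystem API and does not require superuser rights; the class name is hypothetical and fs.defaultFS is assumed to be configured:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsCapacityReport {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Roughly what "hdfs dfs -df" shows: total, used and remaining bytes
        FsStatus status = fs.getStatus();
        System.out.println("Capacity : " + status.getCapacity());
        System.out.println("Used     : " + status.getUsed());
        System.out.println("Remaining: " + status.getRemaining());
        fs.close();
    }
}
```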
						
					
11-14-2017 04:06 PM

Hi,

after enabling the SASL_PLAINTEXT listener on Kafka it is no longer possible to use the console consumer/producer. However, a simple Java snippet that creates a producer and adds some messages works fine, using the exact same user/password as the console clients:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Enter topic name");
            return;
        }
        String topicName = args[0];
        Properties props = new Properties();
        props.put("bootstrap.servers", "<brokernode>:6666");
        props.put("acks", "1");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        ////// AUTHENTICATION
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required\n" +
            "username=\"kafka\"\n" +
            "password=\"kafkaSecure\";");
        ////// END AUTHENTICATION
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        System.out.println("producer created");
        for (int i = 0; i < 10; i++) {
            System.out.println("message" + i);
            producer.send(new ProducerRecord<String, String>(topicName,
                Integer.toString(i), Integer.toString(i)));
        }
        System.out.println("Messages sent successfully");
        producer.close();
    }
}

After starting the console producer and trying to add a message, the following message is shown (endlessly):

[2017-11-14 16:48:23,039] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,091] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,143] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,195] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)

The Kafka config looks like:

listeners=PLAINTEXT://<brokernode>:6667,SASL_PLAINTEXT://<brokernode>:6666
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
security.inter.broker.protocol=SASL_PLAINTEXT

The console producer gets started via:

export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/conf/user_kafka_jaas.conf" ; /usr/hdf/current/kafka-broker/bin/kafka-console-producer.sh --broker-list <brokernode>:6666 --topic gk-test --producer.config /etc/kafka/conf/producer.properties

where the property files look like this.

/etc/kafka/conf/user_kafka_jaas.conf:

KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafka"
  password="kafkaSecure";
};

/etc/kafka/conf/producer.properties:

security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN

Any hint on what is going wrong with the console producer and console consumer, so that they cannot produce to or consume from the topic, while the Java snippet works?

Thanks
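For reference, a matching consumer-side sketch with the same SASL/PLAIN settings as the producer above; the group id, topic name and poll loop are assumptions, not part of the original setup:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "<brokernode>:6666");
        props.put("group.id", "gk-test-group");   // assumed group id
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        // Same SASL/PLAIN settings as in the producer above
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required\n" +
            "username=\"kafka\"\n" +
            "password=\"kafkaSecure\";");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Collections.singletonList("gk-test"));
        while (true) {
            // poll(long) matches the pre-2.0 client API used at the time of this post
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.key() + " -> " + record.value());
            }
        }
    }
}
```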
						
					
Labels: Apache Kafka
11-09-2017 10:36 AM

Bryan, many thanks for your explanation.

Do you have any resources/hints regarding "creating a dynamic JAAS file" and how this would look, assuming Kerberos is enabled 😉 ...or do you mean by 'dynamic' the possibility to specify principal & keytab within the Kafka processor?

Thanks!
11-08-2017 01:07 PM

Hi,

how can I enable Kafka SASL_PLAINTEXT authentication without enabling Kerberos in general?

Right now I have added the additional "listener" entry and populated the "advanced kafka_jaas_conf" as well as the "advanced kafka_client_jaas_conf". After that, the Kafka brokers won't start up because of this error:

FATAL [Kafka Server 1001], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: Could not find a 'KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set

What else needs to be done to provide the required properties to the broker startup, and to distribute the .jaas files?

It also looks like the .jaas files are not being deployed to the Kafka nodes; they are not under /usr/hdp/current/kafka-broker/config. Is this functionality missing because Kerberos is disabled? I am sure that after enabling Kerberos the .jaas entries defined in Ambari will be deployed to the nodes, hence there must be some "hidden" functionality missing in non-Kerberos mode.

Any help appreciated, thanks in advance...
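For reference, the error complains about a missing KafkaServer section. A minimal broker-side JAAS entry for SASL/PLAIN would look roughly like the sketch below (credentials are placeholders; the user_<name> entries define the client accounts the broker accepts, and the broker JVM still needs -Djava.security.auth.login.config pointing at the file):

```
KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafka"
  password="kafkaSecure"
  user_kafka="kafkaSecure";
};
```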
						
					
Labels: Apache Ambari, Apache Kafka
11-08-2017 10:01 AM

Hello,

there is an HDF setup in place (HDF 3.0) and now SASL_PLAINTEXT needs to be added to the Kafka listeners (without Kerberos, just plain SASL). To be able to authenticate, user:password tuples need to be provided in the .jaas file, but this looks very static.

How can the end user (who is logged in to NiFi) be used in a Kafka processor to authenticate against Kafka? Is there a possibility, with user-defined properties, to ensure that the current user is used for authenticating against Kafka, or to dynamically decide which .jaas file to use based on the currently logged-in user?

Kerberos and SSL are currently not an option, hence I need a solution for SASL_PLAINTEXT 😉

Thanks in advance...
10-20-2017 06:33 AM

Hi @Matt Clarke, thanks for your reply. I will dive back into this with the release you mentioned.

You're saying "no support of Ranger or LDAP groups", but support of Ranger is already there, although limited to user-based policies. Or did I misunderstand something here?