Member since: 08-08-2013
Posts: 339
Kudos Received: 132
Solutions: 27
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14836 | 01-18-2018 08:38 AM |
| | 1578 | 05-11-2017 06:50 PM |
| | 9197 | 04-28-2017 11:00 AM |
| | 3444 | 04-12-2017 01:36 AM |
| | 2845 | 02-14-2017 05:11 AM |
02-28-2019
08:25 AM
Hi @Rodrigo Hjort, did you solve this problem and, if so, how?
06-19-2018
08:49 PM
Hi @SATHIYANARAYANA KUMAR.N, you can keep your pipeline if you need to and write the intermediate output after each processing step either back into HDFS with Spark, or into another table with Hive.
From what you describe, splitting your huge files, loading them into an RDBMS, pulling them from there into AMQ and processing them from there sounds like a huge and needless overhead. That is far too expensive, complicated and error-prone.
Just upload your huge files to HDFS and create a directory structure which reflects your processing pipeline, e.g.
/data/raw
/data/layer1
/data/layer2
...
and put the output of each processing step into the corresponding directory.
HTH, Gerd
06-05-2018
06:36 PM
Hi @SATHIYANARAYANA KUMAR.N, some details are missing in your post, but as a general answer: if you want to do batch processing of some huge files, Kafka is the wrong tool. Kafka's strength is managing STREAMING data.
Based on your description I assume that your use case is bringing huge files into HDFS and processing them afterwards. For that I wouldn't split the files at all, just upload each one as a whole (e.g. via WebHDFS). Then you can use tools like Hive/Tez, Spark, ... to process your data (whatever you mean by "process": clean/filter/aggregate/merge/... or, at the end, "analyze" in an SQL-like manner).
HTH, Gerd
05-10-2018
06:24 PM
Hi @Mudit Kumar, to add your users you need to create principals for them in the Kerberos database. For example, connect to the node where the MIT KDC is running, then run
sudo kadmin.local -q "addprinc <username>"    # replace <username> with your real usernames
so that you are able to grab a valid Kerberos ticket for those 5 users. You can verify this by executing
kinit <username>
which should ask for the corresponding password of that user (!! the password you provided when creating the principal above !!), followed by
klist
After grabbing a Kerberos ticket you can start executing commands against the cluster, like "hdfs dfs -ls".
If you have enabled authorization as well, you have to add those new users to the ACLs appropriately.
01-18-2018
08:38 AM
1 Kudo
Hi, that indicates your OS user "root" is not the superuser of HDFS (root is just the "superuser" of the operating system). Try to do the same as user "hdfs" (which is by default the HDFS superuser). As root, run:
su - hdfs
hdfs dfsadmin -report
Basically, the HDFS superuser is the user under whose account the NameNode is started. Alternatively, you can add the OS user "root" to the group which is set as the HDFS supergroup. Check the property dfs.permissions.supergroup (called dfs.permissions.superusergroup in newer Hadoop releases) and add "root" to this group (which points to an OS group).
HTH, Gerd
11-14-2017
04:06 PM
Hi, after enabling the SASL_PLAINTEXT listener on Kafka it is no longer possible to use console-consumer/-producer. However, a simple Java snippet that creates a producer and adds some messages works fine, using the exact same user/password as the console clients:
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) throws Exception {
        // The topic name is expected as the first command-line argument.
        if (args.length == 0) {
            System.out.println("Enter topic name");
            return;
        }
        String topicName = args[0];
        Properties props = new Properties();
        props.put("bootstrap.servers", "<brokernode>:6666");
        props.put("acks", "1");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        ////// AUTHENTICATION
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required\n" +
                "username=\"kafka\"\n" +
                "password=\"kafkaSecure\";");
        ////// END AUTHENTICATION
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        System.out.println("producer created");
        // Send ten messages, using the loop counter as both key and value.
        for (int i = 0; i < 10; i++) {
            System.out.println("message" + i);
            producer.send(new ProducerRecord<String, String>(topicName,
                    Integer.toString(i), Integer.toString(i)));
        }
        System.out.println("Messages sent successfully");
        producer.close();
    }
}
After starting e.g. the console producer and trying to add a message via the console, the following message is shown endlessly:
[2017-11-14 16:48:23,039] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,091] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,143] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,195] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
The Kafka config looks like:
listeners=PLAINTEXT://<brokernode>:6667,SASL_PLAINTEXT://<brokernode>:6666
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
security.inter.broker.protocol=SASL_PLAINTEXT
The console producer gets started via:
export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/conf/user_kafka_jaas.conf" ; /usr/hdf/current/kafka-broker/bin/kafka-console-producer.sh --broker-list <brokernode>:6666 --topic gk-test --producer.config /etc/kafka/conf/producer.properties
where the property files look like:
/etc/kafka/conf/user_kafka_jaas.conf
KafkaClient {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="kafka"
password="kafkaSecure";
};
/etc/kafka/conf/producer.properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
Any hint on what is going wrong with console-producer and console-consumer, so that they cannot produce to / consume from the topic? ...but the Java snippet works...
Thanks
Labels: Apache Kafka
11-09-2017
10:36 AM
Bryan, many thanks for your explanation. Do you have any resources/hints regarding "creating a dynamic JAAS file" and what this would look like? ...assuming Kerberos is enabled 😉 ...or do you mean by 'dynamic' the possibility to specify principal & keytab within the Kafka processor? Thanks!
11-08-2017
01:07 PM
Hi, how can I enable Kafka SASL_PLAINTEXT auth without enabling Kerberos in general?
Right now I have added the additional "listener" entry and populated "advanced kafka_jaas_conf" as well as "advanced kafka_client_jaas_conf". After that the Kafka brokers won't start up, because of this error:
FATAL [Kafka Server 1001], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: Could not find a 'KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
What else needs to be done to provide the required properties to the broker startup, as well as to distribute the .jaas files? It also looks like the .jaas files are not being deployed to the Kafka nodes; they are not under /usr/hdp/current/kafka-broker/config. Is this functionality missing because Kerberos is disabled? I am sure that after enabling Kerberos the .jaas entries defined in Ambari will be deployed to the nodes, hence there must be some "hidden" functionality missing in non-Kerberos mode...
Any help appreciated, thanks in advance...
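To narrow down an error like the one above, it can help to check whether a given JAAS file actually parses and contains a 'KafkaServer' entry before pointing the broker at it via java.security.auth.login.config. The following is only a minimal sketch using the JDK's own JAAS file parser; the class name JaasEntryCheck, its helper methods, and the admin/admin-secret credentials in the sample text are hypothetical and not taken from any Ambari or Kafka tooling.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.NoSuchAlgorithmException;
import java.security.URIParameter;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

// Hypothetical helper: checks a JAAS file for a named entry using the JDK parser.
public class JaasEntryCheck {

    /** Returns true if the JAAS file defines at least one login module under entryName. */
    static boolean hasEntry(Path jaasFile, String entryName) {
        try {
            // "JavaLoginConfig" is the JDK's file-based JAAS configuration type.
            Configuration config = Configuration.getInstance(
                    "JavaLoginConfig", new URIParameter(jaasFile.toUri()));
            AppConfigurationEntry[] entries = config.getAppConfigurationEntry(entryName);
            return entries != null && entries.length > 0;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Could not load JAAS file " + jaasFile, e);
        }
    }

    /** Convenience variant: writes the text to a temporary file and checks that. */
    static boolean hasEntryInText(String jaasText, String entryName) {
        try {
            Path tmp = Files.createTempFile("jaas-check", ".conf");
            try {
                Files.write(tmp, jaasText.getBytes(StandardCharsets.UTF_8));
                return hasEntry(tmp, entryName);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length > 0) {
            // Check a real file, e.g. the one passed to the broker JVM.
            System.out.println("KafkaServer entry present: "
                    + hasEntry(Paths.get(args[0]), "KafkaServer"));
            return;
        }
        // Assumed sample content for a broker using SASL/PLAIN; credentials are made up.
        String sample = "KafkaServer {\n"
                + "  org.apache.kafka.common.security.plain.PlainLoginModule required\n"
                + "  username=\"admin\"\n"
                + "  password=\"admin-secret\"\n"
                + "  user_admin=\"admin-secret\";\n"
                + "};\n";
        System.out.println("KafkaServer entry present: "
                + hasEntryInText(sample, "KafkaServer"));
    }
}
```

Run with the path of the file the broker is supposed to use as the first argument. This only confirms the file content; whether the broker JVM is actually started with -Djava.security.auth.login.config pointing at that file is a separate question.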
Labels: Apache Ambari, Apache Kafka
11-08-2017
10:01 AM
Hello, there is an HDF setup in place (HDF 3.0) and now SASL_PLAINTEXT needs to be added to the Kafka listeners (without Kerberos, just plain SASL). To be able to authenticate, user:password tuples need to be provided in the .jaas file. But this looks very static. How can the end user (who is logged in to NiFi) be used in a Kafka processor to authenticate against Kafka? Is there a possibility, with user-defined properties, to ensure that the current user is used for authenticating against Kafka, or to dynamically decide which .jaas file needs to be used based on the currently logged-in user? Kerberos and SSL are currently not an option, hence I need a solution for SASL_PLAINTEXT 😉
Thanks in advance...
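For a plain Java client (outside NiFi), one way to avoid a static .jaas file is to build the sasl.jaas.config client property per user at runtime. The sketch below only illustrates that idea under stated assumptions: the class SaslPlainClientProps, its methods, and the user "alice" are hypothetical, and it says nothing about whether a NiFi Kafka processor accepts such a property or can map the logged-in NiFi user to it.

```java
import java.util.Properties;

// Hypothetical helper: builds SASL/PLAIN client properties for a given user.
public class SaslPlainClientProps {

    /** Rejects values that would need escaping inside a quoted JAAS option. */
    static String requireSimple(String value) {
        if (value.contains("\"") || value.contains("\\")) {
            throw new IllegalArgumentException("value must not contain quotes or backslashes");
        }
        return value;
    }

    /** Builds the inline JAAS line for the PLAIN login module. */
    static String jaasConfig(String username, String password) {
        return "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"" + requireSimple(username) + "\" "
                + "password=\"" + requireSimple(password) + "\";";
    }

    /** Returns client properties that authenticate as the given user via SASL_PLAINTEXT. */
    static Properties forUser(String bootstrapServers, String username, String password) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config", jaasConfig(username, password));
        return props;
    }

    public static void main(String[] args) {
        // "<brokernode>:6666" is a placeholder; "alice" is a made-up user.
        Properties props = forUser("<brokernode>:6666", "alice", "alice-secret");
        System.out.println(props.getProperty("sasl.jaas.config"));
    }
}
```

The resulting Properties object would still need serializer settings before being passed to a KafkaProducer, and every user built this way must also be known on the broker side.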
10-20-2017
06:33 AM
Hi @Matt Clarke, thanks for your reply. I will dive back into this with the release you mentioned. You're saying "no support of Ranger or LDAP Groups", but support for Ranger is already there, although limited to user-based policies. Or did I misunderstand something here?