Member since: 08-08-2013
Posts: 339
Kudos Received: 132
Solutions: 27

My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 11357 | 01-18-2018 08:38 AM |
| 756 | 05-11-2017 06:50 PM |
| 6710 | 04-28-2017 11:00 AM |
| 2483 | 04-12-2017 01:36 AM |
| 2063 | 02-14-2017 05:11 AM |
02-28-2019
08:25 AM
Hi @Rodrigo Hjort, did you solve this problem, and if so, how?
11-07-2018
08:04 PM
Hi @Michael Bronson, any insights into why you think "retention does not work as it should"? It would also be helpful if you could provide some more details about the usage of your Kafka cluster: is data flooding in steadily, are there heavy spikes which lead to _partition full_, how many producers run in parallel, how many topics + replication, etc.? How did you configure the retention? Regards, Gerd
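PS: to make that last question concrete, this is roughly how you would check which retention is currently in effect (the HDP-style path and the ZooKeeper string are just placeholders, adjust to your environment):
# broker-wide default, set in the broker config (Ambari => Kafka), e.g.
# log.retention.hours=168
# topic-level overrides, if any, show up under "Configs:" here:
/usr/hdp/current/kafka-broker/bin/kafka-configs.sh --zookeeper <zk-host>:2181 --entity-type topics --entity-name <topic> --describe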
06-19-2018
08:49 PM
Hi @SATHIYANARAYANA KUMAR.N , you can keep your pipeline if you need to and write the intermediate output (after each processing step) either with Spark into HDFS again, or via Hive into another table. From what you are describing, it sounds like a huge (and needless) overhead to split your huge files, put them into an RDBMS, grab them from there into AMQ and process them from there... that is way too expensive/complicated/error-prone. Just upload your huge files to HDFS and e.g. create a directory structure which reflects your processing pipeline, like /data/raw, /data/layer1, /data/layer2, ... and put the output of each processing step into it accordingly. HTH, Gerd
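PS: purely as an illustration (directory and file names are made up), the layout and a first upload could look like:
# one directory per processing layer
hdfs dfs -mkdir -p /data/raw /data/layer1 /data/layer2
# land the raw input, then let each processing step write its output into the next layer
hdfs dfs -put /local/path/huge_input.csv /data/raw/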
06-19-2018
08:40 PM
Hi @Harshali Patel , regarding _HDFS_ there is no need to use RAID at all. In addition to @Aayush Kasliwal 's answer, I'd highly recommend configuring NameNode HA to avoid any single point of failure for HDFS. This also ensures that the NameNode metadata is written in multiple copies across the JournalNodes (e.g. you can configure multiple directories, and you should use e.g. 3 JournalNodes). Where I do see RAID as beneficial is for the partitions for OS, logs, ... but of course, this is "below" HDFS. HTH, Gerd
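PS: Ambari's NameNode HA wizard configures all of this for you; just for reference, the resulting hdfs-site.xml properties look roughly like the following (the nameservice name, hostnames and paths are placeholders):
dfs.nameservices=mycluster
dfs.ha.namenodes.mycluster=nn1,nn2
dfs.namenode.shared.edits.dir=qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster
dfs.journalnode.edits.dir=/hadoop/hdfs/journal
dfs.namenode.name.dir=/hadoop/hdfs/namenode1,/hadoop/hdfs/namenode2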
06-05-2018
06:36 PM
Hi @SATHIYANARAYANA KUMAR.N , some details are missing in your post, but as a general answer: if you want to do batch processing of some huuuge files, Kafka is the wrong tool to use. Kafka's strength is managing STREAMING data. Based on your description I am assuming that your use case is bringing huge files to HDFS and processing them afterwards. For that I wouldn't split the files at all, just upload them as a whole (e.g. via WebHDFS). Then you can use tools like Hive/Tez, Spark, ... to process your data (whatever you mean by "process": clean/filter/aggregate/merge/... or, at the end, "analyze" in an SQL-like manner). HTH, Gerd
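PS: a rough sketch of a WebHDFS upload (the usual two-step PUT; hostname, port and path are placeholders, and port 50070 assumes a plain non-HA HDP 2.x NameNode):
# step 1: the NameNode answers with a redirect (Location header) pointing to a DataNode
curl -i -X PUT "http://<namenode>:50070/webhdfs/v1/data/raw/huge_file.csv?op=CREATE&user.name=hdfs"
# step 2: PUT the file content against the Location URL returned in step 1
curl -i -X PUT -T huge_file.csv "<location-url-from-step-1>"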
06-04-2018
06:39 PM
Hi @Robin Dong, there is a slight dependency on your overall cluster layout. If e.g. this node is your single master node of the cluster, then you have to shut down the cluster first (Ambari => Stop all services). But if e.g. this node is one of your workers, you are most probably fine putting that particular node into "Maintenance mode", then performing your maintenance work, starting it up again and finally disabling "Maintenance mode". HTH, Gerd
05-10-2018
06:24 PM
Hi @Mudit Kumar , for adding your users you need to create principals for them in the Kerberos database, e.g. connect to the node where the MIT KDC is running, then:
sudo kadmin.local -q "addprinc <username>"   # replace <username> by your real usernames
That way you are able to grab a valid Kerberos ticket for those 5 users. You can verify this by executing
kinit <username>
which should ask for the corresponding password of that user (!! the password you provided at creation time of the principal above !!), followed by
klist
After grabbing a Kerberos ticket you can start executing commands against the cluster, like "hdfs dfs -ls". If you have enabled authorization as well, you have to add those new users to the ACLs appropriately.
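A small sketch for creating all five principals in one go (the usernames are placeholders; addprinc prompts for a password each time):
for u in user1 user2 user3 user4 user5; do
  sudo kadmin.local -q "addprinc ${u}"
done
# verify for one of them:
kinit user1
klist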
01-18-2018
08:38 AM
1 Kudo
Hi, that indicates your OS user "root" is not the superuser of HDFS (root is just the superuser of the operating system). Try to do the same as user "hdfs" (which by default is the HDFS superuser); as root do:
su - hdfs
hdfs dfsadmin -report
Basically, the HDFS superuser is the user under whose account the NameNode is started. Alternatively you can add the OS user "root" to the group which is set as the HDFS supergroup: check the property dfs.permissions.superusergroup and add "root" to the OS group it points to. HTH, Gerd
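PS: a minimal sketch for the supergroup route, assuming the property points to an OS group named 'hdfs' (check the output of the first command before changing anything):
# which OS group is the HDFS supergroup?
hdfs getconf -confKey dfs.permissions.superusergroup
# add root to that group (re-login afterwards so the new membership takes effect)
usermod -aG hdfs root
# now this should work as root as well
hdfs dfsadmin -report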
12-21-2017
12:14 AM
Hi experts, any feedback/hints/... highly appreciated 😄
11-14-2017
04:06 PM
Hi, after enabling a SASL_PLAINTEXT listener on Kafka it is no longer possible to use the console-consumer/-producer. Whereas a simple Java snippet that creates a producer and adds some messages works fine, using the exact same user/password as for the console clients:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Enter topic name");
            return;
        }
        String topicName = args[0];
        Properties props = new Properties();
        props.put("bootstrap.servers", "<brokernode>:6666");
        props.put("acks", "1");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        ////// AUTHENTICATION
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required\n" +
            "username=\"kafka\"\n" +
            "password=\"kafkaSecure\";");
        ////// END AUTHENTICATION
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        System.out.println("producer created");
        for (int i = 0; i < 10; i++) {
            System.out.println("message" + i);
            producer.send(new ProducerRecord<String, String>(topicName,
                Integer.toString(i), Integer.toString(i)));
        }
        System.out.println("Messages sent successfully");
        producer.close();
    }
}

After starting e.g. the console-producer and trying to add a message, the following message is shown (endlessly):

[2017-11-14 16:48:23,039] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,091] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,143] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-11-14 16:48:23,195] WARN Bootstrap broker <brokernode>:6666 disconnected (org.apache.kafka.clients.NetworkClient)

The Kafka config looks like:

listeners=PLAINTEXT://<brokernode>:6667,SASL_PLAINTEXT://<brokernode>:6666
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
security.inter.broker.protocol=SASL_PLAINTEXT

The console-producer gets started via:

export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/conf/user_kafka_jaas.conf" ; /usr/hdf/current/kafka-broker/bin/kafka-console-producer.sh --broker-list <brokernode>:6666 --topic gk-test --producer.config /etc/kafka/conf/producer.properties

where the property files look like:

/etc/kafka/conf/user_kafka_jaas.conf
KafkaClient {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    username="kafka"
    password="kafkaSecure";
};

/etc/kafka/conf/producer.properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN

Any hint on what is going wrong with the console-producer and console-consumer so that they cannot produce/consume from the topic? ...but the Java snippet works... Thanks
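PS: newer Kafka clients (0.10.2+) also accept the JAAS section inline in the client properties instead of the external file set via KAFKA_OPTS; whether that is available here depends on the HDF/Kafka version, so treat this producer.properties variant only as something to try:
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="kafka" password="kafkaSecure";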
Labels:
- Apache Kafka
11-09-2017
11:22 PM
Hi @Vikasreddy , you should have a look at KafkaConnect (here or here). You can use e.g. the JDBC sink to directly push data from Kafka into an RDBMS. Regards...
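Purely as an illustration (this assumes Confluent's kafka-connect-jdbc sink is installed; all names and the JDBC URL are placeholders), a minimal sink connector config could look like:
name=rdbms-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=my-topic
connection.url=jdbc:postgresql://dbhost:5432/mydb
connection.user=dbuser
connection.password=dbpass
auto.create=true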
11-09-2017
10:36 AM
Bryan, many thanks for your explanation. Do you have any resources/hints regarding "creating a dynamic JAAS file" and what this would look like? ...assuming Kerberos is enabled 😉 ...or do you mean by 'dynamic' the possibility to specify principal & keytab within the Kafka processor? Thanks!
11-08-2017
01:07 PM
Hi, how can I enable Kafka SASL_PLAINTEXT authentication without enabling Kerberos in general?! Right now I added the additional "listener" entry and populated "advanced kafka_jaas_conf" as well as "advanced kafka_client_jaas_conf". After that the Kafka brokers won't start up because of this error:
FATAL [Kafka Server 1001], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: Could not find a 'KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
What else needs to be done to provide the required properties at broker startup and to distribute the .jaas files? It also looks like the .jaas files are not being deployed to the Kafka nodes; they are not under /usr/hdp/current/kafka-broker/config. Is this functionality missing because Kerberos is disabled?! I am sure that after enabling Kerberos the .jaas entries defined in Ambari would be deployed to the nodes, hence there must be some "hidden" functionality missing in non-Kerberos mode... Any help appreciated, thanks in advance...
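For context, a broker-side JAAS file for SASL/PLAIN (no Kerberos) looks roughly like the sketch below: the user_<name>="<password>" entries define the accounts the broker accepts, and the file has to exist on every broker and be referenced via -Djava.security.auth.login.config (e.g. appended to KAFKA_OPTS in kafka-env). Names and passwords are placeholders:
KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafka"
  password="kafkaSecure"
  user_kafka="kafkaSecure"
  user_alice="alice-secret";
};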
Labels:
- Apache Ambari
- Apache Kafka
11-08-2017
10:01 AM
Hello, there is an HDF setup done (HDF 3.0) and now SASL_PLAINTEXT needs to be added to the Kafka listeners (without Kerberos, just plain SASL). To be able to authenticate, user:pw tuples need to be provided in the .jaas file. But this looks very static. How can the end user (who is logged in to NiFi) be used in a Kafka processor to authenticate against Kafka? Is there a possibility, via user-defined properties, to ensure that the current user is used for authenticating against Kafka, or to dynamically decide which .jaas file is used based on the currently logged-in user? Kerberos and SSL are currently not an option, hence I need a solution for SASL_PLAINTEXT 😉 Thanks in advance...
10-20-2017
06:33 AM
Hi @Matt Clarke , thanks for your reply. I will dive back into this with the release you mentioned. You're saying "no support of Ranger or LDAP groups", but support of Ranger is already there, although limited to user-based policies. Or did I misunderstand something here?!
10-19-2017
08:35 AM
Hi, I set up HDF (in particular NiFi & Ranger) to fetch users & groups from AD and authenticate against AD. Defining policies in Ranger for NiFi based on AD users works as expected after logging in to NiFi with AD credentials. The only thing that is not working are the policies which grant access based on AD groups. There is this article from almost a year ago. Does it still apply, @Bryan Bende? Meaning, NiFi policies based on AD group membership do not work? Thanks in advance...
Labels:
- Apache NiFi
- Apache Ranger
08-16-2017
12:46 PM
Hello, I set up HDF 3 including Ranger and Kerberos... everything is green in Ambari so far. The Ranger plugins for Kafka and NiFi have been enabled, and in the Ranger UI I can see that the default policy for Kafka has been created and some audit entries are there, see below. The problem now is that I can list and describe Kafka topics with my user account although it is not allowed by the Ranger ACL, and I do not even see any entry in the audit log for the accesses under my own user account. It looks like the Ranger ACLs don't get applied to Kafka at all, no idea why?! I created a dedicated policy for topic 'foo', just granting my user 'consume' access => in a terminal I can still 'describe' that topic, and in the Ranger audit there is NO entry for that access => Any ideas why access is still allowed and why there is no audit being recorded?! PS: in the Ranger UI the Kafka policy is shown as updated... it refreshes right after being updated.
08-14-2017
08:21 AM
Hi @Lucky_Luke, the script "kafka-topics.sh" with the parameter "--describe" is what you are looking for. To get details for a certain topic, e.g. "test-topic", you would call (adjust the ZooKeeper connect string according to your environment):
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper sandbox.hortonworks.com:2181/kafka --describe --topic test-topic
The output contains (amongst others) the number of partitions, the leading broker for each partition, and the in-sync replicas. The topic-level configuration properties are listed under "Configs:". If this is blank, then the default (broker-wide) settings apply and you should check your broker config file (or the Ambari section) for the property "log.retention.hours" ...assuming you mean the retention time by "TTL". HTH, Gerd
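PS: if you want a topic-level override instead of the broker-wide default, something along these lines should work (the retention value is just an example, 24 hours in ms):
# set retention for 'test-topic' to 24 hours
/usr/hdp/current/kafka-broker/bin/kafka-configs.sh --zookeeper sandbox.hortonworks.com:2181/kafka --entity-type topics --entity-name test-topic --alter --add-config retention.ms=86400000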
08-14-2017
08:02 AM
Hi @Bharadwaj Bhimavarapu , how did you solve the "java.lang.NoSuchMethodError"? I am facing the same in the HDP 2.6 sandbox when trying to start connect-standalone.
08-02-2017
03:00 PM
Hello @Alexandru Anghel , many thanks... works brilliantly!
07-30-2017
07:42 PM
Hello @Avinash Reddy , if the directory is missing, just create it as the first statement after becoming user hdfs (after 'sudo su - hdfs'):
hdfs dfs -mkdir /user/anonymous
...then proceed with 'hdfs dfs -chown ...'
07-29-2017
06:11 PM
Hi, I am setting up HDF 3.0 via blueprint. The installation of the services works nicely, but starting NiFi fails because it expects a password to be provided for decrypting flow.xml.gz under the /nifi directory (calling the encrypt-config tool). Two questions here: 1.) Which properties need to be provided in the blueprint so that NiFi starts successfully without being asked for a password, which obviously cannot be provided interactively? 2.) From where does NiFi populate the subdirectories under /nifi at startup time? Before re-deploying the blueprint I deleted the whole /nifi directory, just to ensure the main error is not caused by some old/previous files, but when starting up NiFi this folder gets recreated incl. subdirectories. Any other hint towards getting the service startup to succeed as well when applying the blueprint is highly appreciated 😉
07-25-2017
06:53 PM
1 Kudo
Hello @Avinash Reddy , assuming you are not using Ranger for defining ACLs, the HDFS commands to change ownership to anonymous are:
# become user 'hdfs'
sudo su - hdfs
# change ownership
hdfs dfs -chown anonymous /user/anonymous
# optional, if you want to restrict the user directory to user 'anonymous' only
hdfs dfs -chmod -R 700 /user/anonymous
Regards, Gerd
07-11-2017
06:27 PM
@Sami Ahmad , no, Ranger does not install Solr for you. What @vperiasamy was referring to is the service "Ambari Infra". This service runs a SolrCloud under the hood, and if you configure Ranger to store audits in SolrCloud, it will by default pick up this SolrCloud instance. To put it in a nutshell: you need the "Ambari Infra" service if you do not want to set up and maintain a dedicated, additional SolrCloud environment. HTH, Gerd
07-11-2017
09:53 AM
1 Kudo
Hi, I wanted to set up Ranger-on-NiFi in the HDF 3.0 sandbox, according to https://community.hortonworks.com/articles/58769/hdf-20-enable-ranger-authorization-for-hdf-compone.html. But after adding Ranger, enabling the NiFi plugin and restarting the required services, this plugin doesn't appear in the Ranger UI; its "Audit" => "Plugin" overview is just empty. The default policy for NiFi has been created, though.
Any ideas what is going wrong?
07-06-2017
07:02 AM
Hi @Robin Dong , if you try standalone mode, there is no configuration via REST at all, hence you do NOT need any curl command to provide the connector config to your worker. In standalone mode you pass the connector config as a second command-line parameter when starting your worker; see here for an example of how to start the standalone setup including the connector config. Maybe it is worth providing both configurations, the standalone worker as well as the distributed one. If you start the distributed worker, you will find the URL of the REST interface at the end of the command-line output. Can you paste that terminal output as well? Do you execute the curl command from the same node where you started the Connect worker, or is it from a remote host where maybe the AWS network/security settings prevent you from talking to the REST interface? Regards, Gerd
07-03-2017
07:03 PM
Hi @Robin Dong , port 8083 is the default port of the KafkaConnect worker if started in distributed mode... which is the case in the URL you are referring to. You can set this port to another one in the properties file you provide as a parameter to the connect-distributed.sh command-line call (the property is called rest.port, see here). In distributed mode you have to use the REST API to configure your connectors, that's the only option. You can of course also start investigating Connect in standalone mode; then you do not need a REST call to configure your connector, you can just provide the connector.properties file as an additional parameter to the connect-standalone.sh script when starting the Connect worker (ref. here). Please try to replace 'localhost' by the FQDN of the host where the Connect worker was started, and of course check whether this start was successful by looking at the listening ports, e.g.
netstat -tulpn | grep 8083
HTH, Gerd --------- If you find this post useful, any vote/reward is highly appreciated 😉
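PS: to illustrate both variants (paths and file names are examples, adjust to your installation):
# standalone: worker properties plus one or more connector property files on the command line
/usr/hdp/current/kafka-broker/bin/connect-standalone.sh /etc/kafka/conf/connect-standalone.properties /etc/kafka/conf/my-connector.properties
# distributed: start the worker, then register the connector via the REST API
/usr/hdp/current/kafka-broker/bin/connect-distributed.sh /etc/kafka/conf/connect-distributed.properties
curl -X POST -H "Content-Type: application/json" --data @my-connector.json "http://<connect-worker-fqdn>:8083/connectors"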
06-27-2017
07:12 AM
1 Kudo
Hi @mel mendoza , maybe it is worth checking out Flume to ingest multiple files into Kafka. Alternatively you can use HDF (particularly NiFi) to do so.
06-23-2017
08:44 AM
Hi @Karan Alang , does your Kafka topic have several partitions (which are then spread across the brokers)? If you ingest data that does not belong to the same partition, you'll see the behaviour you reported, since ordering is only guaranteed within a partition. HTH, Gerd
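PS: if you need strict ordering across all messages, one option is a topic with a single partition (at the cost of parallelism); otherwise, producing related messages with the same key keeps them in one partition and therefore in order. Just as an illustration (topic name and replication factor are placeholders):
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper <zk-host>:2181 --create --topic ordered-topic --partitions 1 --replication-factor 2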
06-23-2017
07:18 AM
1 Kudo
Hello @Adhishankar Nanjundan , just an architectural idea for your approach: what about putting all your metrics into one dedicated topic, and then using Kafka Connect to insert the data from that topic into Elasticsearch, using the Kafka Connect Elasticsearch sink?! Regards, Gerd
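A rough sketch of such an Elasticsearch sink config (this assumes Confluent's kafka-connect-elasticsearch connector is available; topic, URL and names are placeholders):
name=metrics-es-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=metrics
connection.url=http://eshost:9200
type.name=metrics
key.ignore=true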