Member since: 03-01-2016
Posts: 609
Kudos Received: 12
Solutions: 7
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1597 | 02-20-2024 10:42 PM |
 | 1945 | 10-26-2023 05:40 PM |
 | 1261 | 06-13-2023 07:55 PM |
 | 2053 | 04-28-2019 12:21 AM |
 | 1376 | 04-28-2019 12:12 AM |
04-28-2019
12:12 AM
1 Kudo
You can use the Spark action in Oozie to submit any Spark application: https://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SparkActionExtension.html#Spark_Action

If you are more familiar with the spark-submit tool, you can try the Oozie shell action as well: https://archive.cloudera.com/cdh5/cdh/5/oozie/DG_ShellActionExtension.html

You may need to make sure the Spark gateway role is deployed on the Oozie server and NodeManager nodes, so that the runtime environment always has the dependencies available. A rough sketch of the shell-action route is shown below.
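If you go the shell-action route, a minimal sketch of the script the action would invoke might look like this; the class name, jar, queue, and arguments are illustrative, not taken from the post, and the node running it needs the Spark gateway (client configs) deployed:

```bash
#!/bin/bash
# Hypothetical submit.sh called by an Oozie shell action.
# All names below are placeholders; replace with your application's values.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue default \
  --class com.example.MyApp \
  my-spark-app.jar "$@"
```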
04-22-2019
02:13 AM
The error message shows you don't have a valid leader for the partition you are accessing. In Kafka, all reads/writes go through the leader of that partition, so first make sure the topic's partitions have a healthy leader. Run: kafka-topics --describe --zookeeper <zk_url, append the /chroot if you have one> --topic <topic_name>
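For reference, a hedged sketch of what to look for in that output, plus one common follow-up once replicas are back in sync (the preferred-leader election is a general Kafka step, not something mentioned in the post); the ZooKeeper quorum, chroot, and topic name are purely illustrative:

```bash
kafka-topics --describe --zookeeper zk01:2181/kafka --topic my_topic
# A healthy partition line shows a real broker id as Leader and a full Isr, e.g.:
#   Topic: my_topic  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
# Leader: -1 (or "none") means no leader is currently available for that partition.

# Once the in-sync replicas have recovered, a preferred leader election can restore leadership:
kafka-preferred-replica-election --zookeeper zk01:2181/kafka
```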
01-29-2019
08:45 AM
Hello, I am facing the same problem. Could you help me and give me more detail? I would appreciate it. Thanks in advance.
11-11-2018
10:05 PM
You may need to increase:
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mb
The defaults can be too small to launch a default Spark executor container (1024 MB + 512 MB overhead). You may also want to enable INFO logging for the Spark shell to see the exact error or warning it hits: /etc/spark/conf/log4j.properties
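As a rough illustration only (the 4 GiB figures are placeholders, not recommendations, and the sed assumes the log4j template currently sets WARN):

```bash
# In yarn-site.xml (or the equivalent Cloudera Manager fields), illustrative values:
#   yarn.nodemanager.resource.memory-mb   = 4096
#   yarn.scheduler.maximum-allocation-mb  = 4096

# Switch the Spark shell's log level from WARN to INFO (path taken from the post):
sudo sed -i 's/^log4j.rootCategory=WARN/log4j.rootCategory=INFO/' /etc/spark/conf/log4j.properties
```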
09-26-2018
09:26 AM
I posted an issue yesterday that relates to this: the spark-submit classpath seems to conflict with commons-compress from a supplied uber-jar. I've tried the --conf, --jars, and --packages flags with spark-submit with no resolution. Spark 2.x + Tika: java.lang.NoSuchMethodError: org.apache.commons.compress.archivers.ArchiveStreamF Any help would be greatly appreciated!
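For context, a minimal sketch of the kind of submit command being described; the jar and class names are made up, and the userClassPathFirst settings are a commonly suggested workaround for this sort of classpath conflict rather than something confirmed in this thread:

```bash
# Prefer the application's bundled commons-compress over the cluster's copy (experimental Spark settings).
spark-submit \
  --master yarn \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.TikaParseJob \
  tika-uber-jar-with-dependencies.jar /input/docs
```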
07-13-2018
02:20 AM
Hi Yuexin, Thanks for your response. I have gone through all these links and many more while researching this issue, and I have already done all this configuration: setting the JAAS file for both driver and executors, and setting the Kafka SSL settings in the kafkaParams in the program.

$SPARK_HOME/bin/spark-submit \
  --conf spark.yarn.queue=$yarnQueue \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --conf spark.yarn.archive=$sparkYarnArchive \
  $sparkOpts \
  --properties-file $sparkPropertiesFile \
  --files /conf/kafka/kafka_client_jaas_dev.conf,/conf/kafka/krb5_dev.conf,/conf/keystore/kafka_client_truststore.jks,conf/kafka/kafka/kafkausr.keytab \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=kafka_client_jaas_dev.conf -Djava.security.krb5.conf=krb5_dev.conf -Dsun.security.krb5.debug=true" \
  --driver-java-options "-Djava.security.auth.login.config=/conf/kafka_client_jaas_dev.conf -Djava.security.krb5.conf=/conf/krb5_dev.conf" \
  --class com.commerzbank.streams.KafkaHDFSPersister $moduleJar \
  $1 $2 $3 $4 $KafkaParamsconfFile

The problem is that I am running in yarn-client mode (all the links you mentioned talk about yarn-cluster mode), but I have to run in yarn-client mode here due to some project constraints. The issue is this: if I specify the full path of the keystore file (where it is located on the edge node where I run the command) in the 'ssl.truststore.location' parameter, then the executors cannot find the file in their cache, because they look for the complete path + file name while the executor cache only contains a file named 'kafka_client_truststore.jks'. And when I pass the keystore file (kafka_client_truststore.jks) without a path to 'ssl.truststore.location', it fails for the driver, because the driver looks in the current directory on the edge node from where the job is run (if I run the job from the same directory /conf/keystore where this keystore file is present on the edge node, the job succeeds). Is there a way to solve this in your view, or a better way to load the same set of files for the driver and executors running in yarn-client mode? Regards, Hitesh
07-12-2018
11:29 PM
This is usually caused by not having a proper HADOOP or SPARK conf on the node. You need to assign the Spark 2 gateway role to this node and deploy the Spark 2 client configurations, then re-launch spark2-shell.
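A quick sanity check, assuming the default CDH client-config paths: after "Deploy Client Configuration" has run for the host, these directories should exist and contain the *-site.xml / spark-defaults.conf files.

```bash
# Illustrative check only; paths are the usual CDH defaults.
ls /etc/spark2/conf
ls /etc/hadoop/conf
```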
01-24-2018
07:57 AM
Hi Suku, I'll answer some of your questions:

a) Which keytab did you use, the CM-generated keytab or a user keytab generated by you? I used kafka.keytab.

b) Path of your jaas.conf and keytab for Kafka? The kafka.keytab is in /etc/security/keytabs/.

c) How are the Kafka Kerberos configuration parameters set? The following is the Kafka parameter configuration and the way to use the jaas parameter:

Properties props = new Properties();
props.put("bootstrap.servers", "xxxx:9092,xxx:9092");
props.put("client.id", "client-id-coprocessor");
props.put("key.serializer", StringSerializer.class.getName());
props.put("value.serializer", StringSerializer.class.getName());
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("sasl.kerberos.service.name", "kafka");
props.put("sasl.jaas.config",
    "com.sun.security.auth.module.Krb5LoginModule required \n" +
    "useKeyTab=true \n" +
    "storeKey=true \n" +
    "keyTab=\"/etc/security/keytabs/kafka.keytab\" \n" +
    "principal=\"kafka/nodo@REALM\";");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);

Remember that sometimes you will need to restart your HBase service to deploy your coprocessor. I hope this helps. Florentino
12-21-2017
07:36 AM
We have validated the safety valve as well as the Solr URL. It seems that someone had deployed some examples a long time ago and never cleaned them up, so we had old information stored in collections listed in ZooKeeper. We deleted those and ran solrctl init --force, and that seemed to resolve the issue. However, we still see the old dashboard entries in the Search menu in Hue, and they remain even after deleting them from the Hue database. We have an open case for that now.
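Roughly, the cleanup described above amounts to something like the following sketch; the collection name is illustrative, not from the post:

```bash
# Remove the stale example collection from ZooKeeper/Solr, then re-initialize solrctl state.
solrctl collection --delete old_example_collection
solrctl init --force
```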
08-01-2017
11:49 AM
1 Kudo
Glad the issue is resolved, but try to use some other location to store those jars instead of /opt/cloudera/parcels, because you can lose all those jars when you upgrade.