<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Oryx2 Kafka Broker Issue in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53997#M18581</link>
    <description>&lt;P&gt;No problem, I'll get around to testing that out on Monday.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 21 Apr 2017 15:46:55 GMT</pubDate>
    <dc:creator>olabhrad</dc:creator>
    <dc:date>2017-04-21T15:46:55Z</dc:date>
    <item>
      <title>Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53810#M18564</link>
      <description>&lt;P&gt;Hi, I'm attempting to get the ALS example in Oryx2 up and running using an AWS EMR cluster.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I've been using the requirements matrix here:&amp;nbsp;&lt;A href="http://oryx.io/docs/admin.html" target="_blank"&gt;http://oryx.io/docs/admin.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;(Aside: I see Oryx 2.4.x in the matrix but don't see a 2.4.x release here? &lt;A href="https://github.com/OryxProject/oryx/releases" target="_blank"&gt;https://github.com/OryxProject/oryx/releases&lt;/A&gt;)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My Kafka/Zookeeper hosts are&amp;nbsp;using the following versions:&lt;/P&gt;&lt;P&gt;Oryx 2.3.0&lt;BR /&gt;kafka_2.11-0.9.0.1&lt;/P&gt;&lt;P&gt;spark-2.0.1-bin-hadoop2.7&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can run&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;./oryx-run.sh kafka-setup&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;and create the&amp;nbsp;topics successfully.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can also run&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;./oryx-run.sh kafka-input --input-file data.csv&lt;/PRE&gt;&lt;P&gt;and see the messages being added to the topic.&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;My cluster&amp;nbsp;is using the following versions:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Hadoop 2.7.3&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Spark 2.0.1&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In addition, I've also included the following jars on the cluster (master and slave nodes):&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;kafka_2.11-0.9.0.1/libs/*&lt;/P&gt;&lt;P&gt;spark-streaming-kafka-0-8_2.11-2.0.1.jar&lt;/P&gt;&lt;P&gt;spark-core_2.11-2.0.1.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;According to here:&lt;BR /&gt;&lt;A href="https://spark.apache.org/docs/2.0.1/streaming-kafka-integration.html" 
target="_blank"&gt;https://spark.apache.org/docs/2.0.1/streaming-kafka-integration.html&lt;/A&gt;&lt;BR /&gt;&lt;SPAN&gt;spark-streaming-kafka-0-8 is suitable for broker version "0.8.2.1 or higher".&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I have adapted&amp;nbsp;the spark-submit launch for the Batch layer into an EMR "step". The only difference I've had to make is to change&amp;nbsp;"deploy-mode" from client to cluster.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;aws emr add-steps --cluster-id XXX --steps Type=spark,Name=OryxBatchLayer-ALSExample,Args=[--master,yarn,--deploy-mode,cluster,--name,OryxBatchLayer-ALSExample,--class,com.cloudera.oryx.batch.Main,--files,s3://mybucket/oryx.conf,--driver-memory,1g,--driver-java-options,"-Dconfig.file=oryx.conf",--executor-memory,4g,--executor-cores,8,--conf,spark.executor.extraJavaOptions="-Dconfig.file=oryx.conf",--conf,spark.ui.port=4040,--conf,spark.io.compression.codec=lzf,--conf,spark.logConf=true,--conf,spark.serializer=org.apache.spark.serializer.KryoSerializer,--conf,spark.speculation=true,--conf,spark.ui.showConsoleProgress=false,--num-executors=4,s3://mybucket/oryx-batch-2.3.0.jar],ActionOnFailure=CONTINUE&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I then see the following error in the container:&lt;/P&gt;&lt;PRE&gt;17/04/18 14:15:00 ERROR JobScheduler: Error generating jobs for time 1492524900000 ms
java.lang.ClassCastException: kafka.cluster.BrokerEndPoint cannot be cast to kafka.cluster.Broker
	at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6$$anonfun$apply$7.apply(KafkaCluster.scala:97)
	at scala.Option.map(Option.scala:146)
	at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6.apply(KafkaCluster.scala:97)
	at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$2$$anonfun$3$$anonfun$apply$6.apply(KafkaCluster.scala:94)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)&lt;/PRE&gt;&lt;P&gt;Apparently this is caused by a mismatch with Kafka/Spark versions, but from what I can see I have followed the recommendations. Any ideas?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:28:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53810#M18564</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2022-09-16T11:28:26Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53823#M18565</link>
      <description>&lt;P&gt;First, I'll tell you that this is&amp;nbsp;Quite Complicated and confuses me too. Matching Spark and Kafka versions is tricky, exacerbated by multiple and incompatible Kafka APIs, multiplied by slight differences in which versions ship in which CDH package.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(Yes, there is no 2.4 release yet; I put it in the matrix as a 'preview'.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I recall that I am actually&amp;nbsp;&lt;EM&gt;not&lt;/EM&gt; able to get the 2.3 release to pass tests with upstream components, which is why it builds with the CDH profile enabled by default. I wanted to move on to Kafka 0.9 to enable security support, but that is only supported by Spark 2.x's kafka-0_10 integration component. And that wasn't yet available for CDH because it didn't work with CDH's Kafka 0.9. The kafka-0_8 component did work, but then that component didn't work when enabled with the standard Spark 2 distro. This is a nightmarish no-man's-land of version combos.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, master (2.4) should be clearer since it moves on to Kafka 0.10.&amp;nbsp;It does work, or at least passes tests, against Spark 2.1 and Kafka 0.10. In fact, I have a to-do to update the CDH dependencies as well to get it working in master.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So: if you're making your own build for non-CDH components, can you try building 2.4.0-SNAPSHOT from master? If that's working I can hurry up getting the CDH part updated so we can cut a 2.4.0 release.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Apr 2017 17:36:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53823#M18565</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-18T17:36:33Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53850#M18566</link>
      <description>&lt;P&gt;Hi srowen, thanks for the response.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Below is my updated config:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;My Kafka/Zookeeper hosts are using the following versions:
Oryx 2.3.0
kafka_2.11-0.10.0.0
spark-2.0.1-bin-hadoop2.7


My cluster is using the following versions:
Hadoop 2.7.3
Spark 2.0.1
 
In addition, I've also included the following jars on the cluster (master and slave nodes):
kafka_2.11-0.10.0.0/libs/*
spark-streaming-kafka-0-10_2.11-2.0.1.jar
spark-core_2.11-2.0.1.jar&lt;/PRE&gt;&lt;P&gt;I've built&amp;nbsp;oryx-batch-2.4.0-SNAPSHOT.jar and&amp;nbsp;&lt;SPAN&gt;oryx-serving-2.4.0-SNAPSHOT.jar&amp;nbsp;&lt;/SPAN&gt;from master. The batch layer&amp;nbsp;appears to be running now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to confirm that it is indeed working by running the serving layer ingestion example.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;wget --quiet --post-file data.csv --output-document -   --header "Content-Type: text/csv"   http://localhost:8080/ingest&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've taken guidance from this post to update the runtime dependencies for the serving layer. (Updated&amp;nbsp;spark-streaming-kafka-0-8 to&amp;nbsp;spark-streaming-kafka-0-10).&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;A href="https://github.com/OryxProject/oryx/issues/265" target="_blank"&gt;https://github.com/OryxProject/oryx/issues/265&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;However I see two issues with the serving layer. On startup I see the following message repeatedly:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;2017-04-19 09:11:50,199 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 09:11:50,401 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;I'm not sure what this means exactly, but apparently it might be something to do with SSL?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I try the ingest step, I get the following error on the serving layer:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;2017-04-19 09:13:22,271 INFO  OryxApplication:65 Creating JAX-RS from endpoints in package(s) com.cloudera.oryx.app.serving,com.cloudera.oryx.app.serving.als,com.cloudera.oryx.lambda.serving
2017-04-19 09:13:22,437 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 09:13:22,497 INFO  Reflections:232 Reflections took 225 ms to scan 1 urls, producing 17 keys and 106 values
Apr 19, 2017 9:13:22 AM org.apache.catalina.core.ApplicationContext log
SEVERE: StandardWrapper.Throwable
java.lang.NoSuchMethodError: com.google.common.collect.Sets$SetView.iterator()Lcom/google/common/collect/UnmodifiableIterator;
	at org.reflections.Reflections.expandSuperTypes(Reflections.java:380)
	at org.reflections.Reflections.&amp;lt;init&amp;gt;(Reflections.java:126)
	at org.reflections.Reflections.&amp;lt;init&amp;gt;(Reflections.java:168)
	at org.reflections.Reflections.&amp;lt;init&amp;gt;(Reflections.java:141)
	at com.cloudera.oryx.lambda.serving.OryxApplication.doGetClasses(OryxApplication.java:69)
	at com.cloudera.oryx.lambda.serving.OryxApplication.getClasses(OryxApplication.java:57)
	at org.glassfish.jersey.server.ResourceConfig$RuntimeConfig$3.run(ResourceConfig.java:1234)
	at org.glassfish.jersey.internal.Errors$2.call(Errors.java:289)&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This apparently has something to do with the Guava versions on the classpath, although from what I can see, the new runtime dependencies I have added use the same Guava version as the existing dependencies.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2017 13:19:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53850#M18566</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-19T13:19:50Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53851#M18567</link>
      <description>&lt;P&gt;You should probably use 2.4.0-SNAPSHOT, but, also use Spark 2.1.0 rather than 2.0.x&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The last error is one I just managed to find and fix today when I produced some other updates. Try again with the very latest code.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2017 13:24:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53851#M18567</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-19T13:24:21Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53859#M18568</link>
      <description>&lt;P&gt;I've pulled and rebuilt&amp;nbsp;&lt;SPAN&gt;2.4.0-SNAPSHOT.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;Kafka/Zookeeper hosts are using oryx-run.sh and oryx.conf (ALS example)&amp;nbsp;&lt;SPAN&gt;and have been updated to use Spark 2.1.0. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I've updated the cluster to use&amp;nbsp;Spark 2.1.0 also.&lt;BR /&gt;&lt;BR /&gt;Batch layer is running again.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Serving layer Guava issue is resolved but I am still seeing the broker issue:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here's some output from when the serving layer starts up:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;2017-04-19 10:45:31,496 INFO  ConsumerConfig:180 ConsumerConfig values:
	auto.commit.interval.ms = 5000
	auto.offset.reset = earliest
	bootstrap.servers = [&amp;lt;kafka-host&amp;gt;:9092]
	check.crcs = true
	client.id = consumer-1
	connections.max.idle.ms = 540000
	enable.auto.commit = true
	exclude.internal.topics = true
	fetch.max.bytes = 52428800
	fetch.max.wait.ms = 500
	fetch.min.bytes = 1
	group.id = OryxGroup-ServingLayer-1545f926-8dd9-499d-aaff-1e3a709c0645
	heartbeat.interval.ms = 3000
	interceptor.classes = null
	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
	max.partition.fetch.bytes = 16777216
	max.poll.interval.ms = 300000
	max.poll.records = 500
	metadata.max.age.ms = 300000
	metric.reporters = []
	metrics.num.samples = 2
	metrics.sample.window.ms = 30000
	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
	receive.buffer.bytes = 65536
	reconnect.backoff.ms = 50
	request.timeout.ms = 305000
	retry.backoff.ms = 100
	sasl.kerberos.kinit.cmd = /usr/bin/kinit
	sasl.kerberos.min.time.before.relogin = 60000
	sasl.kerberos.service.name = null
	sasl.kerberos.ticket.renew.jitter = 0.05
	sasl.kerberos.ticket.renew.window.factor = 0.8
	sasl.mechanism = GSSAPI
	security.protocol = PLAINTEXT
	send.buffer.bytes = 131072
	session.timeout.ms = 10000
	ssl.cipher.suites = null
	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
	ssl.endpoint.identification.algorithm = null
	ssl.key.password = null
	ssl.keymanager.algorithm = SunX509
	ssl.keystore.location = null
	ssl.keystore.password = null
	ssl.keystore.type = JKS
	ssl.protocol = TLS
	ssl.provider = null
	ssl.secure.random.implementation = null
	ssl.trustmanager.algorithm = PKIX
	ssl.truststore.location = null
	ssl.truststore.password = null
	ssl.truststore.type = JKS
	value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer&lt;/PRE&gt;&lt;PRE&gt;2017-04-19 10:45:54,297 INFO  ProducerConfig:180 ProducerConfig values:
	acks = 1
	batch.size = 16384
	block.on.buffer.full = false
	bootstrap.servers = [&amp;lt;kafka-host&amp;gt;:9092]
	buffer.memory = 33554432
	client.id = producer-1
	compression.type = gzip
	connections.max.idle.ms = 540000
	interceptor.classes = null
	key.serializer = class org.apache.kafka.common.serialization.StringSerializer
	linger.ms = 1000
	max.block.ms = 60000
	max.in.flight.requests.per.connection = 5
	max.request.size = 67108864
	metadata.fetch.timeout.ms = 60000
	metadata.max.age.ms = 300000
	metric.reporters = []
	metrics.num.samples = 2
	metrics.sample.window.ms = 30000
	partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
	receive.buffer.bytes = 32768
	reconnect.backoff.ms = 50
	request.timeout.ms = 30000
	retries = 0
	retry.backoff.ms = 100
	sasl.kerberos.kinit.cmd = /usr/bin/kinit
	sasl.kerberos.min.time.before.relogin = 60000
	sasl.kerberos.service.name = null
	sasl.kerberos.ticket.renew.jitter = 0.05
	sasl.kerberos.ticket.renew.window.factor = 0.8
	sasl.mechanism = GSSAPI
	security.protocol = PLAINTEXT
	send.buffer.bytes = 131072
	ssl.cipher.suites = null
	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
	ssl.endpoint.identification.algorithm = null
	ssl.key.password = null
	ssl.keymanager.algorithm = SunX509
	ssl.keystore.location = null
	ssl.keystore.password = null
	ssl.keystore.type = JKS
	ssl.protocol = TLS
	ssl.provider = null
	ssl.secure.random.implementation = null
	ssl.trustmanager.algorithm = PKIX
	ssl.truststore.location = null
	ssl.truststore.password = null
	ssl.truststore.type = JKS
	timeout.ms = 30000
	value.serializer = class org.apache.kafka.common.serialization.StringSerializer&lt;/PRE&gt;&lt;P&gt;When I run the ingest call, nothing seems to happen. I don't see any extra output on the serving layer or on the topics.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2017 15:28:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53859#M18568</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-19T15:28:21Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53860#M18569</link>
      <description>&lt;P&gt;Correction to my last post:&lt;BR /&gt;&lt;BR /&gt;The&amp;nbsp;ProducerConfig only shows in the serving layer output once the ingest is invoked.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I also see the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;Apr 19, 2017 12:04:01 PM org.apache.catalina.util.SessionIdGeneratorBase createSecureRandom
INFO: Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [34,430] milliseconds.
2017-04-19 12:04:01,939 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:02,040 INFO  OryxApplication:65 Creating JAX-RS from endpoints in package(s) com.cloudera.oryx.app.serving,com.cloudera.oryx.app.serving.als,com.cloudera.oryx.lambda.serving
2017-04-19 12:04:02,148 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:02,354 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:02,413 INFO  Reflections:229 Reflections took 345 ms to scan 1 urls, producing 17 keys and 106 values
2017-04-19 12:04:02,572 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:02,681 INFO  Reflections:229 Reflections took 219 ms to scan 1 urls, producing 10 keys and 64 values
2017-04-19 12:04:02,781 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:02,831 INFO  Reflections:229 Reflections took 145 ms to scan 1 urls, producing 13 keys and 14 values
2017-04-19 12:04:02,998 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:03,216 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
2017-04-19 12:04:03,439 WARN  NetworkClient:568 Bootstrap broker &amp;lt;kafka-host&amp;gt;:9092 disconnected
Apr 19, 2017 12:04:03 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-nio2-8080"]&lt;/PRE&gt;</description>
      <pubDate>Wed, 19 Apr 2017 16:06:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53860#M18569</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-19T16:06:37Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53866#M18570</link>
      <description>&lt;P&gt;I solved this!&lt;BR /&gt;&lt;BR /&gt;The issue was that my kafka/zookeeper hosts and my cluster were using&amp;nbsp;kafka_2.11-0.10.0.0&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;2.4.0-SNAPSHOT is using&amp;nbsp;0.10.1.1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've updated to use&amp;nbsp;&lt;SPAN&gt;kafka_2.11-0.10.1.1&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2017 16:44:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53866#M18570</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-19T16:44:29Z</dc:date>
    </item>
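The resolution above (brokers on kafka_2.11-0.10.0.0, client libraries from 2.4.0-SNAPSHOT built against 0.10.1.1) can be sketched as a simple version-line check. This is an illustration only: the helper names and the "match the first three components" rule are my own simplification distilled from this thread, not an official Kafka compatibility rule.

```python
def kafka_line(version):
    """First three components of a Kafka version: '0.10.1.1' yields '0.10.1'."""
    return ".".join(version.split(".")[:3])

def same_protocol_line(client_version, broker_version):
    """True when client and broker sit on the same Kafka version line."""
    return kafka_line(client_version) == kafka_line(broker_version)

# The mismatch behind the repeated 'Bootstrap broker disconnected' warnings:
# client libs from 0.10.1.1, brokers running 0.10.0.0.
print(same_protocol_line("0.10.1.1", "0.10.0.0"))  # False
print(same_protocol_line("0.10.1.1", "0.10.1.1"))  # True
```

Aligning both sides on kafka_2.11-0.10.1.1, as the post above does, makes the check pass.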
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53872#M18571</link>
      <description>&lt;P&gt;Hm, I don't recall seeing the 'disconnected' message. Is there more detail?&lt;/P&gt;&lt;P&gt;On its face it seems like the serving layer can't see the broker. Do some ports need to be opened?&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2017 17:55:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53872#M18571</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-19T17:55:26Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53912#M18572</link>
      <description>&lt;P&gt;I mentioned above the reason for the&amp;nbsp;&lt;SPAN&gt;'disconnected' message. It is resolved now. There was no more detail but I found this similar issue which had me look into the kafka versions:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://stackoverflow.com/questions/42851834/apache-kafka-producer-networkclient-broker-server-disconnected" target="_blank"&gt;http://stackoverflow.com/questions/42851834/apache-kafka-producer-networkclient-broker-server-disconnected&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 09:36:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53912#M18572</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-20T09:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53915#M18573</link>
      <description>&lt;P&gt;OK, is it largely working then? If it looks like the app is running, then I'll move to test 2.4 on my cluster too and if it looks good, go ahead and cut a release.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 10:09:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53915#M18573</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-20T10:09:11Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53916#M18574</link>
      <description>&lt;P&gt;The batch and serving layers look good. I'm testing out the speed layer now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here's a question. As per the&amp;nbsp;architectural description (&lt;A href="http://oryx.io/index.html" target="_blank"&gt;http://oryx.io/index.html&lt;/A&gt;), I see that historical data is stored in HDFS by the batch layer. The speed and serving layers only seem to interact with the kafka topics for input and updates.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So my question is: is it acceptable for the batch and speed layers to actually run on separate hadoop clusters? They are configured to use the same kafka brokers. The reason I ask is that AWS EMR clusters only allow adding "steps" that run in sequential order, so my speed layer would never actually be launched on the batch layer cluster unless the batch layer spark job was killed or stopped.&lt;BR /&gt;&lt;BR /&gt;Also, does the serving layer interact with HDFS in some way? I see that the hadoop dependencies are needed for the jar, but I was under the impression it only interacted with the kafka topics.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 10:23:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53916#M18574</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-20T10:23:22Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53917#M18575</link>
      <description>&lt;P&gt;Yes, they're all only coupled by Kafka, so you could run these layers quite separately, except that they need to share the brokers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It probably won't fit EMR's model, as both should run concurrently and continuously. I'm not sure if it can help you with a shared Kafka either.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Obviously it's also an option to run CDH on AWS if you want to try that.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The serving layer does not _generally_ use HDFS unless the model is so big that Kafka can't represent parts of it; then it will write to and read from HDFS. This really isn't great, but it's the best I could do for now for really large models. It's something that could be improved at some point, I hope. If you tune Kafka to allow very large models, you can get away without HDFS access.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 10:26:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53917#M18575</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-20T10:26:55Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53923#M18576</link>
      <description>&lt;P&gt;I can confirm that the speed layer is working as expected now.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My current setup is that I have two separate EC2 instances and two separate EMR clusters. One EC2 instance is running Kafka, while the other is running Zookeeper and an instance of the Serving layer. (This serving layer currently doesn't have any access to HDFS, so I will probably&amp;nbsp;see some errors if the models get too large.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One EMR cluster is running the Batch layer and the other is running the Speed layer. This approach seems to be working fine for now. My Batch layer is currently even writing the output to S3. I didn't expect this to work straight out of the box, but it did. I just updated the oryx.conf and I guess Amazon's implementation of HDFS (EMRFS) takes care of the rest.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;hdfs-base = "s3://mybucket/Oryx"&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;Do you see any issues with this setup? (Apart from the Serving layer and HDFS access)&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 13:26:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53923#M18576</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-20T13:26:20Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53925#M18577</link>
      <description>&lt;P&gt;Although I haven't tested anything like that, it's just using really standard APIs in straightforward ways, so, I'm not surprised if S3 just works because HDFS can read/write S3 OK. I know there are some gotchas with actually using S3 as intermediate storage in Spark jobs, but I think your EMR jobs are using local HDFS for that.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 13:38:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53925#M18577</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-20T13:38:50Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53932#M18578</link>
      <description>&lt;P&gt;I just spotted an issue with using S3.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;java.lang.IllegalArgumentException: Wrong FS: s3://mybucket/Oryx/data/oryx-1492697400000.data, expected: hdfs://&amp;lt;master-node&amp;gt;:8020
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:653)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1430)
	at com.cloudera.oryx.lambda.batch.SaveToHDFSFunction.call(SaveToHDFSFunction.java:71)
	at com.cloudera.oryx.lambda.batch.SaveToHDFSFunction.call(SaveToHDFSFunction.java:35)&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;It's strange that it worked a few times and then crashed on this attempt.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Anyway, following guidance from this post:&lt;BR /&gt;&lt;A href="https://forums.aws.amazon.com/thread.jspa?threadID=30945" target="_blank"&gt;https://forums.aws.amazon.com/thread.jspa?threadID=30945&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I updated&amp;nbsp;com.cloudera.oryx.lambda.batch.SaveToHDFSFunction&lt;BR /&gt;&lt;BR /&gt;from:&lt;/P&gt;&lt;PRE&gt;FileSystem fs = FileSystem.get(hadoopConf);&lt;/PRE&gt;&lt;P&gt;to&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;FileSystem fs = FileSystem.get(path.toUri(), hadoopConf);&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;This seems to have done the trick.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 15:12:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53932#M18578</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-20T15:12:22Z</dc:date>
    </item>
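A minimal, self-contained sketch of why the "Wrong FS" error above occurs and why the `FileSystem.get(path.toUri(), hadoopConf)` change resolves it. This uses only `java.net.URI`, not Hadoop itself, and the class and method names are hypothetical: `FileSystem.get(conf)` returns the filesystem bound to `fs.defaultFS` (here `hdfs://`), whose `checkPath` rejects any path with a different scheme, while the two-argument form selects a filesystem matching the path's own scheme.

```java
import java.net.URI;

// Hypothetical reduction of Hadoop's FileSystem.checkPath scheme check.
public class WrongFsDemo {
    // A path is accepted if it has no scheme (relative to the FS)
    // or its scheme matches the filesystem's own URI scheme.
    public static boolean accepts(URI fsUri, URI path) {
        String pathScheme = path.getScheme();
        return pathScheme == null || pathScheme.equals(fsUri.getScheme());
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://master-node:8020"); // fs.defaultFS
        URI s3Path = URI.create("s3://mybucket/Oryx/data/file.data");

        // FileSystem.get(conf): bound to hdfs://, rejects the s3:// path
        System.out.println(accepts(defaultFs, s3Path));

        // FileSystem.get(path.toUri(), conf): FS derived from the path's scheme
        URI pathFs = URI.create(s3Path.getScheme() + "://" + s3Path.getAuthority());
        System.out.println(accepts(pathFs, s3Path));
    }
}
```

In the first case `accepts` returns false (the analogue of the `IllegalArgumentException` in the stack trace); in the second it returns true, because the filesystem is chosen from the path rather than from the default configuration.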
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53933#M18579</link>
      <description>&lt;P&gt;Yes, good catch. I'll track that at&amp;nbsp;&lt;A href="https://github.com/OryxProject/oryx/issues/329" target="_blank"&gt;https://github.com/OryxProject/oryx/issues/329&lt;/A&gt; and fix it in a few minutes.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Apr 2017 15:18:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53933#M18579</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-20T15:18:04Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53996#M18580</link>
      <description>&lt;P&gt;Oh, now I see the same 'disconnected' problem you did.&lt;/P&gt;&lt;P&gt;It turns out that Kafka 0.10.0 and 0.10.1 are not protocol-compatible, which is quite disappointing.&lt;/P&gt;&lt;P&gt;So I think I'm going to have to back up and revert master/2.4 to Kafka 0.10.0, because that's the version CDH is on, and I'd like to avoid maintaining two builds to support 0.10.0 vs 0.10.1. I hope switching back isn't a big deal for your prototype?&lt;/P&gt;</description>
      <pubDate>Fri, 21 Apr 2017 15:16:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53996#M18580</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-21T15:16:33Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53997#M18581</link>
      <description>&lt;P&gt;No problem, I'll get around to testing that out on Monday.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Apr 2017 15:46:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/53997#M18581</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-21T15:46:55Z</dc:date>
    </item>
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/54041#M18582</link>
      <description>&lt;P&gt;Hi, I can confirm my setup is working with 0.10.0.0.&lt;BR /&gt;&lt;BR /&gt;I noticed one issue in the serving layer output (I noticed this before I made the change, so it is not new).&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;2017-04-24 16:26:54,052 INFO  ALSServingModelManager:96 ALSServingModel[features:10, implicit:true, X:(877 users), Y:(1639 items, partitions: [0:1296, 1:343]...), fractionLoaded:1.0]
2017-04-24 16:26:54,053 INFO  SolverCache:78 Computing cached solver
2017-04-24 16:26:54,111 INFO  SolverCache:83 Computed new solver null&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;Should there be a null value in this last message output?&lt;BR /&gt;&lt;BR /&gt;Here's the code in question, from com.cloudera.oryx.app.als.SolverCache:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;          if (newYTYSolver != null) {
            log.info("Computed new solver {}", solver);
            solver.set(newYTYSolver);
          }&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Apr 2017 16:34:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/54041#M18582</guid>
      <dc:creator>olabhrad</dc:creator>
      <dc:date>2017-04-24T16:34:41Z</dc:date>
    </item>
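A hypothetical reduction of the SolverCache snippet quoted above (names borrowed from the post; this is not the actual Oryx class) showing why the message prints null: the log statement passes `solver`, the still-empty holder, instead of the freshly computed `newYTYSolver`.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SolverLogDemo {
    // Returns both log lines: the buggy one, which reads the holder before
    // set() is called, and the corrected one, which logs the computed value.
    public static String[] logMessages(String newYTYSolver) {
        AtomicReference<String> solver = new AtomicReference<>();
        String buggy = null, fixed = null;
        if (newYTYSolver != null) {
            // Bug: solver still holds null here, hence "Computed new solver null"
            buggy = "Computed new solver " + solver.get();
            solver.set(newYTYSolver);
            // Fix: reference the value that was actually computed
            fixed = "Computed new solver " + newYTYSolver;
        }
        return new String[] {buggy, fixed};
    }

    public static void main(String[] args) {
        for (String line : logMessages("Solver@1a2b")) {
            System.out.println(line);
        }
    }
}
```

As the follow-up reply notes, this is purely a logging cosmetic issue: the solver itself is set correctly; only the message reports the wrong variable.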
    <item>
      <title>Re: Oryx2 Kafka Broker Issue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/54042#M18583</link>
      <description>&lt;P&gt;(BTW, I went ahead and made a 2.4.0 release to have something official and probably-working out there. It worked on my CDH 5.11 + Spark 2.1 + Kafka 0.10.0 cluster.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, that's a minor problem in the log message. It should reference newYTYSolver. I'll fix that, but it shouldn't otherwise affect anything.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Apr 2017 16:40:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Oryx2-Kafka-Broker-Issue/m-p/54042#M18583</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2017-04-24T16:40:17Z</dc:date>
    </item>
  </channel>
</rss>