Created on 03-02-2018 04:05 AM - edited 09-16-2022 05:55 AM
I cannot properly connect to Kudu from Spark, error says "Kudu master has no leader"
When I use Impala in HUE to create and query kudu tables, it works flawlessly.
However, connecting from Spark throws some errors I cannot decipher.
I have tried using both pyspark and spark-shell. With spark shell I had to use spark 1.6 instead of 2.2 because some maven dependencies problems, that I have localized but not been able to fix. More info here.
Case 1: using pyspark2 (Spark 2.2.0)
$ pyspark2 --master yarn --jars /opt/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/kudu/kudu-spark2_2.11.jar
> df = sqlContext.read.format('org.apache.kudu.spark.kudu').options(**{"kudu.master":"172.17.0.43:7077", "kudu.table":"impala::default.test"}).load()
18/03/02 10:23:27 WARN client.ConnectToCluster: Error receiving response from 172.17.0.43:7077 org.apache.kudu.client.RecoverableException: [peer master-172.17.0.43:7077] encountered a read timeout; closing the channel at org.apache.kudu.client.Connection.exceptionCaught(Connection.java:412) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112) at org.apache.kudu.client.Connection.handleUpstream(Connection.java:239) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.exceptionCaught(SimpleChannelUpstreamHandler.java:153) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.apache.kudu.shaded.org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:536) at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:236) at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler$ReadTimeoutTask$1.run(ReadTimeoutHandler.java:276) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.apache.kudu.shaded.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.apache.kudu.shaded.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutException at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler.<clinit>(ReadTimeoutHandler.java:84) at org.apache.kudu.client.Connection$ConnectionPipeline.init(Connection.java:782) at org.apache.kudu.client.Connection.<init>(Connection.java:199) at org.apache.kudu.client.ConnectionCache.getConnection(ConnectionCache.java:133) at org.apache.kudu.client.AsyncKuduClient.newRpcProxy(AsyncKuduClient.java:248) at org.apache.kudu.client.AsyncKuduClient.newMasterRpcProxy(AsyncKuduClient.java:272) at org.apache.kudu.client.ConnectToCluster.run(ConnectToCluster.java:157) at org.apache.kudu.client.AsyncKuduClient.getMasterTableLocationsPB(AsyncKuduClient.java:1350) at org.apache.kudu.client.AsyncKuduClient.exportAuthenticationCredentials(AsyncKuduClient.java:651) at org.apache.kudu.client.KuduClient.exportAuthenticationCredentials(KuduClient.java:293) at org.apache.kudu.spark.kudu.KuduContext$$anon$1.run(KuduContext.scala:97) at org.apache.kudu.spark.kudu.KuduContext$$anon$1.run(KuduContext.scala:96) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.kudu.spark.kudu.KuduContext.<init>(KuduContext.scala:96) at org.apache.kudu.spark.kudu.KuduRelation.<init>(DefaultSource.scala:162) at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:75) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) ... 1 more 18/03/02 10:23:27 WARN client.ConnectToCluster: Unable to find the leader master 172.17.0.43:7077; will retry Py4JJavaError Traceback (most recent call last) <ipython-input-1-e1dfaec7a544> in <module>() ----> 1 df = sqlContext.read.format('org.apache.kudu.spark.kudu').options(**{"kudu.master":"172.17.0.43:7077", "kudu.table":"impala::default.logika_dataset_kudu"}).load() /opt/cloudera/parcels/SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957/lib/spark2/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options) 163 return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path))) 164 else: --> 165 return self._df(self._jreader.load()) 166 167 @since(1.4) /opt/cloudera/parcels/SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args) 1131 answer = self.gateway_client.send_command(command) 1132 return_value = get_return_value( -> 1133 answer, self.gateway_client, self.target_id, self.name) 1134 1135 for temp_arg in temp_args: /opt/cloudera/parcels/SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957/lib/spark2/python/pyspark/sql/utils.py in deco(*a, **kw) 61 def deco(*a, **kw): 62 try: ---> 63 return f(*a, **kw) 64 except py4j.protocol.Py4JJavaError as e: 65 s = e.java_exception.toString() /opt/cloudera/parcels/SPARK2-2.2.0.cloudera2-1.cdh5.12.0.p0.232957/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name) 317 raise Py4JJavaError( 318 "An error occurred while calling {0}{1}{2}.\n". --> 319 format(target_id, ".", name), value) 320 else: 321 raise Py4JError( Py4JJavaError: An error occurred while calling o59.load. : java.security.PrivilegedActionException: org.apache.kudu.client.NoLeaderFoundException: Master config (172.17.0.43:7077) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-172.17.0.43:7077] encountered a read timeout; closing the channel at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.kudu.spark.kudu.KuduContext.<init>(KuduContext.scala:96) at org.apache.kudu.spark.kudu.KuduRelation.<init>(DefaultSource.scala:162) at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:75) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.kudu.client.NoLeaderFoundException: Master config (172.17.0.43:7077) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [peer master-172.17.0.43:7077] encountered a read timeout; closing the channel at org.apache.kudu.client.ConnectToCluster.incrementCountAndCheckExhausted(ConnectToCluster.java:272) at org.apache.kudu.client.ConnectToCluster.access$100(ConnectToCluster.java:49) at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:349) at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:338) at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280) at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259) at com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1315) at com.stumbleupon.async.Deferred.doCall(Deferred.java:1286) at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259) at com.stumbleupon.async.Deferred.callback(Deferred.java:1002) at org.apache.kudu.client.KuduRpc.handleCallback(KuduRpc.java:238) at org.apache.kudu.client.KuduRpc.errback(KuduRpc.java:292) at org.apache.kudu.client.RpcProxy.failOrRetryRpc(RpcProxy.java:388) at org.apache.kudu.client.RpcProxy.responseReceived(RpcProxy.java:217) at org.apache.kudu.client.RpcProxy.access$000(RpcProxy.java:60) at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:132) at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:128) at org.apache.kudu.client.Connection.cleanup(Connection.java:694) at org.apache.kudu.client.Connection.exceptionCaught(Connection.java:439) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112) at org.apache.kudu.client.Connection.handleUpstream(Connection.java:239) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.exceptionCaught(SimpleChannelUpstreamHandler.java:153) at org.apache.kudu.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.apache.kudu.shaded.org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:536) at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:236) at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler$ReadTimeoutTask$1.run(ReadTimeoutHandler.java:276) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.apache.kudu.shaded.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.apache.kudu.shaded.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.apache.kudu.shaded.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more Caused by: org.apache.kudu.client.RecoverableException: [peer master-172.17.0.43:7077] encountered a read timeout; closing the channel at org.apache.kudu.client.Connection.exceptionCaught(Connection.java:412) ... 21 more Caused by: org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutException at org.apache.kudu.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler.<clinit>(ReadTimeoutHandler.java:84) at org.apache.kudu.client.Connection$ConnectionPipeline.init(Connection.java:782) at org.apache.kudu.client.Connection.<init>(Connection.java:199) at org.apache.kudu.client.ConnectionCache.getConnection(ConnectionCache.java:133) at org.apache.kudu.client.AsyncKuduClient.newRpcProxy(AsyncKuduClient.java:248) at org.apache.kudu.client.AsyncKuduClient.newMasterRpcProxy(AsyncKuduClient.java:272) at org.apache.kudu.client.ConnectToCluster.run(ConnectToCluster.java:157) at org.apache.kudu.client.AsyncKuduClient.getMasterTableLocationsPB(AsyncKuduClient.java:1350) at org.apache.kudu.client.AsyncKuduClient.exportAuthenticationCredentials(AsyncKuduClient.java:651) at org.apache.kudu.client.KuduClient.exportAuthenticationCredentials(KuduClient.java:293) at org.apache.kudu.spark.kudu.KuduContext$$anon$1.run(KuduContext.scala:97) at org.apache.kudu.spark.kudu.KuduContext$$anon$1.run(KuduContext.scala:96) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.kudu.spark.kudu.KuduContext.<init>(KuduContext.scala:96) at org.apache.kudu.spark.kudu.KuduRelation.<init>(DefaultSource.scala:162) at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:75) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) ... 1 more
Case 2: using spark-shell (Spark 1.6.0 standalone):
$ spark-shell --master spark://localhost:7077 --packages org.apache.kudu:kudu-spark_2.10:1.1.0
> import org.apache.kudu.spark.kudu._ > import org.apache.kudu.client._ > import collection.JavaConverters._ > val df = sqlContext.read.options(Map("kudu.master" -> "localhost:7051","kudu.table" -> "impala::default.test")).kudu
df: org.apache.spark.sql.DataFrame = [dataset: string, id: string, itemnumber: string, srcid: string, timestamp: string, year: string, month: string, day: string, week: string, quarter: string, season: string, city: string, region1: string, region2: string, region3: string, region4: string, locality: string, itemname: string, itembqu: string, product_category: string, amount: string, mapped_zipcode: string, latitude: string, longitude: string, depositor_code: string, depositor_name: string, customer_code: string, is_island: string]
It seems to be connecting, as it is able to show the column names, but if I
// register a temporary table and use SQL df.registerTempTable("test") val filteredDF = sqlContext.sql("select count(*) from test").show()
bang!
[Stage 0:> (0 + 6) / 6] Lost task 1.0 in stage 0.0 (TID 1, tt-slave-2.novalocal, executor 1): org.apache.kudu.client.NonRecoverableException: RPC can not complete before timeout: KuduRpc(method=GetTableSchema, tablet=null, attempt=30, DeadlineTracker(timeout=30000, elapsed=27307), Traces:
[0ms] querying master,
[48ms] Sub rpc: GetMasterRegistration sending RPC to server Kudu Master - localhost:7051,
[71ms] Sub rpc: GetMasterRegistration received from server Kudu Master - localhost:7051 response
Network error:
[Peer Kudu Master - localhost:7051] Connection reset,
[75ms] delaying RPC due to Service unavailable: Master config (localhost:7051) has no leader.
Exceptions received: org.apache.kudu.client.RecoverableException:
[Peer Kudu Master - localhost:7051] Connection reset,
...
(SAME MESSAGE REPEATS 25 TIMES)
...
[24262ms] querying master,
[24262ms] Sub rpc: GetMasterRegistration sending RPC to server Kudu Master - localhost:7051,
[24263ms] Sub rpc: GetMasterRegistration received from server Kudu Master - localhost:7051 response
Network error:
[Peer Kudu Master - localhost:7051] Connection reset,
[24263ms] delaying RPC due to Service unavailable: Master config (localhost:7051) has no leader.
Exceptions received: org.apache.kudu.client.RecoverableException:
[Peer Kudu Master - localhost:7051] Connection reset,
[24661ms] trace too long, truncated)
at org.apache.kudu.client.AsyncKuduClient.tooManyAttemptsOrTimeout(AsyncKuduClient.java:961) at org.apache.kudu.client.AsyncKuduClient.delayedSendRpcToTablet(AsyncKuduClient.java:1203) at org.apache.kudu.client.AsyncKuduClient.access$800(AsyncKuduClient.java:110) at org.apache.kudu.client.AsyncKuduClient$RetryRpcErrback.call(AsyncKuduClient.java:764) at org.apache.kudu.client.AsyncKuduClient$RetryRpcErrback.call(AsyncKuduClient.java:754) at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) at org.apache.kudu.client.GetMasterRegistrationReceived.incrementCountAndCheckExhausted(GetMasterRegistrationReceived.java:156) at org.apache.kudu.client.GetMasterRegistrationReceived.access$300(GetMasterRegistrationReceived.java:45) at org.apache.kudu.client.GetMasterRegistrationReceived$GetMasterRegistrationErrCB.call(GetMasterRegistrationReceived.java:236) at org.apache.kudu.client.GetMasterRegistrationReceived$GetMasterRegistrationErrCB.call(GetMasterRegistrationReceived.java:225) at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) at org.apache.kudu.client.KuduRpc.handleCallback(KuduRpc.java:220) at org.apache.kudu.client.KuduRpc.errback(KuduRpc.java:274) at org.apache.kudu.client.TabletClient.failOrRetryRpc(TabletClient.java:770) at org.apache.kudu.client.TabletClient.failOrRetryRpcs(TabletClient.java:747) at org.apache.kudu.client.TabletClient.cleanup(TabletClient.java:736) at org.apache.kudu.client.TabletClient.channelClosed(TabletClient.java:698) at org.apache.kudu.client.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) at org.apache.kudu.client.TabletClient.handleUpstream(TabletClient.java:679) at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.apache.kudu.client.shaded.org.jboss.netty.handler.timeout.ReadTimeoutHandler.channelClosed(ReadTimeoutHandler.java:176) at org.apache.kudu.client.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) at org.apache.kudu.client.shaded.org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) at org.apache.kudu.client.shaded.org.jboss.netty.channel.Channels$6.run(Channels.java:457) at org.apache.kudu.client.shaded.org.jboss.netty.channel.socket.ChannelRunnableWrapper.run(ChannelRunnableWrapper.java:40) at org.apache.kudu.client.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391) at org.apache.kudu.client.shaded.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315) at org.apache.kudu.client.shaded.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.apache.kudu.client.shaded.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.apache.kudu.client.shaded.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.apache.kudu.client.shaded.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.kudu.client.NoLeaderFoundException: Master config (localhost:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [Peer Kudu Master - localhost:7051] Connection reset at org.apache.kudu.client.GetMasterRegistrationReceived.incrementCountAndCheckExhausted(GetMasterRegistrationReceived.java:154) ... 32 more Caused by: org.apache.kudu.client.RecoverableException: [Peer Kudu Master - localhost:7051] Connection reset at org.apache.kudu.client.TabletClient.cleanup(TabletClient.java:734) ... 21 more
As I said, Kudu service is up an running, and I am able to query kudu tables from Hue using Impala.
What am I missing here? Is this the right approach to interfacing Spark with Kudu?
Thanks
Created 03-05-2018 12:54 PM
Hi,
How many masters do you have in your cluster?
I looked at the error from Case2 a bit and it might happen that the client does not have proper list of Kudu master endpoints (if it's indeed the multi-master case). If you have just one master in your Kudu cluster, then it makes sense to look into the master's log on that single machine -- it might give you some hints why that single master could not declare itself a leader.
Thanks,
Alexey
Created 03-05-2018 01:20 PM
Ah, just noticed one strange thing in the Case1: the port number for the Kudu master specified is 7077. Is that correct? If you didn't tweak Kudu's configuration, then Kudu masters are listening for RPC at port 7051 by default.
Also, comparing Case1 with Case2, it seems Case2 is using 'localhost' for the address of the master, while in the Case1 IP address 172.17.0.43 is used. So, for the Case2, if you are running that shell not from the machine where your master is running (I assume that's a machine with 172.17.0.43 IP address configured at one of its interfaces), then you need to specify not 'localhost', but the name/IP of the machine where your Kudu master is running. And of cource, in case of multi-master Kudu cluster, you need to have endpoints of all Kudu masters for the "kudu.master" property.
Hope this helps,
Alexey
Created 03-07-2018 02:05 AM
Thank you Alexey, the port in case 1 was endeed the problem. I don't know where I got the 7077, I guess **bleep**ching back and forth form Kudu to Spark addresses got them mixed up.
In case 2, I am working the same machine but strangely, changing the ip from localhost to 172.17.0.43 made it work.
Thanks again!