04-20-2017 07:15 PM
I want to query Spark on Hadoop (HDFS), but I'm a newbie at it. I tried setting the SparkSession master to "spark://157.179.xx.xxx:7077".
My code:
SparkSession spark = SparkSession.builder()
    .appName("Sql Simple Spark")
    .master("spark://157.179.xx.xxx:7077")
    .config("spark.sql.warehouse.dir", "file:///D:/Misc/SparkTest/spark-warehouse")
    .getOrCreate();
Error on console:
org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:109)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Failed to connect to /157.179.xx.xxx:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    ... 4 more
Caused by: java.net.ConnectException: Connection refused: no further information: /157.179.xx.xxx:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
P.S. I can run spark-shell on the Linux machine, and sqlContext.sql("show tables").show displays my tables, the same ones I see in Hue.
04-21-2017 12:40 AM
Don't set a master in your application code; let spark-submit set it.
You also won't want to use a spark:// URL on CDH, because that's for standalone mode (on CDH, Spark runs on YARN). See the sketch below.
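A minimal sketch of what that looks like (SimpleSqlApp and app.jar are placeholder names for your own main class and jar): build the session without a master,

SparkSession spark = SparkSession.builder()
    .appName("Sql Simple Spark")  // no .master() here; spark-submit supplies it
    .getOrCreate();

then submit it on the cluster and pass the master on the command line, e.g.:

spark-submit --class SimpleSqlApp --master yarn app.jar

(--master can also be left out and picked up from spark-defaults.conf.)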
04-21-2017 12:52 AM - edited 04-21-2017 01:45 AM
Thanks for the feedback. I use Hadoop 2.6.0-cdh5.10.1, and the Spark version is 1.6.0. Can I upgrade Spark to 2.1, and if so, how?
My situation: I use the Impala JDBC driver to query data (the same data I see in Hue) and display the results on a web page (portlet). Now I need to switch from Impala to Spark. Is that possible? A sketch of the current setup follows.
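For context, a minimal sketch of the kind of Impala JDBC query I mean (the driver class com.cloudera.impala.jdbc41.Driver, port 21050, and the host here are assumptions based on a standard Cloudera Impala JDBC setup, not my exact code):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImpalaQuerySketch {
    public static void main(String[] args) throws Exception {
        // Assumed Cloudera Impala JDBC41 driver; host and port are placeholders.
        Class.forName("com.cloudera.impala.jdbc41.Driver");
        try (Connection conn = DriverManager.getConnection("jdbc:impala://157.179.xx.xxx:21050/default");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                // Print each table name, the same list Hue shows.
                System.out.println(rs.getString(1));
            }
        }
    }
}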