
Issue writing to Hive using Zeppelin Spark R - HDP 2.6.1


New Contributor

We can query Hive just fine via Zeppelin using Spark R, but when trying to use saveAsTable, like this:

%spark.r
sqlContext <- sparkRHive.init(sc)
mlbresults4 <- sql(sqlContext,"select * from z_boydmak.testorder")
head(mlbresults4)
saveAsTable(mlbresults4,"z_boydmak.hivewritetest10",,"overwrite")

We get the following error:

Error in saveAsTable(mlbresults4, "z_boydmak.hivewritetest10", , "overwrite"): sparkRHive or sparkRSQL context has to be specified

As you can see, we did create a context using sparkRHive, so I'm assuming that we have something configured with the Spark interpreter incorrectly. The above code works from the sparkR command-line tool, just not from Zeppelin. Any help is greatly appreciated.
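For completeness, one variation we can also try (just a sketch; it assumes the interpreter really does inject its own sqlContext into %spark.r paragraphs, and that zeppelin.spark.useHiveContext=true makes it a HiveContext) is to skip the sparkRHive.init call and use the injected context directly:

%spark.r
# Sketch: use the sqlContext the Zeppelin Spark interpreter injects
# (assumes zeppelin.spark.useHiveContext=true so it is a HiveContext)
mlbresults4 <- sql(sqlContext, "select * from z_boydmak.testorder")
head(mlbresults4)
saveAsTable(mlbresults4, "z_boydmak.hivewritetest10", mode = "overwrite")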

3 REPLIES

Re: Issue writing to Hive using Zeppelin Spark R - HDP 2.6.1

@Mark Boyd Have you tried sparkR with --master yarn-client from the same host where the Zeppelin server is running? If not, please do so, as it could help troubleshoot this issue. Another option is to check the Zeppelin Spark R interpreter log under /var/log/zeppelin.
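Something along these lines from the Zeppelin host should show whether the Hive context works outside Zeppelin (sketch only; reusing the table names from your post):

# Sketch - run inside a shell started with: sparkR --master yarn-client
sqlContext <- sparkRHive.init(sc)
df <- sql(sqlContext, "select * from z_boydmak.testorder")
head(df)
saveAsTable(df, "z_boydmak.hivewritetest10", mode = "overwrite")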

HTH

Re: Issue writing to Hive using Zeppelin Spark R - HDP 2.6.1

New Contributor

Hi @Felix Albani - thank you for the quick response. I did run the sparkR command line with --master yarn-client (the way the Zeppelin interpreter is configured) and it does work. It's only in Zeppelin that I'm getting the error. Looking at the interpreter log, I do see these messages:

WARN [2018-07-10 08:26:53,420] ({nioEventLoopGroup-2-2} Logging.scala[logWarning]:70) - cannot find matching method class org.apache.spark.sql.api.r.SQLUtils.createSQLContext. Candidates are:
WARN [2018-07-10 08:26:53,421] ({nioEventLoopGroup-2-2} Logging.scala[logWarning]:70) - createSQLContext(class org.apache.spark.api.java.JavaSparkContext)
ERROR [2018-07-10 08:26:53,421] ({nioEventLoopGroup-2-2} Logging.scala[logError]:74) - createSQLContext on org.apache.spark.sql.api.r.SQLUtils failed
ERROR [2018-07-10 08:42:47,591] ({nioEventLoopGroup-2-2} Logging.scala[logError]:74) - sc on 1 failed

I'll do some more research on this particular error message. Thanks.
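In the meantime, a couple of quick checks I plan to run from a %spark.r paragraph, in case the warning points at a mismatch between the SparkR package and the Spark runtime the interpreter uses (sketch only; paths and versions will vary by HDP install):

%spark.r
# Sketch: confirm which Spark install and SparkR package the interpreter loaded
Sys.getenv("SPARK_HOME")      # Spark home the interpreter is pointing at
packageVersion("SparkR")      # version of the SparkR package that was attached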

Mark

Re: Issue writing to Hive using Zeppelin Spark R - HDP 2.6.1

New Contributor

Okay, I kind of got this working - I had to manually create both the sparkContext and the sqlContext, and it now creates the new table. The issue now is that the note "hangs" and waits for the session timeout to be reached. I'm not sure why I have to manually create both contexts, as I thought they were created automatically - a simple %spark.r ls() shows them as available, and I do have zeppelin.spark.useHiveContext=true set for the interpreter. Any ideas on the "hanging" and why I have to create the contexts explicitly? Here's the "working" code:

%spark.r

sc <- sparkR.init()
sqlContext <- sparkRHive.init(sc)
mlbresults4 <- sql(sqlContext,"select * from z_boydmak.testorder")
head(mlbresults4)
saveAsTable(mlbresults4,"z_boydmak.hivewritetest10",,"overwrite")

I can use beeline to query the new z_boydmak.hivewritetest10 table and see the data. Also, according to the YARN RM UI, the application completed successfully. Thanks!
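One experiment I may also try for the hang (just a sketch, not a confirmed fix - stopping the backend could equally tear down the interpreter's own connection):

%spark.r
sc <- sparkR.init()
sqlContext <- sparkRHive.init(sc)
mlbresults4 <- sql(sqlContext, "select * from z_boydmak.testorder")
saveAsTable(mlbresults4, "z_boydmak.hivewritetest10", mode = "overwrite")
# Experiment: stop the manually created SparkR backend so the paragraph
# returns instead of waiting for the session timeout
sparkR.stop()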

Mark
