
Listing Databases from Spark SQL from Zeppelin using Livy interpreter

Expert Contributor

Hello Experts,

After going through the Livy documentation available on the internet, it looks like Livy can impersonate the end user, so we decided to test it for our use case before setting it up in production.

I could run a simple word-count job through the Livy interpreter and read and write data to HDFS.

The next hurdle is to use it to run queries, list databases/tables, etc.

I am trying to run the following (one at a time, but I see the same error each time) using a Zeppelin notebook in a kerberized environment.

%livy

%livy.spark

%livy.sql

sqlContext.sql("show databases") // Doesn't work.

show databases // does not work.

sqlContext.sql("show databases").show() //does not work

Every time I see : java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier "use or select or sqlContext" found.

Any idea if I am missing something here?


However, when I run the following steps I see the list of tables (but still can't list databases):

1) Read the data from HDFS using sc.textFile()

2) Define a case class

3) Parse the file from step #1 and build an RDD of case objects

4) Convert to a DataFrame and register a temp table

5) sqlContext.sql("show tables").toDF().show() // This lists only the temp table.
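For reference, here is a rough sketch of steps 1-5 in Scala (Spark 1.6-style API; the HDFS path, the semicolon delimiter, and the Bank case class fields are placeholders for illustration only):

// Step 1: read the raw file from HDFS
val raw = sc.textFile("hdfs:///tmp/bank.csv")

// Step 2: define a case class describing the rows
case class Bank(age: Int, job: String, balance: Int)

// Step 3: parse the lines from step 1 into an RDD of case objects
val bankRDD = raw.filter(!_.startsWith("age"))   // skip the header line
                 .map(_.split(";"))
                 .map(p => Bank(p(0).trim.toInt, p(1), p(5).trim.toInt))

// Step 4: convert to a DataFrame and register a temp table
import sqlContext.implicits._
val bankDF = bankRDD.toDF()
bankDF.registerTempTable("bank")

// Step 5: this lists only the temp table registered above
sqlContext.sql("show tables").show()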

I am interested in listing ALL THE DATABASES and ALL THE TABLES (temp and permanent).

What is it that I am missing here?

Thanks

9 REPLIES

Expert Contributor

Tagging: @azeltov

Super Collaborator

You should use

 show tables

For the error you see, this is a bug in the Livy interpreter, IIRC. The workaround is to always run a %livy paragraph first, just as you did above.
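For example, a minimal paragraph sequence that follows this workaround (a sketch; the bank temp table is just the one from the example earlier in the thread):

%livy
// run any Scala statement first so the Livy session gets created
sc.version

%livy.sql
show tables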

Expert Contributor

Hi @jzhang, thanks for the quick response.

We are interested in listing the databases, not only the tables.

(This is because we want to make sure the logged-in user sees only the databases and tables which he has access to.)

Show tables currently lists only the tables which are registered as temp tables.

For example :

In Zeppelin Notebook UI

%livy sqlContext.sql("Show tables").show()	


+---------+-----------+
|tableName|isTemporary|
+---------+-----------+
|     bank|       true|
+---------+-----------+

Now for listing databases:

%livy sqlContext.sql("Show databases").show()

java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier Show found
Show databases
^
	at scala.sys.package$.error(package.scala:27)
	at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
	at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67)
	at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
	at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)

Now from the spark-shell console (after obtaining a valid Kerberos ticket, for the same user who logged in to the Zeppelin UI):

scala> sqlContext.sql("show databases").show()


+-----------------+
|           result|
+-----------------+
|          default|
|         prod_tst|
|           hive01|
|         test_db1|
+-----------------+

How can we make "show databases" work in the Zeppelin notebook?

Thanks.

Super Collaborator

It works for me. Can you check the interpreter log?

Expert Contributor

@jzhang, checking the log, there is not much info:

ERROR [2017-01-23 13:55:13,539] ({pool-2-thread-2} LivyHelper.java[createSession]:97) - sessionId:0.0 state is starting
ERROR [2017-01-23 13:55:14,569] ({pool-2-thread-2} LivyHelper.java[createSession]:97) - sessionId:0.0 state is starting
 INFO [2017-01-23 13:55:18,001] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1485179674236 finished by scheduler org.apache.zeppelin.livy.LivySparkInterpreter187724336

And in the Notebook UI :

java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier sqlContext found
sqlContext.sql("show databases").show()

Super Collaborator

Do you have the full stack trace?

Expert Contributor

I get the stack trace in the UI, but not in the logs.

Copying the Notebook UI output.

%livy.spark

sqlContext

sqlContext.sql("show databases").show()

res20: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@52c70144

zeppelinuioutput.txt

Expert Contributor

Hi @jzhang, did you get a chance to look at it?

Thanks.

Contributor

Restart the Spark, Livy, and Zeppelin servers and interpreters. It worked for me.

Ram Baskaran