Member since: 01-21-2016
Posts: 66
Kudos Received: 44
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 920 | 03-29-2017 11:14 AM
 | 889 | 03-27-2017 10:01 AM
 | 1513 | 02-29-2016 10:00 AM
 | 6038 | 01-28-2016 08:26 AM
 | 1914 | 01-22-2016 03:55 PM
02-04-2019
10:10 PM
So two years on, I am now using HDP 2.6.4. Is this problem fixed, or is it still an issue?
08-21-2018
12:22 PM
According to this Spark JIRA, this is only available (or planned) in Spark 2.4. @jzhang could you confirm?
08-21-2018
10:47 AM
Hello, thank you so much for this post, it is awesome that we can ship any version of the Python interpreter to the Hadoop cluster, even when that version of Python is not installed there. However, what is the catch? One thing I have found is that the "package" with only Python 3.4 + pandas is already 450 MB zipped up. Would this be an issue?
07-19-2017
03:29 PM
@Mark Heydenrych
This is strange. Could it be that your DataFrame (I assume you are using a DataFrame?) is the product of some repartition / shuffle operation somewhere up the DAG? From my experience, phoenix-spark breaks the data down into as many partitions as there are salt buckets, which is also the number of regions in HBase (correct me if I am wrong).
07-19-2017
08:38 AM
Hello, by Spark-Phoenix integration do you mean phoenix-spark? From my experience with phoenix-spark it should not be this slow. What version are you using? As for the number of tasks: are you loading into a Phoenix table with SALT_BUCKETS = 200?
06-08-2017
01:29 PM
OK, I suppose this article https://community.hortonworks.com/articles/106089/dropping-a-local-index-breaks-scn-query-in-phoenix.html kind of suggests it will be fixed in 2.6...
05-23-2017
08:20 AM
Hello @Robert Levas, thank you for your reply. I am pretty sure the conf file is being picked up. If I comment out this line:
livy.server.auth.type = kerberos
then the server starts up fine and requests are served fine, just without authentication. Adding or removing the following has no effect:
livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
even though the log kind of suggests it is looking for it when Kerberos is switched on.
05-22-2017
08:14 AM
Hello @Robert Levas, thank you for the detailed explanation. I tried what you said, but I am still getting the same error:
[root@master livy]# whoami
root
[root@master livy]# hostname -f
master.sandbox.lbg.com
[root@master livy]# kadmin.local -q "addprinc -randkey livy/master.sandbox.lbg.com@LBG.COM"
Authenticating as principal root/admin@LBG.COM with password.
WARNING: no policy specified for livy/master.sandbox.lbg.com@LBG.COM; defaulting to no policy
Principal "livy/master.sandbox.lbg.com@LBG.COM" created.
[root@master livy]# kadmin.local -q "xst -k /etc/security/keytabs/livy.headless.keytab livy/master.sandbox.lbg.com@LBG.COM"
Authenticating as principal root/admin@LBG.COM with password.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type des3-cbc-sha1 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type arcfour-hmac added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
[root@master livy]#
[root@master livy]# cat /etc/livy/conf/livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
livy.server.recovery.mode off
livy.server.auth.type = kerberos
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal = livy/master.sandbox.lbg.com@LBG.COM
livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@LBG.COM
livy.superusers=livy
Then:
[livy@master bin]$ whoami
livy
[livy@master bin]$ hostname -f
master.sandbox.lbg.com
[livy@master bin]$ ls -al /etc/security/keytabs/livy.headless.keytab
-rw------- 1 livy hadoop 546 May 22 08:59 /etc/security/keytabs/livy.headless.keytab
[livy@master bin]$ klist -kte /etc/security/keytabs/livy.headless.keytab
Keytab name: FILE:/etc/security/keytabs/livy.headless.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
2 05/19/17 09:41:41 livy@LBG.COM (aes256-cts-hmac-sha1-96)
2 05/19/17 09:41:41 livy@LBG.COM (aes128-cts-hmac-sha1-96)
2 05/19/17 09:41:41 livy@LBG.COM (des3-cbc-sha1)
2 05/19/17 09:41:41 livy@LBG.COM (arcfour-hmac)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (aes256-cts-hmac-sha1-96)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (aes128-cts-hmac-sha1-96)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (des3-cbc-sha1)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (arcfour-hmac)
[livy@master bin]$ /usr/hdp/current/livy-server/bin/livy-server start
starting /usr/java/default/bin/java -Xmx2g -cp /usr/hdp/current/livy-server/jars/*:/usr/hdp/current/livy-server/conf: com.cloudera.livy.server.LivyServer, logging to /var/log/livy/livy-livy-server.out
[livy@master bin]$ cat /var/log/livy/livy-livy-server.out
log4j:WARN No appenders could be found for logger (com.cloudera.livy.server.LivyServer).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Kerberos requires livy.server.kerberos.keytab to be provided.
at scala.Predef$.require(Predef.scala:233)
at com.cloudera.livy.server.LivyServer.runKinit(LivyServer.scala:173)
at com.cloudera.livy.server.LivyServer.start(LivyServer.scala:134)
at com.cloudera.livy.server.LivyServer$.main(LivyServer.scala:277)
at com.cloudera.livy.server.LivyServer.main(LivyServer.scala)
[livy@master root]$ /usr/hdp/current/livy-server/bin/livy-server stop
no livy_server to stop
So if you did everything you described on your box, were you able to start up Livy without this issue?
05-19-2017
03:51 PM
Thanks @Robert Levas for your input. I can confirm I had run through step 8 and the keytab exists:
[root@master bin]$ ls -al /etc/security/keytabs/livy.headless.keytab
-rw------- 1 livy hadoop 226 May 19 09:41 /etc/security/keytabs/livy.headless.keytab
05-19-2017
12:52 PM
Hello, I am using kerberized HDP 2.5.3 and I am trying out Livy. I have set it up according to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/configure_livy.html and without Kerberos the Livy server starts up and responds to HTTP requests. However, if I add the Kerberos section to livy.conf then I get this in the log:
[root@master conf]# tail -1000f /var/log/livy/livy-livy-server.out
log4j:WARN No appenders could be found for logger (com.cloudera.livy.server.LivyServer).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Kerberos requires livy.server.kerberos.keytab to be provided.
at scala.Predef$.require(Predef.scala:233)
at com.cloudera.livy.server.LivyServer.runKinit(LivyServer.scala:173)
at com.cloudera.livy.server.LivyServer.start(LivyServer.scala:134)
at com.cloudera.livy.server.LivyServer$.main(LivyServer.scala:277)
at com.cloudera.livy.server.LivyServer.main(LivyServer.scala)
Any ideas? The following is the livy.conf:
[root@master conf]# cat /etc/livy/conf/livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
livy.server.recovery.mode off
#livy.server.auth.type = kerberos
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal = livy/_HOST@LBG.COM
#livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@LBG.COM
livy.superusers=livy
Thank you in advance!
Labels:
- Apache Spark
05-18-2017
02:06 PM
In case anyone still needs this and gets the same problem as me when using a kerberized HDP: you need to follow these steps to set up HTTP authentication before SPNEGO will work for the YARN REST API (a sketch of the usual core-site.xml properties follows after the curl transcript below). Once these are added in Ambari and the corresponding services restarted, I was able to get a new application ID. However, any previously anonymous access to the Resource Manager now needs authentication!
[user@master hdfs]$ curl --negotiate -u : -v -X POST http://<fqdn>:8088/ws/v1/cluster/apps/new-application
* About to connect() to <fqdn> port 8088 (#0)
* Trying 192.168.33.11... connected
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 401 Authentication required
< Cache-Control: must-revalidate,no-cache,no-store
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Content-Type: text/html; charset=iso-8859-1
< WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=; Path=/; Domain=<my domain>; HttpOnly
< Content-Length: 1427
< Server: Jetty(6.1.26.hwx)
<
* Ignoring the response-body
* Connection #0 to host <fqdn> left intact
* Issue another request to this URL: 'http://<fqdn>:8088/ws/v1/cluster/apps/new-application'
* Re-using existing connection! (#0) with host <fqdn>
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
* Server auth using GSS-Negotiate with user ''
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> Authorization: Negotiate YIICaQYJKoZIhvcSAQICAQBuggJYMIICVKADAgEFoQMCAQ6iBwMFAAAAAACjggFtYYIBaTCCAWWgAwIBBaEJGwdMQkcuQ09NoikwJ6ADAgEDoSAwHhsESFRUUBsWd29ya2VyLnNhbmRib3gubGJnLmNvbaOCASYwggEioAMCARKhAwIBAaKCARQEggEQYSIFHaKqdpjxjhuBgc+K8swimOG+UeIxhgNYhOEQXgfFcyQoNRcPwFRS0nbXeLN8HT8S8QEib5/KXJoPj0On7r7gWNDlcYEI9ycAJ8xe11FE5WTMgSL2BDeiOtA6OLLYGj5rHFCwsWByLBwBu8jI5Bmmnx93jN+XkjPWxvrS3dBwU3qDiwbWfqze34JDfLBAWJBjke0KcFCrzA9an4fw7Evvflu9NtT/XixW7edfF0+anV/tcrBSPqj1UFKqqNr2bYOdes3pApixmohe9xAvCd4Wg6T5JLUwRlbfdt/beqwMwkY0a1WpnnFOeuOoB6ReUIcsufmRZGMkrIh63mIz/O13lbQlzXOhjBfwKyiMo/Kkgc0wgcqgAwIBEqKBwgSBvwZfOYFelpjopPr89JOyFtKzPC6xxCyLjNGAZHMFF/VKHKtdytbf7Dy5YNtcoCK1nu2D8Ihkum1hYaxH1ugK4i5sKU8xaAp0qNanc6Lu+Y7sUH/s5XKCqwVQM96mYC0ejpWIq8WDrB3CX5+MshSOnbeIEcMyG8puQ/5nHfUlNsOC7vhq4Qbs8yTTqG+9W7+79sl9fbhmVqIOx5UUfHXtq3qkKAtgmSoQhpDi4ERC/bYBIMYyubtPiXKC/k0JxSyn
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Expires: Thu, 18 May 2017 13:42:56 GMT
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Expires: Thu, 18 May 2017 13:42:56 GMT
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Content-Type: application/json
< Set-Cookie: hadoop.auth="u=user&p=user@<my domain>&t=kerberos&e=1495150976195&s=NxiE0Svo7+3QTPXC8L9aUlPN54c="; Path=/; Domain=<my domain>; HttpOnly
< X-Frame-Options: SAMEORIGIN
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26.hwx)
<
* Connection #0 to host <fqdn> left intact
* Closing connection #0
{"application-id":"application_1495114416899_0001","maximum-resource-capability":{"memory":4096,"vCores":6}}[user@master hdfs]$
05-17-2017
08:09 PM
Hello @shashi kumar @Terry Stebbens, have you found a solution to this? I am getting the same issue, trying locally on a kerberized sandbox. I have run kinit before I run:
curl --negotiate -u : -v -X POST 'http://<fqdn>:8088/ws/v1/cluster/apps/new-application'
and I am getting:
* About to connect() to <fqdn> port 8088 (#0)
* Trying 192.168.33.11... connected
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: no-cache
< Expires: Wed, 17 May 2017 20:03:25 GMT
< Date: Wed, 17 May 2017 20:03:25 GMT
< Pragma: no-cache
< Expires: Wed, 17 May 2017 20:03:25 GMT
< Date: Wed, 17 May 2017 20:03:25 GMT
< Pragma: no-cache
< Content-Type: application/json
< X-Frame-Options: SAMEORIGIN
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26.hwx)
<
* Connection #0 to host <fqdn> left intact
* Closing connection #0
The default static user cannot carry out this operation.
04-17-2017
12:50 PM
@Rajeshbabu Chintaguntla right, sorry, I actually don't understand your first answer. From what I have seen, once the index is changed the flashback query no longer works unless I force it to use no index. So do you mean there is a bug in Phoenix, or that if I waited sufficiently long I would see the flashback query work again?
04-17-2017
12:50 PM
Umm, OK, but isn't the whole point of a flashback / snapshot query that the query goes back to that moment in time, so that any subsequent or current update to HBase will not be reflected in the snapshot query?
04-17-2017
12:50 PM
Hello, I am using HDP 2.5.3 with Phoenix 4.7, and it seems to me that flashback / snapshot queries do not work if the index or the indexed data has changed since. Below is what I have done and observed:
-- creating the schema and table
CREATE SCHEMA IF NOT EXISTS TRADE;
DROP TABLE IF EXISTS TRADE.TRADE;
CREATE TABLE IF NOT EXISTS TRADE.TRADE (ID VARCHAR, BROKER VARCHAR CONSTRAINT pk PRIMARY KEY (ID));
CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
-- insert some data
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('123', 'BROKER123');
1 row affected (0.085 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER456');
1 row affected (0.036 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.107 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------------------+
2 rows selected (0.054 seconds)
-- logged current time
0: jdbc:phoenix:> select cast(CURRENT_TIME() as BIGINT) from TRADE.TRADE limit 1;
+--------------------------------------------+
| TO_BIGINT(TIME '2017-03-28 14:50:38.678') |
+--------------------------------------------+
| 1490706922000 |
+--------------------------------------------+
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- showing what the snapshot data should look like
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.197 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- showing that the select is using the correct index
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------------------+
2 rows selected (0.044 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py
0: jdbc:phoenix:> -- now make changes to the index and indexed data
0: jdbc:phoenix:> DROP INDEX TRADE_BROKER_IDX ON TRADE.TRADE;
2 rows affected (0.748 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', '456BROKER');
1 row affected (0.1 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = '456BROKER';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | 456BROKER |
+------+------------+
1 row selected (0.123 seconds)
0: jdbc:phoenix:master.sandbox.lbg.com:2181/h> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.212 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE |
| SERVER FILTER BY BROKER = 'BROKER456' |
+-----------------------------------------------------------------------+
2 rows selected (0.042 seconds)
0: jdbc:phoenix:> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- I would expect the snapshot to use the index, but in this case it is using the
0: jdbc:phoenix:sandbox:2181/h> -- latest index state - i.e. no index and a full scan
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.168 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE |
| SERVER FILTER BY BROKER = 'BROKER456' |
+-----------------------------------------------------------------------+
2 rows selected (0.035 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- now recreate index and update indexed data yet again
0: jdbc:phoenix:> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.23 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER-456');
1 row affected (0.021 seconds)
0: jdbc:phoenix:> -- at the latest state, data is correctly retrieved and the index is used in explain
0: jdbc:phoenix:> SELECT * FROM TRADE.TRADE;
+------+-------------+
| ID | BROKER |
+------+-------------+
| 456 | BROKER-456 |
| 123 | BROKER123 |
+------+-------------+
2 rows selected (0.14 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER-456';
+-----------------------------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [2,'BROKER-456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+-----------------------------------------------------------------------------------------+
2 rows selected (0.047 seconds)
0: jdbc:phoenix:> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- UNEXPECTEDLY NOTHING IS RETURNED!
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE ;
+-----+---------+
| ID | BROKER |
+-----+---------+
+-----+---------+
No rows selected (0.066 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT * FROM TRADE.TRADE ;
+----------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------+
2 rows selected (0.031 seconds)
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----+---------+
| ID | BROKER |
+-----+---------+
+-----+---------+
No rows selected (0.077 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- AND IT WORKS IF FORCING PHOENIX TO DO FULL SCAN
0: jdbc:phoenix:sandbox:2181/h> SELECT /*+NO_INDEX*/ * FROM TRADE.TRADE ;
+------+------------+
| ID | BROKER |
+------+------------+
| 123 | BROKER123 |
| 456 | BROKER456 |
+------+------------+
2 rows selected (0.07 seconds)
Has anyone had the same issue? How did you resolve it, or is this a Phoenix bug? If it is a bug, is there a plan to fix it, and in which HDP / Phoenix version? Thank you!
03-29-2017
11:14 AM
1 Kudo
After looking at the source and this JIRA, I have finally got round to doing it in spark-shell:
import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil
import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.spark._
val conf = new Configuration
conf.setLong(PhoenixConfigurationUtil.CURRENT_SCN_VALUE, 1490706922000L)
sqlContext.phoenixTableAsDataFrame("TRADE.TRADE", Array("ID", "BROKER"), conf = conf).show
03-28-2017
01:54 PM
1. “PARALLEL 9-WAY FULL SCAN” - the 9 is the SALT_BUCKETS you specified in your CREATE TABLE or CREATE INDEX, correct? From my limited experience on the sandbox, 32 seems marginally faster than 8; I have yet to try giving it more to see whether it gets faster still.
2. Also, this is a full scan; if you can get it into a "RANGE SCAN", it should be faster:
+---------------------------------------------------------------------------------------------+
| PLAN |
+---------------------------------------------------------------------------------------------+
| CLIENT 32-CHUNK PARALLEL 32-WAY ROUND ROBIN RANGE SCAN OVER SCHEMA:TABLE [1,'PREDICATE'] |
| SERVER FILTER BY FIRST KEY ONLY |
+---------------------------------------------------------------------------------------------+
3. It also seems "CHUNK" goes together with the salt buckets (a minimal sketch below shows where SALT_BUCKETS is set). Please correct me if I am wrong.
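For illustration, a minimal sketch of where the salt bucket count comes from (schema, table and column names are made up; SALT_BUCKETS at CREATE TABLE time is what the N-CHUNK / N-WAY figures in the plans reflect):
-- SALT_BUCKETS pre-splits the table and sets the scan parallelism
CREATE TABLE IF NOT EXISTS MY_SCHEMA.MY_TABLE (
    ID VARCHAR PRIMARY KEY,
    PREDICATE_COL VARCHAR
) SALT_BUCKETS = 32;
-- indexing the predicate column is what turns the FULL SCAN into a RANGE SCAN
CREATE LOCAL INDEX MY_TABLE_IDX ON MY_SCHEMA.MY_TABLE (PREDICATE_COL);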
03-27-2017
10:09 AM
How do you specify / run a flashback / snapshot query in spark-shell, given that phoenix-spark is already set up? Assume it has already been set up with the correct jars to run this:
import org.apache.phoenix.spark._
val df_new = sc.parallelize(Seq((345, "some org"), (456, "ORG_ID_1"))).toDF("ID", "ORG_ID")
df_new.saveToPhoenix("MY_TABLE") // I presume it will be some param. within phoenixTableAsDataFrame?
sqlContext.phoenixTableAsDataFrame("MY_TABLE", Array("ID", "ORG_ID")).show
+-------+---------+
| ID| ORG_ID|
+-------+---------+
| 456| ORD_ID_1|
| 345| some org|
+-------+---------+
Is it possible? Thank you!
Labels:
- Apache Phoenix
- Apache Spark
03-27-2017
10:01 AM
OK thanks, I have finally got it working. It can be run like this, for example:
sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490372958713"
03-24-2017
01:22 PM
Hello, the Phoenix doc here suggests one can run a flashback query by setting the "currentSCN" property in the connection properties. How does one set this property in the sqlline.py command line utility? I am using HDP 2.5.3 / Phoenix 4.7. Thank you!!
06-09-2016
01:15 PM
Just want to add that it seems spark.driver.extraClassPath is not necessary, at least in my case, when I write a file in snappy from Spark using:
rdd.saveAsTextFile(path, SnappyCodec.class)
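For completeness, a minimal sketch of that write in Scala (the output path is illustrative, and the native snappy libraries must still be visible to the executors, per the rest of this thread):
import org.apache.hadoop.io.compress.SnappyCodec
// Passing the codec class makes Spark compress each output part file with Snappy.
rdd.saveAsTextFile("/tmp/snappy-out", classOf[SnappyCodec])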
06-08-2016
01:09 PM
Thanks @Rajkumar Singh. I tried mapred.child.java.opts and a few of the mapreduce settings suggested by @Jitendra Yadav, but in the end just adding spark.executor.extraLibraryPath did the job.
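For anyone else landing here, this is roughly what that looks like on the command line (a sketch: the library path is HDP's usual native dir, and the class / jar names are placeholders):
spark-submit \
  --master yarn-client \
  --conf spark.executor.extraLibraryPath=/usr/hdp/current/hadoop-client/lib/native \
  --class com.example.MySnappyJob \
  my-snappy-job.jar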
06-08-2016
09:10 AM
Yes it is - as I said, I was able to save in local mode.
06-08-2016
08:59 AM
1 Kudo
Hello, I have a Spark job that is run by spark-submit, and at the end it saves some output using SparkContext's saveAsTextFile with SnappyCodec. To test, I am using the 2.3.4 sandbox. This runs fine with master = local[x], after following the suggestion in this thread. However, once I change to master = yarn-client I get the same "native snappy library not available: this version of libhadoop was built without snappy support." error on YARN, even though I thought I had done all the necessary setup - any suggestions welcome! When I check the Spark History Server I can see the relevant Spark properties and system properties in the environment (screenshots omitted), and in Ambari -> YARN -> Configs -> Advanced yarn-env -> yarn-env template, LD_LIBRARY_PATH is set (screenshot omitted). Anything else I could do to make snappy available as a compression codec on YARN? Thank you.
Labels:
- Apache Spark
- Apache YARN
06-07-2016
12:54 PM
@Jitendra Yadav - yes, this worked, thanks! This is what the process looks like when I run ps:
root 17484 1 99 13:47 pts/0 00:00:59 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-0.b17.el6_7.x86_64/bin/java -server -XX:NewRatio=3 -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit -XX:CMSInitiatingOccupancyFraction=60 -Dsun.zip.disableMemoryMapping=true -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native -cp /etc/ambari-server/conf:/usr/lib/ambari-server/*:/usr/share/java/postgresql-jdbc.jar org.apache.ambari.server.controller.AmbariServer
It seems setting -Djava.library.path is the only thing required - I subsequently removed the snappy link in /usr/lib/ambari-server/ and can confirm it still works.
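For reference, a sketch of where that JVM argument can live persistently (the file location and AMBARI_JVM_ARGS variable are Ambari's defaults on this sandbox; verify for your version):
# /var/lib/ambari-server/ambari-env.sh
# Append the native library dir so the Ambari server JVM can load libsnappy.
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native"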
06-07-2016
11:54 AM
Thanks Scott, yes this is what I think as well, and therefore I have linked the snappy jars into the Ambari server lib. Re the HCC question you posted - that helped me earlier and I was able to save out an RDD in Spark as a compressed snappy file, but how do you make the Ambari HDFS Files view use the snappy lib (I can only think of LD_LIBRARY_PATH or the Java classpath)?
06-07-2016
11:42 AM
Hello, I am trying to get File Preview for snappy files to work in Ambari -> HDFS Files. Could it be made to work? I assume it could, because it seems to try decoding but fails to find the snappy codec. This is running on a 2.3.4 sandbox. I am getting: 500 native snappy library not available: this version of libhadoop was built without snappy support. I have tried a few things: adding the snappy natives to LD_LIBRARY_PATH in /etc/profile, and linking the snappy jars into the Ambari server lib:
cd /usr/lib/ambari-server/
ln -s /usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar .
But I am still getting the same error. Any ideas would be great! Thanks! Full stack trace:
500 native snappy library not available: this version of libhadoop was built without snappy support.
java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157)
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163)
at org.apache.ambari.view.filebrowser.FilePreviewService.previewFile(FilePreviewService.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:540)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:715)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1496)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.apache.ambari.server.security.authorization.AmbariAuthorizationFilter.doFilter(AmbariAuthorizationFilter.java:196)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:150)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:72)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:47)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:109)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:109)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:216)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:205)
at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:152)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Labels:
- Apache Ambari
05-16-2016
11:53 AM
Hello, I had forgotten about this, but in the end I actually got it to work. What needed to be done was to use kadmin to create a new keytab and add the principal ambari-server@KRB.HDP to it. It also needed a full restart of the sandbox. See Setup Kerberos for Ambari Server. Thanks to @Geoffrey Shelton Okot for pointing in the right direction.
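For anyone following along, a sketch of the kadmin steps I mean, mirroring the syntax used earlier in this thread (the keytab path is a placeholder):
kadmin.local -q "addprinc -randkey ambari-server@KRB.HDP"
kadmin.local -q "xst -k /etc/security/keytabs/ambari.server.keytab ambari-server@KRB.HDP"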
03-21-2016
03:56 PM
1 Kudo
@Rahul Pathak - yes I have; as I said, they don't seem to work...
03-21-2016
03:35 PM
1 Kudo
Hello, I am using the HDP 2.3.4 sandbox, which has been kerberized. I have loaded some struct and array typed data into Hive, where the schema looks like this: exampletable
|-- listOfPeople: array (nullable = false)
| |-- element: struct (containsNull = true)
| | |-- Name: string (nullable = false)
| | |-- id: integer (nullable = false)
| | |-- Email: string (nullable = false)
| | |-- holiday: array (nullable = false)
| | | |-- element: integer (containsNull = true)
|-- departmentName: string (nullable = false)
, and I am trying to run this query in Hive View: SELECT explode(listofpeople.name) AS name FROM exampletable; with these Hive View settings (screenshot omitted). However, I am getting this: INFO : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1458555995218_0002 failed 2 times due to AM Container for appattempt_1458555995218_0002_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/cluster/app/application_1458555995218_0002Then, click on links to logs of each attempt.
Diagnostics: Application application_1458555995218_0002 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is hive
main : requested yarn user is hive
Requested user hive is not whitelisted and has id 504,which is below the minimum allowed 1000
Failing this attempt. Failing the application.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:726)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:217)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:271)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:151)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1703)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1460)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1101)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1096)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
This thread kind of says this is the desired behaviour, and Google suggests changing allowed.system.users in yarn-site (but that doesn't seem to work). If I just want to run the query successfully on the sandbox, what needs to be done? Or what is the best-practice solution for this? Thank you.
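For reference, the whitelist and minimum-uid checks behind that error are enforced by YARN's Linux container executor rather than by yarn-site directly, which may be why editing yarn-site alone seemed to have no effect. A sketch of the relevant container-executor.cfg keys (on HDP these are managed through Ambari's YARN configs and need a NodeManager restart; the values are illustrative for the sandbox):
# container-executor.cfg
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
# either lower the threshold so uid 504 (hive on the sandbox) passes the check...
min.user.id=500
# ...or explicitly whitelist the hive user:
allowed.system.users=hive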
Labels:
- Apache Ambari
- Apache Hive
- Apache Tez