Member since: 01-21-2016
Posts: 66
Kudos Received: 44
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 920 | 03-29-2017 11:14 AM
 | 889 | 03-27-2017 10:01 AM
 | 1513 | 02-29-2016 10:00 AM
 | 6038 | 01-28-2016 08:26 AM
 | 1914 | 01-22-2016 03:55 PM
02-04-2019
10:10 PM
So two years on, I am now using HDP 2.6.4. Is this problem fixed, or is it still an issue?
08-21-2018
12:22 PM
According to this Spark JIRA, this is only available (or planned) in Spark 2.4. @jzhang could you confirm?
08-21-2018
10:47 AM
Hello, thank you so much for this post, it is awesome that we can ship any version of the Python interpreter to the Hadoop cluster, even when that version of Python is not installed there. However, what is the catch? One thing I have found is that the "package" with only Python 3.4 + pandas is already 450 MB zipped up. Would this be an issue?
07-19-2017
03:29 PM
@Mark Heydenrych
This is strange. Could it be that your DataFrame (I assume you are using a DataFrame?) is the product of some repartition / shuffle operation somewhere up the DAG? From my experience, phoenix-spark breaks the data down into as many partitions as there are salt buckets, which is also the number of regions in HBase (correct me if I am wrong).
07-19-2017
08:38 AM
Hello, by Spark-Phoenix integration do you mean phoenix-spark? From my experience with phoenix-spark it should not be this slow. What version are you using? As for the number of tasks: are you loading into a Phoenix table with SALT_BUCKETS = 200?
06-08-2017
01:29 PM
OK, I suppose this article https://community.hortonworks.com/articles/106089/dropping-a-local-index-breaks-scn-query-in-phoenix.html kind of suggests it will be fixed in 2.6...
05-23-2017
08:20 AM
Hello @Robert Levas, thank you for your reply. I am pretty sure the conf file is being picked up. If I comment out this line:
livy.server.auth.type = kerberos
then the server starts up fine and requests are served fine, just without authentication. Adding or removing the following has no effect:
livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
even though the log kind of suggests it is looking for it when Kerberos is switched on.
05-22-2017
08:14 AM
Hello @Robert Levas, thank you for the detailed explanation. I tried what you said, but I am still getting the same error:
[root@master livy]# whoami
root
[root@master livy]# hostname -f
master.sandbox.lbg.com
[root@master livy]# kadmin.local -q "addprinc -randkey livy/master.sandbox.lbg.com@LBG.COM"
Authenticating as principal root/admin@LBG.COM with password.
WARNING: no policy specified for livy/master.sandbox.lbg.com@LBG.COM; defaulting to no policy
Principal "livy/master.sandbox.lbg.com@LBG.COM" created.
[root@master livy]# kadmin.local -q "xst -k /etc/security/keytabs/livy.headless.keytab livy/master.sandbox.lbg.com@LBG.COM"
Authenticating as principal root/admin@LBG.COM with password.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type des3-cbc-sha1 added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
Entry for principal livy/master.sandbox.lbg.com@LBG.COM with kvno 2, encryption type arcfour-hmac added to keytab WRFILE:/etc/security/keytabs/livy.headless.keytab.
[root@master livy]#
[root@master livy]# cat /etc/livy/conf/livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
livy.server.recovery.mode off
livy.server.auth.type = kerberos
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal = livy/master.sandbox.lbg.com@LBG.COM
livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@LBG.COM
livy.superusers=livy
Then:
[livy@master bin]$ whoami
livy
[livy@master bin]$ hostname -f
master.sandbox.lbg.com
[livy@master bin]$ ls -al /etc/security/keytabs/livy.headless.keytab
-rw------- 1 livy hadoop 546 May 22 08:59 /etc/security/keytabs/livy.headless.keytab
[livy@master bin]$ klist -kte /etc/security/keytabs/livy.headless.keytab
Keytab name: FILE:/etc/security/keytabs/livy.headless.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
2 05/19/17 09:41:41 livy@LBG.COM (aes256-cts-hmac-sha1-96)
2 05/19/17 09:41:41 livy@LBG.COM (aes128-cts-hmac-sha1-96)
2 05/19/17 09:41:41 livy@LBG.COM (des3-cbc-sha1)
2 05/19/17 09:41:41 livy@LBG.COM (arcfour-hmac)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (aes256-cts-hmac-sha1-96)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (aes128-cts-hmac-sha1-96)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (des3-cbc-sha1)
2 05/22/17 08:59:56 livy/master.sandbox.lbg.com@LBG.COM (arcfour-hmac)
[livy@master bin]$ /usr/hdp/current/livy-server/bin/livy-server start
starting /usr/java/default/bin/java -Xmx2g -cp /usr/hdp/current/livy-server/jars/*:/usr/hdp/current/livy-server/conf: com.cloudera.livy.server.LivyServer, logging to /var/log/livy/livy-livy-server.out
[livy@master bin]$ cat /var/log/livy/livy-livy-server.out
log4j:WARN No appenders could be found for logger (com.cloudera.livy.server.LivyServer).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Kerberos requires livy.server.kerberos.keytab to be provided.
at scala.Predef$.require(Predef.scala:233)
at com.cloudera.livy.server.LivyServer.runKinit(LivyServer.scala:173)
at com.cloudera.livy.server.LivyServer.start(LivyServer.scala:134)
at com.cloudera.livy.server.LivyServer$.main(LivyServer.scala:277)
at com.cloudera.livy.server.LivyServer.main(LivyServer.scala)
[livy@master root]$ /usr/hdp/current/livy-server/bin/livy-server stop
no livy_server to stop
So if you did everything you described on your box, were you able to start up Livy without this issue?
05-19-2017
03:51 PM
Thanks @Robert Levas for your input. I can confirm I had run through step 8 and the keytab exists:
[root@master bin]$ ls -al /etc/security/keytabs/livy.headless.keytab
-rw------- 1 livy hadoop 226 May 19 09:41 /etc/security/keytabs/livy.headless.keytab
05-19-2017
12:52 PM
Hello, I am using kerberized HDP 2.5.3 and I am trying out Livy. I have set it up according to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/configure_livy.html and without Kerberos the Livy server starts up and responds to HTTP requests. However, if I add the Kerberos section to livy.conf then I get this in the log:
[root@master conf]# tail -1000f /var/log/livy/livy-livy-server.out
log4j:WARN No appenders could be found for logger (com.cloudera.livy.server.LivyServer).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Kerberos requires livy.server.kerberos.keytab to be provided.
at scala.Predef$.require(Predef.scala:233)
at com.cloudera.livy.server.LivyServer.runKinit(LivyServer.scala:173)
at com.cloudera.livy.server.LivyServer.start(LivyServer.scala:134)
at com.cloudera.livy.server.LivyServer$.main(LivyServer.scala:277)
at com.cloudera.livy.server.LivyServer.main(LivyServer.scala)
Any ideas? The following is the livy.conf:
[root@master conf]# cat /etc/livy/conf/livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
livy.server.recovery.mode off
#livy.server.auth.type = kerberos
livy.server.launch.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal = livy/_HOST@LBG.COM
#livy.server.kerberos.keytab = /etc/security/keytabs/livy.headless.keytab
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@LBG.COM
livy.superusers=livy
Thank you in advance!
Labels:
- Apache Spark
05-18-2017
02:06 PM
In case anyone still needs this and gets the same problem as me when using a kerberized HDP: you need to follow these steps to set up HTTP authentication before SPNEGO will work for the YARN REST API (a sketch of the usual core-site.xml properties follows after the curl transcript below). Once these are added in Ambari and the corresponding services restarted, I was able to get a new application ID. However, any previously anonymous access to the Resource Manager now needs authentication!
[user@master hdfs]$ curl --negotiate -u : -v -X POST http://<fqdn>:8088/ws/v1/cluster/apps/new-application
* About to connect() to <fqdn> port 8088 (#0)
* Trying 192.168.33.11... connected
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 401 Authentication required
< Cache-Control: must-revalidate,no-cache,no-store
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Content-Type: text/html; charset=iso-8859-1
< WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=; Path=/; Domain=<my domain>; HttpOnly
< Content-Length: 1427
< Server: Jetty(6.1.26.hwx)
<
* Ignoring the response-body
* Connection #0 to host <fqdn> left intact
* Issue another request to this URL: 'http://<fqdn>:8088/ws/v1/cluster/apps/new-application'
* Re-using existing connection! (#0) with host <fqdn>
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
* Server auth using GSS-Negotiate with user ''
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> Authorization: Negotiate YIICaQYJKoZIhvcSAQICAQBuggJYMIICVKADAgEFoQMCAQ6iBwMFAAAAAACjggFtYYIBaTCCAWWgAwIBBaEJGwdMQkcuQ09NoikwJ6ADAgEDoSAwHhsESFRUUBsWd29ya2VyLnNhbmRib3gubGJnLmNvbaOCASYwggEioAMCARKhAwIBAaKCARQEggEQYSIFHaKqdpjxjhuBgc+K8swimOG+UeIxhgNYhOEQXgfFcyQoNRcPwFRS0nbXeLN8HT8S8QEib5/KXJoPj0On7r7gWNDlcYEI9ycAJ8xe11FE5WTMgSL2BDeiOtA6OLLYGj5rHFCwsWByLBwBu8jI5Bmmnx93jN+XkjPWxvrS3dBwU3qDiwbWfqze34JDfLBAWJBjke0KcFCrzA9an4fw7Evvflu9NtT/XixW7edfF0+anV/tcrBSPqj1UFKqqNr2bYOdes3pApixmohe9xAvCd4Wg6T5JLUwRlbfdt/beqwMwkY0a1WpnnFOeuOoB6ReUIcsufmRZGMkrIh63mIz/O13lbQlzXOhjBfwKyiMo/Kkgc0wgcqgAwIBEqKBwgSBvwZfOYFelpjopPr89JOyFtKzPC6xxCyLjNGAZHMFF/VKHKtdytbf7Dy5YNtcoCK1nu2D8Ihkum1hYaxH1ugK4i5sKU8xaAp0qNanc6Lu+Y7sUH/s5XKCqwVQM96mYC0ejpWIq8WDrB3CX5+MshSOnbeIEcMyG8puQ/5nHfUlNsOC7vhq4Qbs8yTTqG+9W7+79sl9fbhmVqIOx5UUfHXtq3qkKAtgmSoQhpDi4ERC/bYBIMYyubtPiXKC/k0JxSyn
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Expires: Thu, 18 May 2017 13:42:56 GMT
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Expires: Thu, 18 May 2017 13:42:56 GMT
< Date: Thu, 18 May 2017 13:42:56 GMT
< Pragma: no-cache
< Content-Type: application/json
< Set-Cookie: hadoop.auth="u=user&p=user@<my domain>&t=kerberos&e=1495150976195&s=NxiE0Svo7+3QTPXC8L9aUlPN54c="; Path=/; Domain=<my domain>; HttpOnly
< X-Frame-Options: SAMEORIGIN
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26.hwx)
<
* Connection #0 to host <fqdn> left intact
* Closing connection #0
{"application-id":"application_1495114416899_0001","maximum-resource-capability":{"memory":4096,"vCores":6}}[user@master hdfs]$
05-17-2017
08:09 PM
Hello @shashi kumar @Terry Stebbens, have you found a solution to this? I am getting the same issue, trying locally on a kerberized sandbox. I have run kinit before I run:
curl --negotiate -u : -v -X POST 'http://<fqdn>:8088/ws/v1/cluster/apps/new-application'
and I am getting:
* About to connect() to <fqdn> port 8088 (#0)
* Trying 192.168.33.11... connected
* Connected to <fqdn> (192.168.33.11) port 8088 (#0)
> POST /ws/v1/cluster/apps/new-application HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: <fqdn>:8088
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: no-cache
< Expires: Wed, 17 May 2017 20:03:25 GMT
< Date: Wed, 17 May 2017 20:03:25 GMT
< Pragma: no-cache
< Expires: Wed, 17 May 2017 20:03:25 GMT
< Date: Wed, 17 May 2017 20:03:25 GMT
< Pragma: no-cache
< Content-Type: application/json
< X-Frame-Options: SAMEORIGIN
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26.hwx)
<
* Connection #0 to host <fqdn> left intact
* Closing connection #0
The default static user cannot carry out this operation.
04-17-2017
12:50 PM
@Rajeshbabu Chintaguntla right, sorry, I actually don't understand your first answer. From what I have seen, once the index is changed the flashback query no longer works unless I force it to use no index. So do you mean there is a bug in Phoenix, or that if I waited sufficiently long I would see the flashback query work again?
04-17-2017
12:50 PM
Umm, OK, but isn't the whole point of a flashback / snapshot query that the query goes back to that moment in time, so that any subsequent or current update to HBase will not be reflected in the snapshot query?
04-17-2017
12:50 PM
Hello, I am using HDP 2.5.3 with Phoenix 4.7, and it seems to me that flashback / snapshot queries do not work if the index or the indexed data has changed since. Below is what I have done and observed:
-- creating the schema and table
CREATE SCHEMA IF NOT EXISTS TRADE;
DROP TABLE IF EXISTS TRADE.TRADE;
CREATE TABLE IF NOT EXISTS TRADE.TRADE (ID VARCHAR, BROKER VARCHAR CONSTRAINT pk PRIMARY KEY (ID));
CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
-- insert some data
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('123', 'BROKER123');
1 row affected (0.085 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER456');
1 row affected (0.036 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.107 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------------------+
2 rows selected (0.054 seconds)
-- logged current time
0: jdbc:phoenix:> select cast(CURRENT_TIME() as BIGINT) from TRADE.TRADE limit 1;
+--------------------------------------------+
| TO_BIGINT(TIME '2017-03-28 14:50:38.678') |
+--------------------------------------------+
| 1490706922000 |
+--------------------------------------------+
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- showing what the snapshot data should look like
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.197 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- showing that the select is using the correct index
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+----------------------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1,'BROKER456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------------------+
2 rows selected (0.044 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py
0: jdbc:phoenix:> -- now make changes to the index and indexed data
0: jdbc:phoenix:> DROP INDEX TRADE_BROKER_IDX ON TRADE.TRADE;
2 rows affected (0.748 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', '456BROKER');
1 row affected (0.1 seconds)
0: jdbc:phoenix:> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = '456BROKER';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | 456BROKER |
+------+------------+
1 row selected (0.123 seconds)
0: jdbc:phoenix:master.sandbox.lbg.com:2181/h> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.212 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE |
| SERVER FILTER BY BROKER = 'BROKER456' |
+-----------------------------------------------------------------------+
2 rows selected (0.042 seconds)
0: jdbc:phoenix:> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- I would expect the snapshot to use the index, but in this case it is using the
0: jdbc:phoenix:sandbox:2181/h> -- latest index state - i.e. no index and a full scan
0: jdbc:phoenix:sandbox:2181/h> SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+------+------------+
| ID | BROKER |
+------+------------+
| 456 | BROKER456 |
+------+------------+
1 row selected (0.168 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT ID, BROKER FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER TRADE:TRADE |
| SERVER FILTER BY BROKER = 'BROKER456' |
+-----------------------------------------------------------------------+
2 rows selected (0.035 seconds)
0: jdbc:phoenix:sandbox:2181/h> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- now recreate index and update indexed data yet again
0: jdbc:phoenix:> CREATE LOCAL INDEX TRADE_BROKER_IDX on TRADE.TRADE (BROKER);
2 rows affected (5.23 seconds)
0: jdbc:phoenix:> UPSERT INTO TRADE.TRADE VALUES ('456', 'BROKER-456');
1 row affected (0.021 seconds)
0: jdbc:phoenix:> -- at the latest state, data is correctly retrieved and the index is used in explain
0: jdbc:phoenix:> SELECT * FROM TRADE.TRADE;
+------+-------------+
| ID | BROKER |
+------+-------------+
| 456 | BROKER-456 |
| 123 | BROKER123 |
+------+-------------+
2 rows selected (0.14 seconds)
0: jdbc:phoenix:> EXPLAIN SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER-456';
+-----------------------------------------------------------------------------------------+
| PLAN |
+-----------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [2,'BROKER-456'] |
| SERVER FILTER BY FIRST KEY ONLY |
+-----------------------------------------------------------------------------------------+
2 rows selected (0.047 seconds)
0: jdbc:phoenix:> !quit
[root]$ sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490706922000"
0: jdbc:phoenix:sandbox:2181/h> -- UNEXPECTEDLY NOTHING IS RETURNED!
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE ;
+-----+---------+
| ID | BROKER |
+-----+---------+
+-----+---------+
No rows selected (0.066 seconds)
0: jdbc:phoenix:sandbox:2181/h> EXPLAIN SELECT * FROM TRADE.TRADE ;
+----------------------------------------------------------------------------+
| PLAN |
+----------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER TRADE:TRADE [1] |
| SERVER FILTER BY FIRST KEY ONLY |
+----------------------------------------------------------------------------+
2 rows selected (0.031 seconds)
0: jdbc:phoenix:sandbox:2181/h> SELECT * FROM TRADE.TRADE WHERE BROKER = 'BROKER456';
+-----+---------+
| ID | BROKER |
+-----+---------+
+-----+---------+
No rows selected (0.077 seconds)
0: jdbc:phoenix:sandbox:2181/h> -- AND IT WORKS IF FORCING PHOENIX TO DO FULL SCAN
0: jdbc:phoenix:sandbox:2181/h> SELECT /*+NO_INDEX*/ * FROM TRADE.TRADE ;
+------+------------+
| ID | BROKER |
+------+------------+
| 123 | BROKER123 |
| 456 | BROKER456 |
+------+------------+
2 rows selected (0.07 seconds)
Has anyone had the same issue? How did you resolve it, or is this a Phoenix bug? If it is a bug, is there a plan to fix it, and in which HDP / Phoenix version? Thank you!
03-29-2017
11:14 AM
1 Kudo
After looking at the source and this JIRA, I have finally got round to doing it in spark-shell:
import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil
import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.spark._
val conf = new Configuration
conf.setLong(PhoenixConfigurationUtil.CURRENT_SCN_VALUE, 1490706922000L)
sqlContext.phoenixTableAsDataFrame("TRADE.TRADE", Array("ID", "BROKER"), conf = conf).show
03-28-2017
01:54 PM
1. “PARALLEL 9-WAY FULL SCAN” - the 9 is the SALT_BUCKETS you specified in your CREATE TABLE or CREATE INDEX, correct? From my limited experience on the sandbox, 32 seems marginally faster than 8; I have yet to try giving it more to see whether it gets faster still.
2. Also, this is a full scan; if you can get it into a "RANGE SCAN", it should be faster:
+---------------------------------------------------------------------------------------------+
| PLAN |
+---------------------------------------------------------------------------------------------+
| CLIENT 32-CHUNK PARALLEL 32-WAY ROUND ROBIN RANGE SCAN OVER SCHEMA:TABLE [1,'PREDICATE'] |
| SERVER FILTER BY FIRST KEY ONLY |
+---------------------------------------------------------------------------------------------+
3. It also seems "CHUNK" goes together with the salt buckets (a minimal sketch below shows where SALT_BUCKETS is set). Please correct me if I am wrong.
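For illustration, a minimal sketch of where the salt bucket count comes from (schema, table and column names are made up; SALT_BUCKETS at CREATE TABLE time is what the N-CHUNK / N-WAY figures in the plans reflect):
-- SALT_BUCKETS pre-splits the table and sets the scan parallelism
CREATE TABLE IF NOT EXISTS MY_SCHEMA.MY_TABLE (
    ID VARCHAR PRIMARY KEY,
    PREDICATE_COL VARCHAR
) SALT_BUCKETS = 32;
-- indexing the predicate column is what turns the FULL SCAN into a RANGE SCAN
CREATE LOCAL INDEX MY_TABLE_IDX ON MY_SCHEMA.MY_TABLE (PREDICATE_COL);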
03-27-2017
10:09 AM
How do you specify / run a flashback / snapshot query in spark-shell, given that phoenix-spark is already set up? Assume it has already been set up with the correct jars to run this:
import org.apache.phoenix.spark._
val df_new = sc.parallelize(Seq((345, "some org"), (456, "ORG_ID_1"))).toDF("ID", "ORG_ID")
df_new.saveToPhoenix("MY_TABLE") // I presume it will be some param. within phoenixTableAsDataFrame?
sqlContext.phoenixTableAsDataFrame("MY_TABLE", Array("ID", "ORG_ID")).show
+-------+---------+
| ID| ORG_ID|
+-------+---------+
| 456| ORD_ID_1|
| 345| some org|
+-------+---------+
Is it possible? Thank you!
Labels:
- Apache Phoenix
- Apache Spark
03-27-2017
10:01 AM
OK thanks, I have finally got it working. It can be run like this, for example:
sqlline.py "sandbox:2181/hbase-secure;currentSCN=1490372958713"
03-24-2017
01:22 PM
Hello, the Phoenix doc here suggests one can run a flashback query by setting the "currentSCN" property in the connection properties. How does one set this property in the sqlline.py command line utility? I am using HDP 2.5.3 / Phoenix 4.7. Thank you!!
06-09-2016
01:15 PM
Just want to add that it seems spark.driver.extraClassPath is not necessary, at least in my case, when I write a file in snappy from Spark using:
rdd.saveAsTextFile(path, SnappyCodec.class)
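For completeness, a minimal sketch of that write in Scala (the output path is illustrative, and the native snappy libraries must still be visible to the executors, per the rest of this thread):
import org.apache.hadoop.io.compress.SnappyCodec
// Passing the codec class makes Spark compress each output part file with Snappy.
rdd.saveAsTextFile("/tmp/snappy-out", classOf[SnappyCodec])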
06-08-2016
01:09 PM
Thanks @Rajkumar Singh. I tried mapred.child.java.opts and a few of the mapreduce settings suggested by @Jitendra Yadav, but in the end just adding spark.executor.extraLibraryPath did the job.
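For anyone else landing here, this is roughly what that looks like on the command line (a sketch: the library path is HDP's usual native dir, and the class / jar names are placeholders):
spark-submit \
  --master yarn-client \
  --conf spark.executor.extraLibraryPath=/usr/hdp/current/hadoop-client/lib/native \
  --class com.example.MySnappyJob \
  my-snappy-job.jar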
06-08-2016
09:10 AM
Yes it is - as I said, I was able to save in local mode.
06-08-2016
08:59 AM
1 Kudo
Hello, I have a Spark job that is run by spark-submit, and at the end it saves some output using SparkContext's saveAsTextFile with SnappyCodec. To test, I am using the 2.3.4 sandbox. This runs fine with master = local[x], after following the suggestion in this thread. However, once I change to master = yarn-client I get the same "native snappy library not available: this version of libhadoop was built without snappy support." error on YARN, even though I thought I had done all the necessary setup - any suggestions welcome! When I check the Spark History Server I can see the relevant Spark properties and system properties in the environment (screenshots omitted), and in Ambari -> YARN -> Configs -> Advanced yarn-env -> yarn-env template, LD_LIBRARY_PATH is set (screenshot omitted). Anything else I could do to make snappy available as a compression codec on YARN? Thank you.
Labels:
- Apache Spark
- Apache YARN
06-07-2016
12:54 PM
@Jitendra Yadav - yes, this worked, thanks! This is what the process looks like when I run ps:
root 17484 1 99 13:47 pts/0 00:00:59 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-0.b17.el6_7.x86_64/bin/java -server -XX:NewRatio=3 -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit -XX:CMSInitiatingOccupancyFraction=60 -Dsun.zip.disableMemoryMapping=true -Xms512m -Xmx2048m -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native -cp /etc/ambari-server/conf:/usr/lib/ambari-server/*:/usr/share/java/postgresql-jdbc.jar org.apache.ambari.server.controller.AmbariServer
It seems setting -Djava.library.path is the only thing required - I subsequently removed the snappy link in /usr/lib/ambari-server/ and can confirm it still works.
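For reference, a sketch of where that JVM argument can live persistently (the file location and AMBARI_JVM_ARGS variable are Ambari's defaults on this sandbox; verify for your version):
# /var/lib/ambari-server/ambari-env.sh
# Append the native library dir so the Ambari server JVM can load libsnappy.
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native"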
06-07-2016
11:54 AM
Thanks Scott, yes this is what I think as well, and therefore I have linked the snappy jars into the Ambari server lib. Re the HCC question you posted - that helped me earlier and I was able to save out an RDD in Spark as a compressed snappy file, but how do you make the Ambari HDFS Files view use the snappy lib (I can only think of LD_LIBRARY_PATH or the Java classpath)?
06-07-2016
11:42 AM
Hello, I am trying to get File Preview for snappy files to work in Ambari -> HDFS Files. Could it be made to work? I assume it could, because it seems to try decoding but fails to find the snappy codec. This is running on a 2.3.4 sandbox. I am getting: 500 native snappy library not available: this version of libhadoop was built without snappy support. I have tried a few things: adding the snappy natives to LD_LIBRARY_PATH in /etc/profile, and linking the snappy jars into the Ambari server lib:
cd /usr/lib/ambari-server/
ln -s /usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar .
But I am still getting the same error. Any ideas would be great! Thanks! Full stack trace:
500 native snappy library not available: this version of libhadoop was built without snappy support.
java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157)
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163)
at org.apache.ambari.view.filebrowser.FilePreviewService.previewFile(FilePreviewService.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:540)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:715)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1496)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.apache.ambari.server.security.authorization.AmbariAuthorizationFilter.doFilter(AmbariAuthorizationFilter.java:196)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:150)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:72)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:47)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:109)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.apache.ambari.server.security.AbstractSecurityHeaderFilter.doFilter(AbstractSecurityHeaderFilter.java:109)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:216)
at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:205)
at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:152)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Labels:
- Apache Ambari
05-16-2016
11:53 AM
Hello, I had forgotten about this, but in the end I actually got it to work. What needed to be done was to use kadmin to create a new keytab and add the principal ambari-server@KRB.HDP to it. It also needed a full restart of the sandbox. See Setup Kerberos for Ambari Server. Thanks to @Geoffrey Shelton Okot for pointing in the right direction.
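For anyone following along, a sketch of the kadmin steps I mean, mirroring the syntax used earlier in this thread (the keytab path is a placeholder):
kadmin.local -q "addprinc -randkey ambari-server@KRB.HDP"
kadmin.local -q "xst -k /etc/security/keytabs/ambari.server.keytab ambari-server@KRB.HDP"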
03-21-2016
03:56 PM
1 Kudo
@Rahul Pathak - yes I have; as I said, they don't seem to work...
03-21-2016
03:35 PM
1 Kudo
Hello, I am using the HDP 2.3.4 sandbox, which has been kerberized. I have loaded some struct and array typed data into Hive, where the schema looks like this: exampletable
|-- listOfPeople: array (nullable = false)
| |-- element: struct (containsNull = true)
| | |-- Name: string (nullable = false)
| | |-- id: integer (nullable = false)
| | |-- Email: string (nullable = false)
| | |-- holiday: array (nullable = false)
| | | |-- element: integer (containsNull = true)
|-- departmentName: string (nullable = false)
, and I am trying to run this query in Hive View: SELECT explode(listofpeople.name) AS name FROM exampletable; with these Hive View settings (screenshot omitted). However, I am getting this: INFO : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1458555995218_0002 failed 2 times due to AM Container for appattempt_1458555995218_0002_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/cluster/app/application_1458555995218_0002Then, click on links to logs of each attempt.
Diagnostics: Application application_1458555995218_0002 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is hive
main : requested yarn user is hive
Requested user hive is not whitelisted and has id 504,which is below the minimum allowed 1000
Failing this attempt. Failing the application.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:726)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:217)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:271)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:151)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1703)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1460)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1101)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1096)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
This thread kind of says this is the desired behaviour, and Google suggests changing allowed.system.users in yarn-site (but that doesn't seem to work). If I just want to run the query successfully on the sandbox, what needs to be done? Or what is the best-practice solution for this? Thank you.
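For reference, the whitelist and minimum-uid checks behind that error are enforced by YARN's Linux container executor rather than by yarn-site directly, which may be why editing yarn-site alone seemed to have no effect. A sketch of the relevant container-executor.cfg keys (on HDP these are managed through Ambari's YARN configs and need a NodeManager restart; the values are illustrative for the sandbox):
# container-executor.cfg
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
# either lower the threshold so uid 504 (hive on the sandbox) passes the check...
min.user.id=500
# ...or explicitly whitelist the hive user:
allowed.system.users=hive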
Labels:
- Apache Ambari
- Apache Hive
- Apache Tez