Member since: 03-25-2016
Posts: 142
Kudos Received: 48
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
| 5793 | 06-13-2017 05:15 AM
| 1906 | 05-16-2017 05:20 AM
| 1344 | 03-06-2017 11:20 AM
| 7915 | 02-23-2017 06:59 AM
| 2226 | 02-20-2017 02:19 PM
07-13-2017
03:06 PM
Hi Abraham, the Spark interpreter is not impersonated. Uncheck <User Impersonate>, restart the interpreter and try again.
07-13-2017
04:02 AM
@Miles Yao Good catch! Just updated. The Phoenix jar is here to work with the JDBC interpreter rather than Spark.
07-11-2017
03:03 AM
@Gaurav Mallikarjuna In the above example you can see that I used another method to connect to HiveServer2 - the hive2 node plus its port number:
$ beeline -u "jdbc:hive2://dkhdp261c6.openstacklocal:10000/" -n admin
Using admin is for my sample only. In your case - if your transport mode is binary and the cluster is NOT Kerberized:
$ beeline -u "jdbc:hive2://<hiveserver2-hostname>:10000/" -n <username>
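To make the difference between the two connection styles discussed here explicit, below is a minimal Python sketch (not part of the original post) that builds both HiveServer2 JDBC URL forms. The helper names and example hostnames are illustrative only.

```python
# Sketch: the two HiveServer2 JDBC URL forms used in this thread.
# binary_url  -> direct connection to one HS2 instance (binary transport mode)
# zookeeper_url -> dynamic service discovery via a ZooKeeper quorum

def binary_url(host, port=10000):
    """Direct (binary transport mode) connection to a single HiveServer2."""
    return f"jdbc:hive2://{host}:{port}/"

def zookeeper_url(zk_hosts, zk_port=2181, namespace="hiveserver2"):
    """Service discovery: the client asks ZooKeeper which HS2 is alive."""
    quorum = ",".join(f"{h}:{zk_port}" for h in zk_hosts)
    return (f"jdbc:hive2://{quorum}/;serviceDiscoveryMode=zooKeeper;"
            f"zooKeeperNamespace={namespace}")

print(binary_url("dkhdp261c6.openstacklocal"))
# -> jdbc:hive2://dkhdp261c6.openstacklocal:10000/
```

Either string can then be passed to beeline with `-u "<url>" -n <username>`.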
07-10-2017
10:46 AM
@Gaurav Mallikarjuna I tested the same on my HDP 2.6.1 and could not see any issues:
[root@dkhdp262c6 ~]# beeline -u "jdbc:hive2://dkhdp263c6.openstacklocal:2181,dkhdp262c6.openstacklocal:2181,dkhdp261c6.openstacklocal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -n admin
Connecting to jdbc:hive2://dkhdp263c6.openstacklocal:2181,dkhdp262c6.openstacklocal:2181,dkhdp261c6.openstacklocal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Connected to: Apache Hive (version 1.2.1000.2.6.1.0-129)
Driver: Hive JDBC (version 1.2.1000.2.6.1.0-129)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.6.1.0-129 by Apache Hive
0: jdbc:hive2://dkhdp263c6.openstacklocal:218> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
+----------------+--+
1 row selected (0.305 seconds)
0: jdbc:hive2://dkhdp263c6.openstacklocal:218>
This is a non-Kerberized environment though. One more thing: I have the transport mode set to binary. What is yours? If your environment is also non-Kerberized and the Hive transport mode is binary, try the following:
beeline -u "jdbc:hive2://dkhdp261c6.openstacklocal:10000/" -n admin
The above uses the hostname where your HiveServer2 is installed plus its port number. Here is how this works on my end:
[root@dkhdp262c6 ~]# beeline -u "jdbc:hive2://dkhdp261c6.openstacklocal:10000/" -n admin
Connecting to jdbc:hive2://dkhdp261c6.openstacklocal:10000/
Connected to: Apache Hive (version 1.2.1000.2.6.1.0-129)
Driver: Hive JDBC (version 1.2.1000.2.6.1.0-129)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.6.1.0-129 by Apache Hive
0: jdbc:hive2://dkhdp261c6.openstacklocal:100> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
+----------------+--+
1 row selected (0.29 seconds)
0: jdbc:hive2://dkhdp261c6.openstacklocal:100>
06-18-2017
05:52 AM
1 Kudo
Hi @suyash soni Unfortunately, this feature has not yet been implemented. It will be available in Zeppelin 0.8, based on https://issues.apache.org/jira/browse/ZEPPELIN-2368.
06-13-2017
08:11 AM
Hi @Jayadeep Jayaraman That is great - thanks for letting me know.
06-13-2017
05:59 AM
@Jayadeep Jayaraman It is good to hear the sample works. I have a feeling the problem may be with the way you created your original table. Hence, try another thing - point your code to the test_orc_t_string table, the one from my sample above, and check if that works.
06-13-2017
05:32 AM
Hi @Jayadeep Jayaraman I have just done another test - treating the timestamp as a string. That works for me as well. See below:
beeline
> create table test_orc_t_string (b string,t timestamp) stored as ORC;
> insert into table test_orc_t_string values('a', '1969-06-19 06:57:26.485'),('b','1988-06-21 05:36:22.35');
> select * from test_orc_t_string;
+----------------------+--------------------------+--+
| test_orc_t_string.b | test_orc_t_string.t |
+----------------------+--------------------------+--+
| a | 1969-06-19 06:57:26.485 |
| b | 1988-06-21 05:36:22.35 |
+----------------------+--------------------------+--+
2 rows selected (0.128 seconds)
pyspark
>>> sqlContext.sql("select * from test_orc_t_string").show()
+---+--------------------+
| b| t|
+---+--------------------+
| a|1969-06-19 06:57:...|
| b|1988-06-21 05:36:...|
+---+--------------------+
Can you test the above at your site? Let me know how it works. Can you also send me the output of the below from beeline:
show create table test;
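The literals inserted above ('1969-06-19 06:57:26.485', '1988-06-21 05:36:22.35') follow the yyyy-MM-dd HH:mm:ss[.fff] layout that Hive converts into a timestamp. As a quick side check, here is a small Python sketch (not from the original post; the helper name is made up) that verifies a string literal matches that layout before you insert it:

```python
# Sketch: check whether a string literal matches Hive's timestamp layout
# "yyyy-MM-dd HH:mm:ss" with optional fractional seconds.
from datetime import datetime

def looks_like_hive_timestamp(literal):
    """Return True if the literal parses with or without fractional seconds."""
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            datetime.strptime(literal, fmt)
            return True
        except ValueError:
            pass
    return False

print(looks_like_hive_timestamp("1969-06-19 06:57:26.485"))  # True
print(looks_like_hive_timestamp("19/06/1969 06:57"))         # False
```

A literal that fails this check would end up NULL (or error out) when Hive coerces it into a timestamp column.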
06-13-2017
05:15 AM
Hi @Jayadeep Jayaraman I have just tested the same in pyspark 2.1. That works fine at my site. See below:
beeline
0: jdbc:hive2://dkhdp262.openstacklocal:2181,> create table test_orc (b string,t timestamp) stored as ORC;
0: jdbc:hive2://dkhdp262.openstacklocal:2181,> select * from test_orc;
+-------------+------------------------+--+
| test_orc.b | test_orc.t |
+-------------+------------------------+--+
| a | 2017-06-13 05:02:23.0 |
| b | 2017-06-13 05:02:23.0 |
| c | 2017-06-13 05:02:23.0 |
| d | 2017-06-13 05:02:23.0 |
| e | 2017-06-13 05:02:23.0 |
| f | 2017-06-13 05:02:23.0 |
| g | 2017-06-13 05:02:23.0 |
| h | 2017-06-13 05:02:23.0 |
| i | 2017-06-13 05:02:23.0 |
| j | 2017-06-13 05:02:23.0 |
+-------------+------------------------+--+
10 rows selected (0.091 seconds)
pyspark
[root@dkhdp262 ~]# export SPARK_MAJOR_VERSION=2
[root@dkhdp262 ~]# pyspark
SPARK_MAJOR_VERSION is set to 2, using Spark2
Python 2.7.5 (default, Jun 17 2014, 18:11:42)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 2.1.1.2.6.1.0-129
/_/
Using Python version 2.7.5 (default, Jun 17 2014 18:11:42)
SparkSession available as 'spark'.
>>> sqlContext.sql("select b, t from test_orc").show()
+---+--------------------+
| b| t|
+---+--------------------+
| a|2017-06-13 05:02:...|
| b|2017-06-13 05:02:...|
| c|2017-06-13 05:02:...|
| d|2017-06-13 05:02:...|
| e|2017-06-13 05:02:...|
| f|2017-06-13 05:02:...|
| g|2017-06-13 05:02:...|
| h|2017-06-13 05:02:...|
| i|2017-06-13 05:02:...|
| j|2017-06-13 05:02:...|
+---+--------------------+
Based on the error you have: is the timestamp value in your table a real timestamp? How did you insert it?
06-05-2017
01:13 PM
2 Kudos
ENVIRONMENT
HDP-2.6.0.3, Ambari 2.5.0.3
SOLUTION
1. Install R on each DN
$ yum install R-devel libcurl-devel openssl-devel
2. Run on each DN
$ R
> install.packages("knitr")
3. Test R from the CLI
[root@dghdp255 ~]# R -e "print(1+1)"
R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> print(1+1)
[1] 2
>
>
[root@dghdp255 ~]#
4. Zeppelin UI
a) spark2 config
SPARK_HOME /usr/hdp/current/spark2-client/
args
master yarn-client
spark.app.name Zeppelin
spark.cores.max
spark.executor.memory
spark.yarn.keytab /etc/security/keytabs/zeppelin.server.kerberos.keytab
spark.yarn.principal zeppelin-emeasupport@HWX.COM
zeppelin.R.cmd R
zeppelin.R.image.width 100%
zeppelin.R.knitr true
zeppelin.R.render.options out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
zeppelin.dep.additionalRemoteRepository spark-packages,http://dl.bintray.com/spark-packages/maven,false;
zeppelin.dep.localrepo local-repo
zeppelin.interpreter.localRepo /usr/hdp/current/zeppelin-server/local-repo/2CHXWU7YZ
zeppelin.pyspark.python python
zeppelin.spark.concurrentSQL false
zeppelin.spark.importImplicit true
zeppelin.spark.maxResult 1000
zeppelin.spark.printREPLOutput true
zeppelin.spark.sql.stacktrace false
zeppelin.spark.useHiveContext true
b) test R from the Zeppelin UI
c) create a test CSV file on the OS (Zeppelin node)
[root@dghdp254 ~]# ls -lrt /tmp/test.csv
-rw-r--r--. 1 root root 1326 Jun 6 07:07 /tmp/test.csv
d) check reading the file from the R CLI
[root@dghdp254 ~]# R
R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> a<-read.csv("/tmp/test.csv")
> print(a)
[1] Test.File
<0 rows> (or 0-length row.names)
>
e) restart the spark2 interpreter and run the below:
%spark2.r
a<-read.csv("/tmp/test.csv")
print(a)
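Steps (c)-(e) assume a small test CSV already exists on the Zeppelin node. If you need to generate one, here is an illustrative Python sketch (not part of the original post; the path and header name mirror the session above but are examples only):

```python
# Sketch: create a header-only CSV like the one read in steps (c)-(e).
# A header-only file is why R's read.csv printed a 0-row data frame above.
import csv
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "test.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(["Test.File"])  # single header row, no data rows

with open(path) as f:
    print(f.read().strip())  # -> Test.File
```

Point read.csv at the printed path and you should see the same empty data frame output as in step (d).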