Member since: 06-09-2016
Posts: 529
Kudos Received: 129
Solutions: 104
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1733 | 09-11-2019 10:19 AM |
| | 9325 | 11-26-2018 07:04 PM |
| | 2486 | 11-14-2018 12:10 PM |
| | 5320 | 11-14-2018 12:09 PM |
| | 3145 | 11-12-2018 01:19 PM |
05-16-2018
07:18 PM
@David Sandoval What version of HDP are you running this with? I believe the missing class was only added starting with HDP 2.6.1. I also noticed you are using Spark 2.1 with Scala 2.10 - Spark 2.1.0 is built against Scala 2.11, so you should change that as well. HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
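As a quick way to confirm which Spark and Scala versions your session is actually running, a minimal sketch like this can be pasted into spark-shell or a Zeppelin %spark paragraph (the printed values will of course vary with your build):

// print the Spark version of the running session
println(spark.version)
// print the Scala version the shell was launched with, e.g. "version 2.11.8"
println(util.Properties.versionString)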
05-16-2018
06:09 PM
@Clément Dumont - If I'm correct, those errors are showing up during the Ambari startup operations for HDFS and Hive. This means Ambari is trying to reach the Ranger Admin UI and failing to communicate for some reason. Ambari uses the following configuration settings for the URL: ranger.plugin.hdfs.policy.rest.url for HDFS and ranger.plugin.hive.policy.rest.url for Hive. I suggest you check under HDFS > Configs that ranger.plugin.hdfs.policy.rest.url points correctly to the Ranger UI URL, and likewise under Hive > Configs that ranger.plugin.hive.policy.rest.url points correctly to the Ranger UI URL. HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
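As a basic connectivity sanity check from the affected host, a minimal Scala sketch like the one below can confirm whether the Ranger Admin UI is reachable at all (the URL is a placeholder - substitute your actual ranger.plugin.hdfs.policy.rest.url value):

import java.net.{HttpURLConnection, URL}

// placeholder - use the value of ranger.plugin.hdfs.policy.rest.url
val rangerUrl = new URL("http://ranger-admin.example.com:6080")
val conn = rangerUrl.openConnection().asInstanceOf[HttpURLConnection]
conn.setConnectTimeout(5000)  // fail fast if the host is unreachable
conn.setRequestMethod("GET")
println(s"Ranger Admin responded with HTTP ${conn.getResponseCode}")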
05-15-2018
09:17 PM
@Abdul Rahim It seems the problem you have now is different. It looks like the input data is not being split on commas. Make sure the map _.split(",") is actually working, because it seems it is not in your case. Also, please accept the other answer I provided, as it solved the original parsing issue you had.
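As a sanity check, you can try the split on a single sample line before running the full job (a minimal sketch with a made-up line in the expected five-column format):

// made-up sample line - replace with a real line from your file
val line = "1,Fruit of the Loom Girls Socks,7.97,0.6,8.57"
val fields = line.split(",").map(_.trim)
require(fields.length == 5, s"expected 5 fields but got ${fields.length}")
fields.foreach(println)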
05-15-2018
02:24 PM
1 Kudo
@Abdul Rahim The error is caused because you are parsing a string that contains a double into a long. Instead, you should parse it into a double. The following code works fine for me:

// case class describing one CSV row
case class Person(index: Long, item: String, cost: Double, Tax: Double, Total: Double)

// read the CSV, trim each field, and map the columns onto Person
val peopleDs = sc.textFile("hdpcd/Samplecsv")
  .map(_.split(",").map(_.trim))
  .map(attributes => Person(attributes(0).toLong, attributes(1), attributes(2).toDouble, attributes(3).toDouble, attributes(4).toDouble))
  .toDF()

peopleDs.createOrReplaceTempView("people")
val res = spark.sql("SELECT * FROM people")
res.collect()
Results:

defined class Person
peopleDs: org.apache.spark.sql.DataFrame = [index: bigint, item: string ... 3 more fields]
res: org.apache.spark.sql.DataFrame = [index: bigint, item: string ... 3 more fields]
res24: Array[org.apache.spark.sql.Row] = Array([1,Fruit of the Loom Girls Socks,7.97,0.6,8.57], [2,Rawlings Little League Baseball,2.97,0.22,3.19], [3,Secret Antiperspirant,1.29,0.1,1.39], [4,Deadpool DVD,14.96,1.12,16.08], [5,Maxwell House Coffee 28 oz,7.28,0.55,7.83], [6,Banana Boat Sunscreen,6.68,0.5,7.18], [7,Wrench Set,10.0,0.75,10.75], [8,M and Mz,8.98,0.67,9.65], [9,Bertoli Alfredo Sauce,2.12,0.16,2.28], [10,Large Paperclips,6.19,0.46,6.65])

Note: If you comment on this post, make sure you tag my name. And if you found this answer helped address your question, please take a moment to login and click the "accept" link on the answer.
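For completeness, here is a minimal REPL illustration of why the original code failed - a string holding a double cannot be parsed with .toLong (the res numbering will vary):

scala> "7.97".toDouble
res0: Double = 7.97

scala> "7.97".toLong
java.lang.NumberFormatException: For input string: "7.97"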
05-08-2018
06:06 PM
@Bhushan Kandalkar The above steps look good to me. Do you see any errors in hiveserver2.log?
05-08-2018
05:59 PM
@Khouloud Landari Do you see it get stuck after those "WARN Service SparkUI could not bind to port 4041" messages? If that is the case, I think the problem may be that it is unable to start an application on YARN. What happens is that Spark 2 pyspark launches a YARN application on your cluster, and that is probably what is failing. Try this command and let me know if it works:

SPARK_MAJOR_VERSION=2 pyspark --master local --verbose

I would also advise you to check the Resource Manager logs; the RM logs can be found on the RM host under /var/log/hadoop-yarn. They will probably show what the problem is with YARN and why your zeppelin user is not able to start applications on the Hadoop cluster. HTH
05-08-2018
02:19 PM
@Bhushan Kandalkar Did you add the hive certificate to the Knox host cacerts and restart Knox? This may help resolve the problem.

# open a console to the knox host
# run the following command to locate the jdk used by knox
ps -ef | grep -i knox
# run the following command to import the hive certificate into the default cacerts truststore
keytool -import -file hive.crt -keystore /<knox_jdk_path>/jre/lib/security/cacerts -storepass changeit -alias hive
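If you want to verify that the certificate actually landed in that truststore, a minimal Scala sketch like the one below checks for the alias (this assumes a Scala REPL is available on the host, that the store password is the default changeit, and reuses the same placeholder path as the keytool command above):

import java.io.FileInputStream
import java.security.KeyStore

// same placeholder path as in the keytool command - substitute the real one
val cacertsPath = "/<knox_jdk_path>/jre/lib/security/cacerts"
val ks = KeyStore.getInstance(KeyStore.getDefaultType)
ks.load(new FileInputStream(cacertsPath), "changeit".toCharArray)  // assumed default truststore password
println(s"hive alias present: ${ks.containsAlias("hive")}")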
Note: If you add any comments to this post, please make sure you tag my name. Also, if you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
05-08-2018
01:49 PM
@RAUI I did run it more than once; I edited the previous comment mentioning the same. No errors even after several executions of the same code. I'm using Spark 2.2.0 on HDP 2.6.4. Could you provide the full error stack? Also, did you use a specific location for the database? And are you running with master yarn or local?
05-08-2018
01:05 PM
@RAUI What version of HDP and Spark are you using? I tested the same using HDP 2.6.4 on Zeppelin and it works fine with Spark 2. I ran the following code more than once and it always completed with no errors:

spark.sql("show databases").show
spark.sql("CREATE DATABASE IF NOT EXISTS abc LOCATION '/user/zeppelin/abc.db'")
+------------+
|databaseName|
+------------+
| abc|
| default|
+------------+
res27: org.apache.spark.sql.DataFrame = []

Please provide the full error stack and details of the Spark/HDP version you are using. Note: Please tag me if you add a comment to this post, using the @ symbol and my name.
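Also worth checking on your side: confirm where the database actually lives, since a stale or inaccessible location can make a re-run fail (a minimal sketch, assuming the same Spark 2 session as above):

// prints the database name, description and location
spark.sql("DESCRIBE DATABASE abc").show(truncate = false)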
05-08-2018
12:52 PM
1 Kudo
@Khouloud Landari The error message is very generic. To help find the solution, please provide the following:
1. Check /var/log/zeppelin/zeppelin-interpreter-spark2-spark-zeppelin-*.log and copy the pieces you consider worth sharing.
2. From the Zeppelin UI > Interpreter, take a screenshot of the spark2 interpreter configuration and share it. Also try restarting the interpreter and check whether that helps.
3. Run the following command from the Zeppelin host and copy the console output to this post: SPARK_MAJOR_VERSION=2 pyspark --master yarn --verbose
With this information we should be able to draw further conclusions as to what is causing the issue. Note: If you add a comment to this post, please make sure you tag me using the @ symbol and my name. That way I will know you have updated the post with more information.