Member since: 06-09-2016
Posts: 529
Kudos Received: 129
Solutions: 104

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1402 | 09-11-2019 10:19 AM
 | 8436 | 11-26-2018 07:04 PM
 | 1990 | 11-14-2018 12:10 PM
 | 4129 | 11-14-2018 12:09 PM
 | 2692 | 11-12-2018 01:19 PM
08-24-2018
01:45 PM
@Manikandan Jeyabal The problem could then be at the project level. Check your pom file and make sure you have all the necessary Spark dependencies. I tried this in the Zeppelin UI and it works fine. Also make sure you clean/build, and perhaps exit Eclipse in case something is stale on the Eclipse side. Finally, here is a link on how to set up the dependencies in Eclipse: https://community.hortonworks.com/articles/147787/how-to-setup-hortonworks-repository-for-spark-on-e.html HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
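For reference, the dependency entries in pom.xml might look like the sketch below; the Spark version and Scala suffix are assumptions, so match them to your cluster:
<!-- Spark version and Scala suffix are assumptions; match your cluster -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>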
08-24-2018
12:32 PM
@Manikandan Jeyabal
Perhaps you can try this out:
import org.apache.spark.sql.Encoders

// Define the record structure as a case class...
case class Airlines(Airline_id: Integer, Name: String, Alias: String, IATA: String, ICAO: String, Callsign: String, Country: String, Active: String)

// ...and derive a Spark SQL schema (StructType) from it
Encoders.product[Airlines].schema
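As a sketch of how the derived schema could then be used, for instance when reading a CSV file (the path and reader options here are assumptions):
val schema = Encoders.product[Airlines].schema
// apply the case-class-derived schema instead of relying on schema inference
val airlinesDF = spark.read.schema(schema).option("header", "false").csv("/path/to/airlines.csv")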
There are also some examples of case class usage in the following Spark example:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala
Let me know if this helps!
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
08-23-2018
12:17 PM
@Quan Pham If the cluster is not secured/kerberized and you have not configured an alternative authentication method such as LDAP, then this is exactly what you will experience. You should consider securing your cluster with Kerberos authentication; you can enable Kerberos using Ambari. HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
08-22-2018
11:06 AM
1 Kudo
@Guozhen Li In yarn-client mode the client machine (your Windows machine) needs network access to all of the cluster worker nodes (the executors and the AM could potentially run on any of them), and vice versa: the executors must be able to connect back to the driver running on the Windows client machine. So I think you are right that this may be due to a firewall or network problem. HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
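If you want to make the firewall rules easier to pin down, you can fix the driver's address and port explicitly; a sketch, where the host name, port, class, and jar names are assumptions:
spark-submit --master yarn --deploy-mode client \
  --conf spark.driver.host=my-windows-client.example.com \
  --conf spark.driver.port=40000 \
  --class com.example.MyApp my-app.jar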
08-21-2018
12:03 PM
@Sudharsan Ganeshkumar If the above has helped, please take a moment to login and click the "accept" link on the answer.
08-21-2018
12:00 PM
@Sundar Gampa If the above helped, please remember to login and click the "accept" link on the answer.
08-17-2018
01:47 PM
@Sundar Gampa That path looks like the Spark container working directory, am I correct? It is taken from the yarn configuration property yarn.nodemanager.local-dirs. Out of the box, Spark provides ways to copy data to this directory via the --files, --jars, and --archives arguments of the spark-submit command. You can read more about those here: https://spark.apache.org/docs/latest/running-on-yarn.html That said, if you would like to add the resdata directory, you simply need to zip the files that should make up the directory and pass the archive to spark-submit as --archives resdata.zip#resdata (the part after # is the name the archive is unpacked under in the container working directory; unlike --files, --archives actually extracts the zip). HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
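As a sketch of the full sequence (the application class and jar names are assumptions):
# zip the directory contents so they unpack directly under the alias
cd resdata && zip -r ../resdata.zip . && cd ..
spark-submit --master yarn --deploy-mode cluster \
  --archives resdata.zip#resdata \
  --class com.example.MyApp my-app.jar
Inside the application the files are then reachable under the relative path resdata/ in each container's working directory.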
08-17-2018
01:35 PM
@Xiong Duan Based on the error stack I think it could be due to missing configuration. It would be helpful if you shared your workflow file and properties file. Have you tried using a shell action instead?
08-17-2018
12:20 PM
@Narendra Dev When you defined your collection fields, did you add the path as a field? Or are you using dynamic fields? As an example, in your core's managed-schema you should have a field like this: <field name="path" type="string" indexed="true" stored="true" required="true" multiValued="false" /> If you are using dynamic fields, make sure the field is stored, otherwise it won't be returned when you search. And last but not least, make sure you are passing the path as part of the JSON/XML document that is being posted to Solr. HTH
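For example, posting a document that carries the path field could look like this (the host, collection name, and field values are assumptions):
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/mycollection/update?commit=true' \
  -d '[{"id": "1", "path": "/data/docs/file1.pdf"}]'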
08-17-2018
12:04 PM
@Sudharsan Ganeshkumar AFAIK the CSV format is not compatible between Spark SQL and the Hive serde, hence the error you are getting. A solution to this problem would be to:
1. Create an external table pointing to the path where you will save the CSV file.
2. Save the CSV file to that path instead of using the saveAsTable function.
spark.sql("CREATE EXTERNAL TABLE Student_Spark2(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '/path/in/hdfs/student_spark2'")
// later, save the data as CSV into the table's location (it must match the LOCATION above)
rddstudent.write.format("csv").save("/path/in/hdfs/student_spark2")
HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
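Once the files land in that location, a quick sanity check from Spark SQL:
spark.sql("SELECT * FROM Student_Spark2").show()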