Spark + Scala schema creation error
Labels: Apache Spark, Schema Registry
Created on ‎08-24-2018 01:09 AM - edited ‎08-17-2019 06:26 PM
Hi All,
I'm trying to convert a case class to a StructType schema in Spark, but I'm getting the error attached in the image.
Please find my case class and conversion approach below:
case class Airlines(Airline_id: Integer, Name: String, Alias: String, IATA: String, ICAO: String, Callsign: String, Country: String, Active: String)
val AirlineSchema = ScalaReflection.schemaFor[Airlines].dataType.asInstanceOf[StructType]
Reference URL:
https://stackoverflow.com/questions/36746055/generate-a-spark-structtype-schema-from-a-case-class
Cheers,
MJ
Created ‎08-24-2018 12:32 PM
Perhaps you can try this out:
import org.apache.spark.sql.Encoders

case class Airlines(Airline_id: Integer, Name: String, Alias: String, IATA: String, ICAO: String, Callsign: String, Country: String, Active: String)

Encoders.product[Airlines].schema
There are also some examples of case class usage in the following example:
Let me know if this helps!
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
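To make the suggestion above concrete, here is a minimal, self-contained sketch of the Encoders approach (the object name SchemaDemo and the printTreeString call are illustrative additions; note the case class is declared at the top level of the file, not inside a method, so that the encoder can be derived):

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.StructType

// Top-level case class: Encoders.product needs an implicit TypeTag,
// which Scala can only provide for classes defined outside a method body.
case class Airlines(Airline_id: Integer, Name: String, Alias: String,
                    IATA: String, ICAO: String, Callsign: String,
                    Country: String, Active: String)

object SchemaDemo {
  def main(args: Array[String]): Unit = {
    // No SparkSession is needed just to derive the schema.
    val schema: StructType = Encoders.product[Airlines].schema
    schema.printTreeString() // prints one "|-- field: type" line per column
  }
}
```

Only spark-sql needs to be on the classpath for this to run.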
Created on ‎08-24-2018 01:32 PM - edited ‎08-17-2019 06:26 PM
Hi @Felix Albani,
Thanks for your response. Actually, I've tried all the possible options;
please find the attached image for reference. Is there any other way I can solve my issue?
Cheers,
MJ
Created on ‎08-24-2018 01:45 PM - edited ‎08-17-2019 06:26 PM
The problem could perhaps be at the project level then. Could you check your pom file and make sure you have all the necessary Spark dependencies? I tried this in the Zeppelin UI and it works fine:
Also make sure you clean/build, and perhaps exit Eclipse, just in case something is wrong with Eclipse.
Finally, here is a link on how to set up dependencies in Eclipse:
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
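For reference, a minimal build.sbt that pulls in the Spark modules used in this thread could look like the sketch below (the project name and versions are illustrative; match the versions to your cluster):

```scala
// build.sbt (sketch; adjust versions to match your cluster)
name := "spark-schema-test"
version := "0.1.0"
scalaVersion := "2.11.8"

val sparkVersion = "2.2.1"

// spark-sql brings in Encoders and the StructType/schema machinery
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql"  % sparkVersion
)
```

If the jar is submitted with spark-submit, the same dependencies can be marked "provided" to keep the assembly small.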
Created ‎08-24-2018 05:41 PM
Just for clarification, will this work only with Hortonworks dependencies? Please find my build.sbt dependencies below and let me know whether I need to add anything.
val sparkVersion = "2.2.1"
val hadoopVersion = "2.7.1"
val poiVersion = "3.9"
val avroVersion = "1.7.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-hive" % sparkVersion,
  "org.apache.poi" % "poi-ooxml" % poiVersion,
  "org.apache.poi" % "poi" % poiVersion,
  "org.apache.avro" % "avro" % avroVersion
)
Created ‎08-24-2018 05:51 PM
Hi @Felix Albani,
Still the same issue exists; please find my build.sbt and sample code attached.
import sbt._
import sbt.Keys._

name := "BackupSnippets"
version := "1.0"
scalaVersion := "2.11.8"

val sparkVersion = "2.2.1"
val hadoopVersion = "2.7.1"
val poiVersion = "3.9"
val avroVersion = "1.7.6"
val hortonworksVersion = "2.2.0.2.6.3.79-2"

conflictManager := ConflictManager.latestRevision

/*libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-hive" % sparkVersion,
  "org.apache.poi" % "poi-ooxml" % poiVersion,
  "org.apache.poi" % "poi" % poiVersion,
  "org.apache.avro" % "avro" % avroVersion
)*/

/*resolvers ++= Seq(
  "Typesafe repository" at "https://repo.typesafe.com/typesafe/releases/",
  "Typesafe Ivyrepository" at "https://repo.typesafe.com/typesafe/ivy-releases/",
  "Maven Central" at "https://repo1.maven.org/maven2/",
  "Sonatype snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
  Resolver.sonatypeRepo("releases")
)*/

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % hortonworksVersion,
  "org.apache.spark" %% "spark-sql" % hortonworksVersion,
  "org.apache.spark" %% "spark-hive" % hortonworksVersion
)

resolvers ++= Seq(
  "Hortonworks Releases" at "http://repo.hortonworks.com/content/repositories/releases/",
  "Jetty Releases" at "http://repo.hortonworks.com/content/repositories/jetty-hadoop/"
)
************************************************************************************
package BigData101.ORC

import ScalaUtils.SchemaUtils
import org.apache.spark.sql.{Encoders, Row, SparkSession}
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.types.StructType

object ORCTesting {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("ORC Testing")
      .enableHiveSupport()
      .getOrCreate()

    case class Airlines(Airline_id: Integer, Name: String, Alias: String, IATA: String, ICAO: String, Callsign: String,
                        Country: String, Active: String)

    //val AirlineSchema = ScalaReflection.schemaFor[Airlines].dataType.asInstanceOf[StructType]
    Encoders.product[Airlines].schema

    sys.exit(1)
  }
}
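One thing worth checking in the snippet above: the case class is declared inside main, and both ScalaReflection.schemaFor and Encoders.product need an implicit TypeTag, which Scala cannot derive for a class local to a method. A sketch of the restructured object with the case class moved to the top level is below (the package declaration, enableHiveSupport, and the unused imports are dropped here only so the sketch stays self-contained; printing the schema is an illustrative addition):

```scala
import org.apache.spark.sql.{Encoders, SparkSession}

// Moved out of main: TypeTag derivation (required by Encoders.product
// and ScalaReflection.schemaFor) fails for method-local case classes.
case class Airlines(Airline_id: Integer, Name: String, Alias: String,
                    IATA: String, ICAO: String, Callsign: String,
                    Country: String, Active: String)

object ORCTesting {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ORC Testing")
      .getOrCreate()

    // Derive the StructType from the case class and show it
    val schema = Encoders.product[Airlines].schema
    schema.printTreeString()

    spark.stop()
  }
}
```

Declaring the case class in a top-level object (or at file scope, as here) is usually enough to make the encoder derivation succeed.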
Created ‎08-24-2018 06:21 PM
I think there may be a project-level problem on your side. Therefore I'm sharing a very simple dummy project that compiles fine in my environment (this is only shared as an example and is not meant for real production use):
Hopefully with this project you will be able to figure out what is wrong by comparing it with yours. Note: you can use the command-line sbt compile and sbt package to get the jar.
This is the output I get when I run it:
SPARK_MAJOR_VERSION=2 spark-submit /root/projects/sparktest/target/scala-2.11/hello_2.11-0.1.0-SNAPSHOT.jar Hello
SPARK_MAJOR_VERSION is set to 2, using Spark2
hello
root
 |-- Airline_id: integer (nullable = true)
 |-- Name: string (nullable = true)
 |-- Alias: string (nullable = true)
 |-- IATA: string (nullable = true)
 |-- ICAO: string (nullable = true)
 |-- Callsign: string (nullable = true)
 |-- Country: string (nullable = true)
 |-- Active: string (nullable = true)
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
Created ‎08-25-2018 06:40 AM
Yes, your suggestion works fine. I think I have to extend my object as App to make it work. Thanks, bro.
