Member since: 04-22-2016
Posts: 931
Kudos Received: 46
Solutions: 26
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1498 | 10-11-2018 01:38 AM |
|  | 1866 | 09-26-2018 02:24 AM |
|  | 1825 | 06-29-2018 02:35 PM |
|  | 2416 | 06-29-2018 02:34 PM |
|  | 5361 | 06-20-2018 04:30 PM |
09-29-2016
03:32 PM
Also, if you look at the message, it says "SPARK_MAJOR_VERSION is set to 2".
09-29-2016
03:30 PM
Hi lgeorge, if you look at my message, I am showing that the variable is set, so this is not the issue.
09-29-2016
02:59 PM
I upgraded my HDP 2.4 to HDP 2.5 and it installed Spark2 successfully; the HDP console shows it as green with no errors. But when I check the version on the command line, it still reports 1.6.2:
[root@hadoop5 ~]# spark-shell --version
SPARK_MAJOR_VERSION is set to 2, using Spark2
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.2
      /_/
Type --help for more information.
[root@hadoop5 ~]# echo $SPARK_MAJOR_VERSION
2
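As a side note, one way to double-check which Spark runtime the shell actually launched, independent of the SPARK_MAJOR_VERSION message, is to ask the running context itself. A minimal sketch, meant to be run inside spark-shell where sc is the shell's predefined SparkContext:

```scala
// Inside spark-shell: report the version of the Spark runtime that actually started.
sc.version                      // e.g. "1.6.2"
org.apache.spark.SPARK_VERSION  // same information, read from the Spark build constants
```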
09-28-2016
02:29 PM
I am using HDP 2.4 with Spark 1.6.2. I need to upgrade Spark to v2.0; can that be done in HDP 2.4? If not, can I install two releases of Spark together on the same machine?
09-27-2016
07:01 PM
I found two solutions on the web to a problem similar to the one I am facing; do they apply to my code?
1. Import implicits: note that this should be done only after an instance of org.apache.spark.sql.SQLContext is created. It should be written as:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
2. Move the case class outside of the method: the case class used to define the schema of the DataFrame should be defined outside of the method that needs it. You can read more about it here: https://issues.scala-lang.org/browse/SI-6649
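A minimal sketch of how those two suggestions fit together in Spark 1.6-style code; the object name ImplicitsSketch and the helper method are illustrative, not taken from the original project:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Suggestion 2: the case class that defines the DataFrame schema lives at the
// top level, outside of any method that uses it.
case class Record(url: String, status: Int, agent: String)

object ImplicitsSketch {
  def toDataFrame(sc: SparkContext, rdd: RDD[(String, Int, String)]): DataFrame = {
    // Suggestion 1: create the SQLContext first, then import its implicits so
    // that .toDF() becomes available on RDDs of case classes.
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._
    rdd.map(t => Record(t._1, t._2, t._3)).toDF()
  }
}
```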
09-27-2016
04:05 PM
I added this line but I am still getting the same error:
[info] Compiling 4 Scala sources to /root/weblogs/target/scala-2.11/classes...
[error] /root/weblogs/src/main/scala/LogSQL.scala:60: value createOrReplaceTempView is not a member of org.apache.spark.sql.DataFrame
[error] requestsDataFrame.createOrReplaceTempView("requests")
[error] ^
[error] one error found
[error] (compile:compile) Compilation failed
[error] Total time: 14 s, completed Sep 27, 2016 12:04:22 PM
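One hedged observation rather than a confirmed fix: createOrReplaceTempView was introduced with Spark 2.0, while the build.sbt posted below pins Spark 1.6.2, whose DataFrame API exposes registerTempTable instead. A minimal self-contained sketch of the 1.6-style call; the sample data, column names, and table name are illustrative:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object TempTableSketch {
  def run(sc: SparkContext): Unit = {
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Illustrative data standing in for the streamed (url, status, agent) tuples.
    val df = sc.parallelize(Seq(("/index.html", 200, "curl"))).toDF("url", "status", "agent")

    // Spark 1.6.x: register the DataFrame as a temp table with registerTempTable;
    // createOrReplaceTempView only exists from Spark 2.0 onward.
    df.registerTempTable("requests")

    sqlContext.sql("select agent, count(*) as total from requests group by agent").show()
  }
}
```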
09-23-2016
10:02 PM
I am getting the following error during compilation; the build.sbt file and the source code are shown below.
[info] Done updating.
[info] Compiling 4 Scala sources to /root/weblogs/target/scala-2.11/classes...
[error] /root/weblogs/src/main/scala/LogSQL.scala:60: value createOrReplaceTempView is not a member of org.apache.spark.sql.DataFrame
[error] requestsDataFrame.createOrReplaceTempView("requests")
[error] ^
[error] one error found
[error] (compile:compile) Compilation failed
Scala code:
[root@hadoop1 scala]# more LogSQL.scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
import org.apache.spark.storage.StorageLevel
import org.apache.spark.sql.SQLContext
import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext
import java.util.regex.Pattern
import java.util.regex.Matcher
import Utilities._

/** Illustrates using SparkSQL with Spark Streaming, to issue queries on
 *  Apache log data extracted from a stream on port 9999.
 */
object LogSQL {
  def main(args: Array[String]) {
    // Create the context with a 1 second batch size
    val ssc = new StreamingContext("local[*]", "LogSQL", Seconds(1))
    setupLogging()

    // Construct a regular expression (regex) to extract fields from raw Apache log lines
    val pattern = apacheLogPattern()

    // Create a socket stream to read log data published via netcat on port 9999 locally
    val lines = ssc.socketTextStream("127.0.0.1", 9998, StorageLevel.MEMORY_AND_DISK_SER)

    // Extract the (URL, status, user agent) we want from each log line
    val requests = lines.map(x => {
      val matcher: Matcher = pattern.matcher(x)
      if (matcher.matches()) {
        val request = matcher.group(5)
        val requestFields = request.toString().split(" ")
        val url = util.Try(requestFields(1)) getOrElse "[error]"
        (url, matcher.group(6).toInt, matcher.group(9))
      } else {
        ("error", 0, "error")
      }
    })

    // Process each RDD from each batch as it comes in
    requests.foreachRDD((rdd, time) => {
      // So we'll demonstrate using SparkSQL in order to query each RDD
      // using SQL queries.

      // Get the singleton instance of SQLContext
      val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext)
      import sqlContext.implicits._

      // SparkSQL can automatically create DataFrames from Scala "case classes".
      // We created the Record case class for this purpose.
      // So we'll convert each RDD of tuple data into an RDD of "Record"
      // objects, which in turn we can convert to a DataFrame using toDF()
      val requestsDataFrame = rdd.map(w => Record(w._1, w._2, w._3)).toDF()

      // Create a SQL table from this DataFrame
      requestsDataFrame.createOrReplaceTempView("requests")

      // Count up occurrences of each user agent in this RDD and print the results.
      // The powerful thing is that you can do any SQL you want here!
      // But remember it's only querying the data in this RDD, from this batch.
      val wordCountsDataFrame =
        sqlContext.sql("select agent, count(*) as total from requests group by agent")
      println(s"========= $time =========")
      wordCountsDataFrame.show()

      // If you want to dump data into an external database instead, check out the
      // org.apache.spark.sql.DataFrameWriter class! It can write dataframes via
      // jdbc and many other formats! You can use the "append" save mode to keep
      // adding data from each batch.
    })

    // Kick it off
    ssc.checkpoint("C:/checkpoint/")
    ssc.start()
    ssc.awaitTermination()
  }
}

/** Case class for converting RDD to DataFrame */
case class Record(url: String, status: Int, agent: String)

/** Lazily instantiated singleton instance of SQLContext
 *  (Straight from included examples in Spark) */
object SQLContextSingleton {
  @transient private var instance: SQLContext = _

  def getInstance(sparkContext: SparkContext): SQLContext = {
    if (instance == null) {
      instance = new SQLContext(sparkContext)
    }
    instance
  }
}
build.sbt:
[root@hadoop1 weblogs]# more build.sbt
name := "weblogs"

version := "1.0"

scalaVersion := "2.11.6"

resolvers ++= Seq(
  "Apache HBase" at "http://repository.apache.org/content/repositories/releases",
  "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
)

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.2",
  "org.apache.spark" %% "spark-streaming" % "1.6.2",
  "org.apache.spark" %% "spark-sql" % "1.6.2",
  "org.apache.spark" %% "spark-mllib" % "1.6.2"
)
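A possible alternative direction, again only a sketch: if the goal is to keep createOrReplaceTempView (a Spark 2.x API) rather than switch to the 1.6-style registerTempTable sketched earlier, the build would have to target a Spark 2.x artifact set and the cluster would need a matching runtime. The version below is illustrative, not necessarily what any given HDP release ships:

```scala
// Hypothetical build.sbt targeting Spark 2.x; "2.0.0" is an illustrative version.
name := "weblogs"

version := "1.0"

scalaVersion := "2.11.8"

val sparkVersion = "2.0.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion
)
```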
Labels:
- Apache Spark
09-19-2016
08:50 PM
I am getting this error no matter whether I run my Scala code via SBT or via spark-submit. I am on Scala 2.11.6 and Spark version 1.6.2; how can I fix this error?
Labels:
- Apache Hadoop
- Apache Spark
09-15-2016
07:51 PM
1 Kudo
This build.sbt fixed the issue and now it compiles the package fine:
[root@hadoop1 TwitterPopularTags]# more build.sbt
name := "TwitterPopularTags"

version := "1.0"

scalaVersion := "2.11.8"

val sparkVersion = "1.6.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-twitter" % sparkVersion
)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
[root@hadoop1 TwitterPopularTags]#
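For context, a minimal sketch of the kind of streaming job these three dependencies support, assuming the standard spark-streaming-twitter API and that the twitter4j.oauth.* system properties are already set; the app name, batch interval, and window length are illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object TwitterSketch {
  def main(args: Array[String]): Unit = {
    // Local streaming context with a 1-second batch interval.
    val conf = new SparkConf().setMaster("local[*]").setAppName("TwitterSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Requires the twitter4j.oauth.* system properties to be set beforehand.
    val tweets = TwitterUtils.createStream(ssc, None)

    // Count hashtags over a 60-second window and print each batch's counts.
    val hashtags = tweets.flatMap(_.getText.split(" ")).filter(_.startsWith("#"))
    hashtags.map(tag => (tag, 1)).reduceByKeyAndWindow(_ + _, Seconds(60)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```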