Created on 07-21-2016 09:40 PM - edited 09-16-2022 03:30 AM
My intention is to pull data from a web server into HDFS. I tried Flume, but the data was not getting pushed to HDFS, so I wrote the simple Scala program below on CDH 5.5.
Please note that I checked that the Spark service is up in the Cloudera Manager console.
Here is the program I tried running at the Scala console:
scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext
scala>
scala> val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
<console>:14: error: not found: value sc
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
^
scala> dataRDD.collect().foreach(println)
<console>:15: error: not found: value dataRDD
dataRDD.collect().foreach(println)
^
scala>
scala> dataRDD.count()
<console>:15: error: not found: value dataRDD
dataRDD.count()
^
scala>
scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext
scala> val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
<console>:16: error: not found: value sc
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
The exact error is:
16/07/21 23:35:35 ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
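Reading that message: the directory /user/spark/applicationHistory is owned by spark:supergroup with mode drwxr-xr-x, so only the owner (spark) has the write bit, and user cloudera is checked against the "other" bits, assuming cloudera is not a member of supergroup. Below is a small sketch of that check in plain Scala. canWrite is an illustrative helper, not a Hadoop API, and it ignores ACLs and the HDFS superuser.

```scala
// Illustrative helper (not a Hadoop API): decide whether `user` gets the
// write bit under a POSIX-style mode string such as "drwxr-xr-x".
// Ignores ACLs and the HDFS superuser.
def canWrite(user: String, userGroups: Set[String],
             owner: String, group: String, mode: String): Boolean = {
  require(mode.length == 10, "expected a 10-character mode string")
  // Write bits sit at index 2 (owner), 5 (group) and 8 (other).
  val idx =
    if (user == owner) 2
    else if (userGroups.contains(group)) 5
    else 8
  mode.charAt(idx) == 'w'
}

// Values taken from the error message; the group membership of
// cloudera is an assumption.
val clouderaCanWrite =
  canWrite("cloudera", Set("cloudera"), "spark", "supergroup", "drwxr-xr-x")
val sparkCanWrite =
  canWrite("spark", Set("spark"), "spark", "supergroup", "drwxr-xr-x")
```

Under those assumptions cloudera is denied WRITE while spark is allowed. That matches the exception, so the likely remedies are to have an HDFS admin open up the permissions on that directory or to start the shell as a user who can write there.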
Created 07-21-2016 11:44 PM
Thanks, this got solved by this post.
Created 07-22-2016 12:49 AM
Here you just ran the scala shell. You have to use the spark-shell to use Spark.
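A related point: once spark-shell starts cleanly it defines sc for you, but as far as I know sc.textFile resolves paths through Hadoop FileSystem implementations (hdfs://, file:// and so on), so an http:// URL is unlikely to work on CDH 5.5. Here is a minimal sketch of one workaround, assuming the endpoint returns plain text that is small enough to read on the driver. fetchLines is a hypothetical helper name and the HDFS output path in the comments is made up for illustration.

```scala
import scala.io.Source

// Hypothetical helper: read a text resource from a URL and return its lines.
// Runs on the driver only, so it suits small payloads.
def fetchLines(url: String): List[String] = {
  val source = Source.fromURL(url, "UTF-8")
  try source.getLines().toList
  finally source.close()
}

// Inside spark-shell, where sc is already defined, the lines could then be
// distributed and written to HDFS (the output path is an assumption):
//   val dataRDD = sc.parallelize(fetchLines("http://10.3.9.34:9900/messages"))
//   dataRDD.saveAsTextFile("hdfs:///user/cloudera/messages")
```

For a continuous feed rather than a one-off pull, this would need to be re-run on a schedule or replaced by a proper ingestion tool.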
Created 07-23-2016 01:41 PM
I typed spark-shell and I got the Scala console.